Personal Data in the Cloud - codecentric AG Blog
The topic of personal data in the cloud can be viewed from at least two perspectives. There is the end user perspective, where we as users ask ourselves whether our data in the cloud are safe. And there is the enterprise perspective: suppose you want to start a cloud-based service and store personal data of end users in the cloud. What issues do you have to take care of? In this article we take the enterprise perspective. We can hardly answer that question in full, so we focus on some legal aspects that apply when your company resides in the EU.
One of the key ideas behind cloud computing is that end users do not need to care where their data reside. Servers hosting these data may be located anywhere on the globe. Since cloud computing saw its strongest early development in the US, most companies offering cloud services are US based and many data centers are located in the US. We currently see more and more data centers being opened in Asia and Europe as well.
If the data stored on these servers are personal data, there may be an issue, because there is no uniform agreement across countries about how personal data may be used. There is not even a uniform understanding of the term personal data. The European Union is known to have the strictest regulations about how personal data may be handled by authorities, administrations, and private companies. The basis for this is the EU data protection directive.
Part of this directive is a ban on exporting personal data to countries or organisations that do not offer the same level of data protection as the directive establishes in the EU. So how can we ensure that we do not violate the directive when storing personal data in the cloud? To answer the question we first have to understand what personal data are.
The EU directive deliberately takes a wide notion of the term personal data. According to the directive, personal data are defined as “any information relating to an identified or identifiable natural person (“data subject”); an identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity;” (art. 2 a). Typical examples are addresses, bank account or credit card information, medical records, criminal records, company employee data and the like.
The use of personal data is governed by the following three principles: transparency, legitimate purpose, and proportionality. Transparency basically requires that the persons whose data are used give their consent and know what is used, when it is used, and for what purpose. Legitimate purpose means that the data collected may only be used for the purpose the persons agreed to and not for any other purpose. Proportionality requires that only those data are collected and processed that are needed to perform the legitimate task, and that these data are kept up to date and correct.
Now that we understand the notion of personal data a bit better let’s look at some solutions to the data protection problem when storing data in the cloud.
No Solution: The Safe Harbor Agreement
When introducing the directive, the European Commission recognised that personal data were regularly exchanged between EU member states and third countries, in particular the US. Therefore the so-called Safe Harbor data privacy program was initiated. The program basically allows US companies to audit their own adherence to the EU data protection directive and to self-certify it. A couple of reviews initiated by the EU to assess this self-certification process revealed that the quality of the certification is extremely poor. The last survey, performed in 2008, showed that out of the then up-to-date list of about 1,600 US companies claiming compliance, as few as 3% were in fact compliant. So we cannot rely on such a certificate.
The situation for German companies that intend to store data in a cloud is even worse, because the German authorities decided to draw consequences from the review results. In particular the Düsseldorfer Kreis, the convention of Germany’s governmental data protection agencies, declared in April 2010 that data exporting companies cannot rely on Safe Harbor certificates but rather have to check with the companies receiving the data that these companies comply with the directive. In effect this means that the exporting company has to perform and document an audit of the receiving company's compliance. In other words, it is valid to assume that the exporting company takes legal responsibility for the correct storage and use of personal data on the side of the importing company. This is plainly impossible to perform. How should a mid-size company in Germany audit and check an internet giant like Microsoft or Google, whose legal departments are bigger than the German company's entire workforce? So we see here that
- We can’t consider the Safe Harbor agreement as a solution to the original problem.
- Neither does it seem wise to transfer any personal data to companies whose only assurance is a Safe Harbor self-certification.
Simple Solution: Stay at Home
A number of US based cloud service providers have understood that EU based companies have problems exporting personal data. In order to foster their business in the EU they decided to open data centers in the EU. Companies like Amazon, Google, and Microsoft nowadays let their customers choose the geographical location of the data center(s) where their data should reside. This is more or less the simplest solution. Some companies, including Amazon, Google, and Microsoft, even run more than one data center in the EU in case one needs a geographically remote backup center.
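As a rough illustration of how such a location choice could be enforced in your own deployment tooling, here is a minimal Python sketch. The region identifiers, the `REGION_COUNTRY` catalogue, and the `validate_storage_plan` helper are all hypothetical and not taken from any provider's API; the set of EU countries is deliberately just a small illustrative subset.

```python
# Hypothetical catalogue: region identifier -> country code of the data center.
# Real providers use their own region names; these are made up for illustration.
REGION_COUNTRY = {
    "eu-west": "IE",
    "eu-central": "DE",
    "us-east": "US",
    "ap-southeast": "SG",
}

# Illustrative subset of EU member states, not a complete list.
EU_COUNTRIES = frozenset({"IE", "DE", "NL"})


def is_eu_region(region: str) -> bool:
    """Return True only if the region is known and located in an EU country."""
    return REGION_COUNTRY.get(region) in EU_COUNTRIES


def validate_storage_plan(primary: str, backup: str) -> None:
    """Reject a plan that would place personal data outside the EU."""
    for role, region in (("primary", primary), ("backup", backup)):
        if not is_eu_region(region):
            raise ValueError(f"{role} region {region!r} is not an EU data center")
    if primary == backup:
        raise ValueError("backup region must differ from the primary region")
```

Unknown regions are rejected as well, so a typo in a configuration file fails closed instead of silently exporting data.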
Technical Solution: Depersonalise your Data
There may yet be another solution in cases where the cloud service provider does not operate data centers in the EU. The EU directive deliberately refrained from defining an explicit catalogue of what counts as personal data and rather defined personal data as any set of data that can be used to identify an individual person. This abstract definition allows for a technical solution to the problem. The technical solution consists in using data encryption. The following conditions have to be fulfilled for a valid solution.
- All personal data have to be encrypted.
- The encryption keys have to be individual to each customer. A general solution where the cloud service provider encrypts all stored data with one and the same key for all customers does not provide sufficient protection.
- The customer and no one else must be capable of decrypting the data.
- The decryption keys have to reside within the European Union.
Methods following these guidelines effectively depersonalise the data. Now that the data are not personal any more, they are no longer subject to the EU data protection directive and may hence be stored at any data center around the world.
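The conditions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CustomerKeyStore` is a hypothetical in-memory stand-in for a key store that you operate inside the EU, and the cipher is a standard-library-only construction (HMAC-SHA256 as a keystream generator, followed by an HMAC tag) chosen so that the example runs without extra packages. A real system should use a vetted authenticated cipher from an established cryptography library instead.

```python
import hashlib
import hmac
import secrets


def _subkeys(master_key: bytes):
    """Derive separate encryption and authentication keys from one customer key."""
    enc_key = hmac.new(master_key, b"enc", hashlib.sha256).digest()
    mac_key = hmac.new(master_key, b"mac", hashlib.sha256).digest()
    return enc_key, mac_key


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Illustrative keystream: HMAC-SHA256 over nonce plus a block counter."""
    blocks = []
    counter = 0
    while len(blocks) * 32 < length:
        msg = nonce + counter.to_bytes(8, "big")
        blocks.append(hmac.new(key, msg, hashlib.sha256).digest())
        counter += 1
    return b"".join(blocks)[:length]


def encrypt(master_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt before upload; the result is nonce + ciphertext + tag."""
    enc_key, mac_key = _subkeys(master_key)
    nonce = secrets.token_bytes(16)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def decrypt(master_key: bytes, blob: bytes) -> bytes:
    """Verify the tag first, then decrypt; fails for a wrong key or altered data."""
    if len(blob) < 48:
        raise ValueError("blob too short")
    enc_key, mac_key = _subkeys(master_key)
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered data")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


class CustomerKeyStore:
    """Hypothetical key store; assumed to run on premises inside the EU."""

    def __init__(self):
        self._keys = {}

    def key_for(self, customer_id: str) -> bytes:
        # One random 256-bit key per customer, never shared between customers.
        if customer_id not in self._keys:
            self._keys[customer_id] = secrets.token_bytes(32)
        return self._keys[customer_id]
```

In this sketch only the output of `encrypt` would ever be handed to the cloud provider, while the key store never leaves your own infrastructure. Whether a given setup really counts as depersonalisation in the legal sense is a question for your data protection officer, not for the code.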
Since the Safe Harbor agreement should currently be regarded as a failure, the EU data protection directive basically prohibits the export of personal data out of the EU. As a consequence, care has to be taken when thinking about transferring personal data into the cloud, because the principle of data storage virtualisation, a key concept of the cloud, conflicts directly with the data export ban.
Fortunately there are two solutions to this problem which allow us to adhere to the data protection directive and still use the cloud for the storage of personal data. One consists in choosing a provider that lets customers store their data in data centers within the EU. The second consists in depersonalising the data so that they may legally be exported anywhere.