Background and aim of the PSI directive and its revision
The Directive on the re-use of Public Sector Information (PSI Directive) aims to foster and guide the re-use of PSI. It advocates making more public sector data and public research data available, at higher quality, to increase reusability. A solid framework sets out which data needs to be open and how potential costs can be calculated. To bring the directive in line with developments in PSI re-use, the European Commission adopted a proposal for a revised PSI Directive. The proposal aims to overcome the challenges and barriers that still prevent PSI re-use from reaching its full potential. One important barrier is doubt and misunderstanding about how data protection regulation applies to opening PSI.
Protecting and opening PSI
As published by the European Data Portal on 6 July, protecting data and opening data go hand in hand. The new General Data Protection Regulation (Regulation (EU) 2016/679, which replaces Directive 95/46/EC) supports publishing PSI in compliance with data protection laws. However, there are still concerns and challenges around opening Public Sector Information that contains personal data.
When in doubt about whether PSI contains personal data, the simplest choice for many public bodies is to keep it locked. The potential fines set out in the GDPR can further discourage opening PSI if its qualification as personal data cannot be categorically excluded. A revised PSI Directive that aims to further foster PSI re-use therefore also has to address these concerns and doubts. The Proposal for a revised Public Sector Information (PSI) Directive likewise recognises and highlights the importance of further improvement and guidance on whether, and in which way, PSI that contains personal data can or should be opened.
The challenge of opening PSI that contains personal data
One challenge in making PSI available and re-usable arises whenever PSI contains personal data. Under Article 4(1) of the GDPR, personal data is "any information relating to an identified or identifiable natural person ('data subject'); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, (...)".
Where PSI contains personal data, the primacy principle of data protection comes into play. It states that any PSI law has to be applied consistently with data protection law and cannot create exceptions to it, as the protection of personal data is recognised as a fundamental right in Article 8 of the European Convention on Human Rights.
Protection of Data under the PSI directive
This primacy principle is explicitly recognised by the PSI Directive. In other words, EU member states and PSI re-users must consider the principles and obligations of data protection law when applying or implementing the PSI Directive. This does not imply that PSI that contains personal data cannot be opened. Rather, it demands a thorough assessment of the conditions under which opening it is lawful.
The PSI Directive triple assessment
To support the opening of PSI while protecting personal data, the PSI Directive established a triple assessment (shortened here):
- Determine whether the PSI contains personal data.
- Determine whether national access regimes restrict access to the PSI. If yes, the same restrictions apply to PSI publication and re-use as well.
- Ensure that PSI containing personal data that is opened for re-use is only processed in compliance with data protection law.
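The three steps above can be read as a simple decision procedure. The following is a minimal sketch under stated assumptions, not an implementation of the directive: the `Dataset` fields, the `Outcome` labels and the function name `triple_assessment` are all hypothetical, and the two boolean flags stand in for what would in practice be a careful legal review by a human.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    # Hypothetical labels for the possible results of the assessment.
    OPEN = "open for re-use"
    OPEN_WITH_SAFEGUARDS = "open for re-use, processing subject to data protection law"
    RESTRICTED = "national access restrictions also apply to publication and re-use"


@dataclass
class Dataset:
    # Both flags are assumed to be set by a reviewer; they are illustrative only.
    name: str
    contains_personal_data: bool
    access_restricted_nationally: bool


def triple_assessment(ds: Dataset) -> Outcome:
    # Step 1: determine whether the PSI contains personal data.
    personal = ds.contains_personal_data

    # Step 2: if a national access regime restricts access,
    # the same restrictions carry over to publication and re-use.
    if ds.access_restricted_nationally:
        return Outcome.RESTRICTED

    # Step 3: personal data opened for re-use may only be
    # processed in compliance with data protection law.
    if personal:
        return Outcome.OPEN_WITH_SAFEGUARDS
    return Outcome.OPEN
```

For example, `triple_assessment(Dataset("vehicle register", True, False))` returns `Outcome.OPEN_WITH_SAFEGUARDS`: the dataset is not blocked outright, but any re-use has to respect data protection law.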
Risks around some key pillars of data protection remain
Although the triple assessment covers important aspects, further challenges make the assessment and any potential publication more complex. For example:
- The risk of loss of transparency and purpose limitation
When PSI that contains personal data is opened for re-use, there is no guarantee of transparency about how and for what purpose the personal data is re-used. Re-use is generally encouraged, but the purpose limitation principle prohibits using personal data in ways that are incompatible with the purpose for which it was collected. If that principle is part of the licence, re-use could violate it.
- The risk of re-identification
Another challenge is the rather vague concept of personal data. For example, a licence plate without a name attached is considered pseudonymous information and, as such, personal data. Anonymised data, on the other hand, is not personal data and does not fall under the GDPR, e.g. anonymous information on gender, age, race, location and income. However, if combining anonymous datasets would allow the identity of a natural person to be determined, data protection law would apply again. For example, there may be only one possible match by gender, age, race and location, so the person could be indirectly identified. This would violate the fundamental privacy rights of that individual.
- The risk of exclusive agreements
To avoid the risk of violating data protection law by publishing PSI, bilateral agreements are often made to share data exclusively. This runs counter to the level playing field demanded in the PSI Directive. It poses an entry barrier and misses the altruistic approach of PSI re-use.
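The re-identification risk described above, where only one person matches a combination of gender, age, race and location, can be made concrete with a small check. The sketch below counts how many records share each combination of these attributes, which is the idea behind k-anonymity (a technique not named in the text and introduced here only for illustration). The function names, the choice of attributes as `QUASI_IDENTIFIERS` and the sample records are all invented assumptions.

```python
from collections import Counter

# Attributes assumed to act as indirect identifiers when combined.
QUASI_IDENTIFIERS = ("gender", "age", "race", "location")


def _key(record, keys):
    # Build the combination of attribute values for one record.
    return tuple(record[k] for k in keys)


def smallest_group_size(records, keys=QUASI_IDENTIFIERS):
    # Size of the smallest group of records sharing the same combination.
    # A value of 1 means at least one record is unique on these attributes.
    counts = Counter(_key(r, keys) for r in records)
    return min(counts.values()) if counts else 0


def unique_records(records, keys=QUASI_IDENTIFIERS):
    # Records whose combination of attributes matches no other record.
    counts = Counter(_key(r, keys) for r in records)
    return [r for r in records if counts[_key(r, keys)] == 1]


# Invented sample data, for illustration only.
SAMPLE = [
    {"gender": "F", "age": 34, "race": "A", "location": "Town X", "income": 41000},
    {"gender": "F", "age": 34, "race": "A", "location": "Town X", "income": 38000},
    {"gender": "M", "age": 71, "race": "B", "location": "Town Y", "income": 52000},
]

if __name__ == "__main__":
    print("smallest group size:", smallest_group_size(SAMPLE))
    print("unique records:", unique_records(SAMPLE))
```

In the sample, the third record is the only one with its combination of attributes, so its income could be linked to a single person even though no name is present. A publisher might run a check of this kind before release, although a passing result does not rule out re-identification through other datasets.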
New strategies and circumstances for data protection under the PSI Directive
GDPR and innovative technologies create new circumstances for data protection and re-use. The new circumstances can provide solutions or conditions for opening PSI in compliance with data protection law, mitigating the risks.
- Data protection impact assessment
With the GDPR coming into effect in May 2018, the need in step two of the PSI Directive triple assessment to recognise variations in national data protection laws is mitigated. However, divergence in the national implementation of the PSI Directive will persist. An updated common basis for a more elaborate assessment should be compiled. The Opinion on the application of data protection law to the context of PSI recommends a data protection impact assessment (DPIA). This is further supported by the GDPR (Article 35), notably where new technologies are involved.
- Data protection officer
Furthermore, the GDPR introduced a mandatory data protection officer role for public bodies (Article 37). Together with a new assessment, this role supports informed decision-making on opening PSI.
- Data anonymisation
Under the GDPR, the bar for anonymisation is set much higher to tackle the risk of re-identification. Implementing technical solutions for anonymisation, or anonymous-by-design solutions, can address challenges around the cost of anonymisation and part of the risk of re-identification (although in some cases it will still be possible to deduce identities).
Application programming interfaces (APIs) that can be switched off in case of data protection challenges or violations are additional solutions to address risks of re-identification and purpose limitation.
Licences can aim to control the use of personal data in PSI to some extent, including by prohibiting re-identification attempts and restricting the purposes of re-use. Finally, data sharing platforms that allow organisations to derive insights from algorithms processing other organisations' data without actually accessing it are future solutions for sharing data in compliance with data protection law, and they need to be taken into account when updating the PSI Directive.
The future of the PSI directive and data protection
The biggest challenges for PSI re-use are the unavailability of open PSI (locked data) and the limited re-usability of data. To tackle both and enable next-generation data re-use, such as real-time re-use through APIs, a revised PSI Directive should address the new circumstances around data re-use in terms of data protection and new technologies. It should also highlight that not only data publishers but also data re-users are responsible for processing open PSI in compliance with data protection law. In that way, further potential of PSI re-use can be unlocked.