The European Data Portal (EDP) facilitated an online workshop on “High value datasets: What are the lessons learned from INSPIRE?”. The online workshop started with an introduction to high value datasets (HVDs). We then moved on to an interactive part during which the 177 participants could contribute their thoughts and questions online.
The Directive on open data and the re-use of public sector information (Directive (EU) 2019/1024) introduces the concept of HVDs, the re-use of which is associated with essential benefits for the society and economy. The EU Member States are expected to ensure the availability of HVDs free of charge, in machine-readable formats, provided via Application Programming Interfaces (APIs) and as bulk downloads.
It is not yet known which datasets will be considered high value datasets. The European Commission is currently working with the Member States on identifying high value datasets to be set out in an Implementing Act. The Open Data Directive lists the following thematic categories of HVDs:
- Earth observation and environment
- Companies and company ownership
During the workshop there was a lively discussion in the chat about why “geospatial” was named as a thematic category and how this should be interpreted, geospatial being a type of data which is often found in the different domains also named as thematic categories (e.g. most Earth Observation, environmental and meteorological data can be considered “geospatial”).
What datasets should be considered High Value Datasets?
The first online “discussion” focused on which datasets the participants would consider to be high value datasets. Participants used an online tool (sli.do) to type in their ideas and vote for the inputs of their peers. The answer most strongly advocated was “INSPIRE Annex I data”. For everyone not familiar with the INSPIRE Directive this will need some explanation: The INSPIRE Directive lists data themes in the three Annexes of the directive. INSPIRE Annex I data themes consist of:
- Administrative units
- Cadastral parcels
- Geographical names
- Protected sites
- Transport networks
Other entries with strong support referred to further datasets covered by INSPIRE such as energy resources, water, or natural risk zones. One statement frequently voted for suggested to focus on data that is not strongly addressed by INSPIRE such as health or energy consumption. While almost all suggestions have a strong spatial relevance, also topics with little or no spatial reference were named: This includes data linked to “economic processes” and “composition of product”.
Figure 1: What datasets should be considered High Value Datasets (answers from workshop)
Lessons from INSPIRE
With an ambitious endeavour such as building INSPIRE, there are quite a few lessons to be learned in the process. The workshop invited the participants to share their experiences with INSPIRE which could be useful for the implementation of high value datasets. These “lessons from INSPIRE” were named:
- Technical specifications need to be simplified for 'non-specialist' data providers
- Complexity in requirements hurts the use of data
- Data structures have to be "easy"
- INSPIRE is too complex for data providers
- Users and data providers do not understand models of INSPIRE, especially from Annex III
- Complex GML is too complicated for users and tools. Change to alternative formats such as GeoJson, Simple GML, and ... HTML.
- To define clear (and few) priorities
- Focus on demand
- Focus on data that can be used in relevant use cases
- Need for user-driven requirements
- A single national dataset published by a coordinated process is better than multiple regional submissions
- Skip the harmonising part to get started quickly. Unharmonised data is better than no data
- One size does not fit all
- pan European / cross border data need interoperability: common data models are needed, but re-use of simpler structures/encoding - ref. Alternative encodings
- Allow publishing of data as-is but let them be well documented
- Standard-based harmonisation of datasets improves the quality of data to make available as open data
- Data specifications do not allow us to reach a suitable level of data interoperability for European usage. Data harmonisation should be a step forward
- do not deviate from (or extend) international standards
- Tie specification and implementation much closer to each other - so that technical choices are still relevant when implementation is finished
- HVD should not impose new technical formats or models, but use existing ones
- bulk download with good metadata description
- Include services as well as data. Or at least support the development of services. APIs give an opportunity to develop important microservices, e.g. gazetteer
- Provide high quality central tools for validation, early in the process
- The validation instruments should be easier to use than for INSPIRE. Many of the error reports are very cryptic.
- Be very clear with legal requirements so that there is no space for interpretations
- Huge organisational efforts were made by member states to implement INSPIRE, resulting in valuable assets to capitalise on further
- The INSPIRE strategy should match the digital transformation process developing within the industry and the public services - where are the important user stories?
- Focus on building good catalogue capabilities to enable discoverability (including differentiation between spatial and non-spatial data types)
- Providing an infrastructure allowing the discovery and sharing of spatial data.
An enthusiastic discussion evolved around whether usage of HVD would “get flowing” more than the usage of INSPIRE data and services. Participants agreed that this is because INSPIRE does not oblige to share freely and are looking forward to more open licenses (while some worried about the costs and are hoping for funding).
While the workshop had initially been planned as a face-to-face event with discussions in smaller groups, this online workshop offered the advantage of having a much larger audience: 177 participants attended online as opposed to a planned number of 30 participants in Dubrovnik.
With this workshop, the team from the European Data Portal accomplished their goal to provide a space for participants mainly from the INSPIRE community to share and discuss their views on high value datasets and their lessons on data sharing.
Thank you to your participants for joining the workshop. Missed the workshop? The presentation and recording are available on the INSPIRE 2020 website. Want to stay up to date about high value datasets? Follow us on Twitter, Facebook or LinkedIn to stay up to date!
This article was written by Antje Kügeler, a project manager for con terra.