The ODP is the Open Data Portal of the European Union and contains datasets that are collected and published by the European Institutions. The EDP is a European portal that harvests metadata from public sector portals throughout Europe, so it focuses on data made available by European countries. In addition, the EDP harvests metadata from the ODP. The European Commission is currently exploring how to bring the two portals closer together.
The European Data Portal pilots MT@EC, a European Commission machine translation service covering all of the EU's official languages. The EDP harvests metadata in the original language of the source portal and then translates it. These translations are produced automatically by machine and may therefore contain errors.
The European Union has adopted legislation to foster the re-use of Open (Government) Data. All the data are available for free and can be used for business creation.
The European Data Portal’s User Manual can be found here.
The European Data Portal's initial content was collected by harvesting national public data portals. The portal will progressively harvest additional data from regional, local and domain-specific portals.
There are several technical requirements that you must meet in order to be harvested by the European Data Portal. You can find a checklist of the expected technical features on this page.
If you need further details, you will find complete documentation of the process in this file.
License: an explicit and legally binding statement of rights, restrictions and obligations of recipients in relation to a specific dataset. Usually expressed through a written contract or through a unilateral statement from the rights holder(s), but it may also be expressed through legislation or other regulatory initiatives.
The European Data Portal harvests all datasets from national, regional and local portals without excluding any datasets. That means we do not have any influence on the type of licence used, as the licence is provided by the source. However, we will continue to promote and recommend the use of open licences in the context of the European Data Portal project.
The European Data Portal harvests all datasets from national, regional and local portals without excluding any datasets. That means that the portal does not have any influence on the file format used, as the file is provided by the source. However, we will continue to promote and recommend the use of open file formats in the context of the European Data Portal project.
The datasets stored in the portal need to be of an appropriate quality in terms of:
To check the datasets against these quality indicators, the Metadata Quality Assurance (MQA) tool was developed. The MQA runs as a periodic process in parallel to the harvesting. CKAN and Virtuoso are filled with metadata through the harvesting process. As CKAN cannot store DCAT-AP formatted datasets directly, the datasets are mapped into a JSON schema that is DCAT-AP compliant. The MQA uses this schema to check each dataset for DCAT-AP mapping compliance. If any compliance issue is detected, for instance a missing mandatory field, the dataset is considered not DCAT-AP compliant.
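The mandatory-field part of this check can be illustrated with a minimal sketch. It is not the MQA's actual implementation: the field names "title" and "description" are assumptions standing in for the DCAT-AP mandatory dataset properties dct:title and dct:description, and the keys used in the portal's JSON schema may differ.

```python
from typing import Any

# Assumption: these keys stand in for DCAT-AP's mandatory dataset properties
# (dct:title and dct:description). The portal's real JSON keys may differ.
MANDATORY_DATASET_FIELDS = ("title", "description")


def missing_mandatory_fields(
    dataset: dict[str, Any],
    mandatory: tuple[str, ...] = MANDATORY_DATASET_FIELDS,
) -> list[str]:
    """Return the mandatory fields that are absent or empty in `dataset`."""
    missing = []
    for field in mandatory:
        value = dataset.get(field)
        # Treat None, empty strings and whitespace-only strings as not provided.
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


def is_compliant(dataset: dict[str, Any]) -> bool:
    """A dataset passes this simplified check only if nothing mandatory is missing."""
    return not missing_mandatory_fields(dataset)


# Hypothetical dataset record with an empty description.
example = {"title": "Air quality 2015", "description": ""}
print(missing_mandatory_fields(example))
print(is_compliant(example))
```

A real compliance check would cover more than two fields and would also validate value types and the dataset's distributions; the sketch only shows how a single missing mandatory field is enough to mark a dataset as non-compliant.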
The MQA uses the CKAN API to collect information about all harvested catalogues. It runs through all CKAN catalogues in parallel while collecting the information required for the quality checks. During this process, several checks are performed for each dataset. The results are stored in the MQA database and published via the MQA page on the portal or as downloadable sheets and PDF documents. Downloadable MQA documents are only updated after an MQA run has finished. One MQA run takes a couple of days, because the MQA checks each distribution of each dataset for its availability. Checking the availability of a single distribution may take several seconds, and with almost 800,000 datasets, each with 2 to 50 distributions, this adds up.
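A rough back-of-the-envelope estimate shows why a run takes days. The dataset count and the lower bound of 2 distributions per dataset come from the text above; the 2 seconds per check and the number of parallel checks are illustrative assumptions, not measured values from the portal.

```python
SECONDS_PER_DAY = 86_400


def estimate_run_days(
    datasets: int,
    distributions_per_dataset: float,
    seconds_per_check: float,
    parallel_checks: int = 1,
) -> float:
    """Estimate the duration of one full availability run, in days."""
    if parallel_checks < 1:
        raise ValueError("parallel_checks must be at least 1")
    total_checks = datasets * distributions_per_dataset
    # Assumes checks are spread evenly over the parallel workers.
    total_seconds = total_checks * seconds_per_check / parallel_checks
    return total_seconds / SECONDS_PER_DAY


# Lower bound from the text: 800,000 datasets with 2 distributions each.
# Assumed: 2 seconds per availability check.
sequential = estimate_run_days(800_000, 2, 2.0)
parallel = estimate_run_days(800_000, 2, 2.0, parallel_checks=16)
print(f"one check at a time: {sequential:.1f} days")
print(f"16 checks in parallel: {parallel:.1f} days")
```

Even at the lower bound of 2 distributions per dataset, checking one distribution at a time would take more than a month under these assumptions, so a run of a couple of days implies that many checks run concurrently.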
The MQA presents its results in two views:
The current quality indicators include the following:
(*) The Top 20 indicators are only available for the Global Dashboard View.
Most results of the MQA are presented in charts (pie charts, bar charts). If you need further information about a chart, you can click on the "i" icon in the upper right corner of the chart for additional help. Some charts have the label "?" on the x-axis. This indicates an aggregation of unknown or unset entities in the data. For instance, if a chart shows the most used distribution formats and no format is provided for some distributions, those distributions are grouped under "?".
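The "?" aggregation can be pictured with a small sketch. This is an illustration under assumptions, not the MQA's code: the "format" key and the upper-casing of format names are hypothetical choices made for the example.

```python
from collections import Counter

UNKNOWN_LABEL = "?"


def count_formats(distributions: list[dict]) -> Counter:
    """Count distributions per format, grouping missing formats under '?'."""
    counts: Counter = Counter()
    for dist in distributions:
        # A missing key, None, or a blank string all count as "no format provided".
        fmt = (dist.get("format") or "").strip()
        # Assumption: format names are compared case-insensitively.
        counts[fmt.upper() if fmt else UNKNOWN_LABEL] += 1
    return counts


# Hypothetical distributions, two of them without a usable format.
sample = [
    {"format": "csv"},
    {"format": "CSV"},
    {"format": ""},
    {"format": "json"},
    {},
]
for label, count in count_formats(sample).most_common():
    print(label, count)
```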
The visualization tool depends on the files provided by the source. The format may not be supported, or the files may be corrupted at the source. The European Data Portal has no influence on the datasets from the harvested portals.
The map search enables you to find datasets from a specific region. You only have to type in the region or draw a bounding box on the map. Results are only displayed for datasets that have geo information stored.
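As a rough illustration of why datasets without geo information never appear in map search results, the sketch below filters datasets by bounding box. It assumes that matching means the dataset's box overlaps the drawn box; the portal may apply a different rule, and the "bbox" key and the BBox type are hypothetical.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box in degrees: longitude (west/east), latitude (south/north)."""
    west: float
    south: float
    east: float
    north: float


def intersects(a: BBox, b: BBox) -> bool:
    """True if the two boxes overlap or touch (ignores boxes crossing the antimeridian)."""
    return (
        a.west <= b.east and b.west <= a.east
        and a.south <= b.north and b.south <= a.north
    )


def search_by_bbox(datasets: list[dict], drawn: BBox) -> list[dict]:
    """Keep only datasets that carry a bounding box overlapping the drawn one."""
    return [
        d for d in datasets
        if d.get("bbox") is not None and intersects(d["bbox"], drawn)
    ]


# Hypothetical datasets: one near Vienna, one near Lisbon, one without geo information.
datasets = [
    {"name": "vienna-trees", "bbox": BBox(16.2, 48.1, 16.6, 48.3)},
    {"name": "lisbon-parking", "bbox": BBox(-9.3, 38.6, -9.0, 38.8)},
    {"name": "national-budget"},
]
austria = BBox(9.5, 46.4, 17.2, 49.0)
print([d["name"] for d in search_by_bbox(datasets, austria)])
```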
API access URLs can be found here:
CKAN: https://www.europeandataportal.eu/data/search/ (Note: Only 'Read-Only' actions are currently supported for this API)
Use Cases: https://www.europeandataportal.eu/en/export-use-cases
API documentation is available for the following systems:
Download MQA reports: https://www.europeandataportal.eu/api/mqa/reports
Data used by MQA: https://www.europeandataportal.eu/metrics/
SHACL metadata validation: https://www.europeandataportal.eu/shacl/
Read access to triple store data content: https://www.europeandataportal.eu/data/api/
Integration of any external application with the European Data Portal is only possible at the dataset level, using the existing CKAN API, through which you can extract or query datasets.
For example, the API call "https://www.europeandataportal.eu/data/search/ckan/package_search" returns a list of datasets in JSON format.
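A minimal sketch of such a read-only call is shown below. It assumes that the endpoint accepts the standard CKAN package_search parameters (q, rows, start) and answers with the usual CKAN envelope ("success", "result", "results"), and that each dataset has a plain string "title"; the portal's actual response may differ. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SEARCH_URL = "https://www.europeandataportal.eu/data/search/ckan/package_search"


def build_search_url(query: str, rows: int = 10, start: int = 0) -> str:
    """Build a package_search URL using standard CKAN query parameters."""
    return f"{SEARCH_URL}?{urlencode({'q': query, 'rows': rows, 'start': start})}"


def dataset_titles(response: dict) -> list[str]:
    """Extract dataset titles from a CKAN-style response (assumed shape)."""
    if not response.get("success"):
        return []
    results = response.get("result", {}).get("results", [])
    return [dataset.get("title", "") for dataset in results]


def search(query: str, rows: int = 10) -> list[str]:
    """Run the search over the network and return the titles of the matches."""
    with urlopen(build_search_url(query, rows), timeout=30) as resp:
        return dataset_titles(json.load(resp))


# Offline illustration with a hand-written response in the assumed CKAN shape.
sample_response = {
    "success": True,
    "result": {"count": 2, "results": [{"title": "Air quality"}, {"title": "Noise maps"}]},
}
print(build_search_url("air quality", rows=5))
print(dataset_titles(sample_response))
```

Calling search("air quality") would perform the actual request. Only read actions such as this one are supported, so the sketch does not attempt to create or modify datasets.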
You can also use the SPARQL-Manager and run customized SPARQL queries against the Virtuoso RDF triple store that is synchronized with the CKAN repository.
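The sketch below shows what a customized query against the triple store could look like when sent over the standard SPARQL protocol. The endpoint URL is a placeholder, not the portal's real address, and the query assumes that datasets are stored as dcat:Dataset resources with a dct:title, as DCAT-AP prescribes. The result parser follows the W3C SPARQL JSON results format.

```python
from urllib.parse import urlencode

# Placeholder: replace with the SPARQL endpoint published by the portal.
SPARQL_ENDPOINT = "https://example.org/sparql"

# Assumption: datasets are typed dcat:Dataset and titled with dct:title.
DATASET_TITLES_QUERY = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?dataset ?title WHERE {
  ?dataset a dcat:Dataset ;
           dct:title ?title .
} LIMIT 10
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Encode a query for a SPARQL protocol GET request (the 'query' parameter)."""
    return f"{endpoint}?{urlencode({'query': query.strip()})}"


def rows_from_results(results: dict) -> list[dict]:
    """Flatten SPARQL JSON results into one {variable: value} dict per row."""
    variables = results["head"]["vars"]
    return [
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in results["results"]["bindings"]
    ]


# Hand-written response in the SPARQL JSON results format, for illustration only.
sample_results = {
    "head": {"vars": ["dataset", "title"]},
    "results": {"bindings": [
        {
            "dataset": {"type": "uri", "value": "http://example.org/dataset/1"},
            "title": {"type": "literal", "value": "Air quality"},
        },
    ]},
}
print(build_query_url(SPARQL_ENDPOINT, DATASET_TITLES_QUERY))
print(rows_from_results(sample_results))
```

When sending the request, an Accept header of application/sparql-results+json asks the endpoint for the JSON format that rows_from_results expects.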
Datasets can be exported to WMS, WFS, KML, HTML, Excel, PDF, XML, JSON, RSS, GML, SVG, SHP, PNG, JPEG, GIF, RDF-XML, RDF-Turtle, RDF-N3, OCTET STREAM, JSON-LD and Atom.
The European Data Portal contains different search engines which have different behaviors: