European Data Portal

Checklist

Important steps to go through before using the data.

Having access to data is only a first step: data is not an end in itself. It can be used in different ways and for different purposes, and it can be published under different licences, in different formats and at different levels of quality.


Your purpose

Define your purpose: Open Data can add value to your activities in several ways. It can provide insight into a specific topic that you want to learn more about or write about (e.g. data journalism). It can also supply information that an application or service needs, such as details about schools if you are developing an application to help people find the best school for themselves or their children. Businesses can use Open Data to improve their customer profiles and so meet their customers' needs better. Whether for private or commercial use, Open Data offers many possibilities.

Identify Data labels: Once you know the purpose for which you need data, check whether the data fits your needs by looking at the data labels and metadata (data about the data). For example, if you want to build an application that advises on the best primary education in the neighbourhood, check whether the data set includes schools that provide primary education, covers the specific area you want to include, and contains performance indicators.
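The school example above can be sketched in a few lines of Python. This is only an illustration: the column names (education_level, municipality, performance_score), the sample rows and the check_fit helper are all hypothetical, and a real data set will use its own labels.

```python
import csv
import io

# Hypothetical sample; a real data set would be downloaded from a portal.
SAMPLE_CSV = """school_name,education_level,municipality,performance_score
Oak School,primary,Northtown,7.8
Hill College,secondary,Northtown,6.9
River School,primary,Southville,8.1
"""

# Labels that the (assumed) school-finder application needs.
REQUIRED_COLUMNS = {"education_level", "municipality", "performance_score"}


def check_fit(csv_text, level, area):
    """Return (missing_columns, matching_row_count) for the given purpose."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        # The data set lacks a label we need, so do not count rows.
        return missing, 0
    matches = sum(
        1
        for row in reader
        if row["education_level"] == level and row["municipality"] == area
    )
    return missing, matches


missing, matches = check_fit(SAMPLE_CSV, "primary", "Northtown")
print(missing, matches)
```

An empty set of missing columns together with a non-zero row count suggests that the data set is worth a closer look; it does not replace reading the metadata.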

 

Open licence

Check Openness: Take a look at the licence information provided about the data set. Make sure a licence is available which allows you to make use of the data in the way that you intend (e.g. that commercial re-use is allowed if you develop a commercial application).

Check Attribution requirements: The licence may state that people who use the data must credit whoever publishes it. In that case you need to credit the owner when you make your product or service available. This is called attribution.

Check Share-Alike requirements: If the licence indicates that people who mix the data with other data must also release the results as Open Data, you are obliged to publish your combined data under a similar licence. This is called share-alike. Make sure that the licence is in line with your purpose for using the data.

In the absence of a licence, there is no information about the applicable terms and conditions! You may want to contact the owner of the data to check what uses are allowed.
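The licence checks above can be written down as a small decision helper. The sketch below is not a legal tool: LicenceTerms and licence_obligations are hypothetical names, and the three flags are assumed to have been filled in by someone who has read the actual licence text.

```python
from dataclasses import dataclass


@dataclass
class LicenceTerms:
    """Hypothetical summary of a licence, filled in by reading its text."""
    commercial_use_allowed: bool
    attribution_required: bool
    share_alike_required: bool


def licence_obligations(terms, commercial, will_mix_and_publish):
    """Return a list of obligations, or raise if the use is not permitted.

    terms is None when no licence is published for the data set.
    """
    if terms is None:
        raise ValueError("No licence: contact the data owner before use")
    if commercial and not terms.commercial_use_allowed:
        raise ValueError("Licence does not allow commercial re-use")
    obligations = []
    if terms.attribution_required:
        obligations.append("credit the publisher")
    if terms.share_alike_required and will_mix_and_publish:
        obligations.append("release combined data under a similar licence")
    return obligations


terms = LicenceTerms(
    commercial_use_allowed=True,
    attribution_required=True,
    share_alike_required=True,
)
print(licence_obligations(terms, commercial=True, will_mix_and_publish=False))
```

Raising an error for a missing licence mirrors the advice above: without a licence the allowed uses are unknown, so the safe default is to stop and ask the owner.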

 

File format

Once you have decided that a specific data set is what you are looking for, you can usually choose between several file formats for the download. Choose the file type that best suits your computer skills. The most common file format for tabular data is ‘.csv’. It allows you to add other information to the file or make calculations with the data. Data sets that can be adjusted are published using an open file format. Most data sets are available in an open file format, but bear in mind that some formats (e.g. ‘.pdf’) cannot easily be edited.
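As an illustration of adding information to a ‘.csv’ file and calculating with its data, the following sketch uses Python's standard csv module. The columns (pupils, teachers), the sample values and the add_pupils_per_teacher helper are invented for the example.

```python
import csv
import io

# Hypothetical sample data standing in for a downloaded '.csv' file.
SAMPLE_CSV = """school_name,pupils,teachers
Oak School,240,12
River School,180,10
"""


def add_pupils_per_teacher(csv_text):
    """Return the CSV text with an extra, calculated column appended."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    fieldnames = list(reader.fieldnames) + ["pupils_per_teacher"]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Values are read as text, so convert them before calculating.
        ratio = int(row["pupils"]) / int(row["teachers"])
        row["pupils_per_teacher"] = round(ratio, 1)
        writer.writerow(row)
    return out.getvalue()


print(add_pupils_per_teacher(SAMPLE_CSV))
```

The same operation on a non-editable format would first require extracting the table, which is why an open, machine-readable format is preferable when one is offered.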

 

Data Quality

The page from which you download the data set should show the date on which the file was last modified. If you require data for a specific period of time, check whether the time period covered is stated and whether the data has been updated recently. Also check that the information you expected to find in the file is actually included and that you understand the different labels.
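A simple way to act on the last-modified date is to compare it with the period you need. The sketch below assumes the date is shown in day/month/year form; the function names, the example dates and the 30-day threshold are illustrative choices, not something a portal prescribes.

```python
from datetime import date, datetime


def parse_last_modified(text):
    """Parse a date written as day/month/year, e.g. '15/03/2016'."""
    return datetime.strptime(text, "%d/%m/%Y").date()


def covers_period(last_modified, period_end):
    """True if the file was modified on or after the end of the period needed.

    This is only a rough signal: a recent modification date does not
    guarantee that the period is actually included in the file.
    """
    return last_modified >= period_end


def is_stale(last_modified, today, max_age_days):
    """True if the file is older than the chosen maximum age."""
    return (today - last_modified).days > max_age_days


modified = parse_last_modified("15/03/2016")
print(covers_period(modified, date(2015, 12, 31)))
print(is_stale(modified, date(2016, 6, 1), max_age_days=30))
```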

 

Here is a short checklist developed by the Open Data Institute:

Form

  • how has the data been processed?
  • is it in raw or summary form?
  • how will its form affect your analysis/product/application?
  • what syntactic (language) and semantic (meaning) transformations will you need to make?
  • is this compatible with other datasets you have?

Quality

  • how current is the data?
  • how regularly is it updated?
  • do you understand all the fields and their context?
  • for how long will it be published? what is the commitment by the publisher?
  • what do you know about the accuracy of the data?
  • how are missing data handled?
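To get a first impression of how missing data are handled, you can count empty or placeholder values per field. This is a minimal sketch: the set of placeholder markers is an assumption (publishers use many conventions, so check the metadata), and the sample rows and missing_counts helper are hypothetical.

```python
import csv
import io

# Assumed placeholders for missing values; adjust to the publisher's convention.
MISSING_MARKERS = {"", "NA", "n/a", "-"}

# Hypothetical sample with gaps in two fields.
SAMPLE_CSV = """school_name,performance_score,last_inspection
Oak School,7.8,2015-03-01
Hill College,NA,
River School,,2014-11-20
"""


def missing_counts(csv_text):
    """Return a dict mapping each field name to its number of missing values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames or []}
    for row in reader:
        for name in counts:
            value = (row.get(name) or "").strip()
            if value in MISSING_MARKERS:
                counts[name] += 1
    return counts


print(missing_counts(SAMPLE_CSV))
```

A field with many missing values may still be usable, but it is worth knowing before you build an analysis or application on top of it.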

Look around the European Data Portal and discover how it fits your data needs.

Version 1.0 / Last update: 31/05/2016