data.gov.uk

data.gov.uk is the official Open Data portal of the UK Government. The site provides a central way into the wealth of government data, and aims to make that data ‘easy to find, easy to license, and easy to re-use.’ A beta version went live in October 2009, and the site was publicly launched by Sir Tim Berners-Lee, the inventor of the World Wide Web, in January 2010.

data.gov.uk is built using CKAN to catalogue, search and display data. (Other aspects of the site, such as blog posts, forums and comments, are handled by Drupal, the open source CMS.) The Open Knowledge Foundation’s CKAN team were involved in the site from its inception, and helped develop and maintain it for the first two years. In early 2012 the UK government took its CKAN work in-house, but it continues to work closely with the CKAN team and to make regular code contributions back to CKAN – a striking example of the advantages of open-source projects.

[IMG: The new data.gov.uk site]

data.gov.uk: searching for data

The site was originally designed with Drupal handling all page requests, and CKAN as a back-end catalogue service. CKAN’s rich metadata and search API made it possible to do this. However, by the time data.gov.uk relaunched in June 2012, it was clear that CKAN’s excellent web interface provided a better search experience, while maintaining a Drupal module to handle these API calls was an unnecessary duplication of effort. Instead, all requests for data (the ‘Data’ tab at data.gov.uk) are now sent by the web server straight to CKAN, while other tabs are handled by Drupal. This side-by-side CKAN / Drupal integration is described by David Read on the data.gov.uk blog.
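The side-by-side arrangement amounts to routing by URL path: data requests go to CKAN, everything else to Drupal. The short Python sketch below illustrates that idea only. The function name choose_backend and the path prefixes are hypothetical, chosen for illustration; the actual web server rules used by data.gov.uk are not given here.

```python
# Hypothetical path prefixes for the catalogue; the real web server
# configuration of data.gov.uk is not described in this article.
CKAN_PREFIXES = ("/data", "/dataset")


def choose_backend(path: str) -> str:
    """Return the name of the application that should serve a request path."""
    for prefix in CKAN_PREFIXES:
        # Match the prefix itself or anything beneath it, but not
        # unrelated paths that merely share leading characters.
        if path == prefix or path.startswith(prefix + "/"):
            return "ckan"
    # Blog, forums, comments and other site content stay with the CMS.
    return "drupal"
```

In a real deployment this decision would normally live in the web server or reverse proxy configuration rather than in application code; the function simply makes the rule explicit.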

The initial requirements for data.gov.uk were data catalogue capabilities (entering, editing, listing, and searching datasets) combined with basic CMS features (site content, blog, theming, etc.). In addition, CKAN has delivered in the following areas:

  • Robustness
  • Data validation and quality checking
  • Editing workflow
  • Complex search, including faceting by tag, government department, theme, etc.
  • RDF representations of dataset metadata
  • Integration with the Drupal content management system
  • Support for geospatial metadata, especially in relation to INSPIRE requirements for search and discovery
  • Automated ‘harvesting’ of metadata and material from third-party sources within and outside government
  • Graph previews and map visualisations for spreadsheet data
  • General support and maintenance
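As an illustration of the faceted search mentioned in the list above, the following Python sketch builds a query URL for the package_search action of CKAN's Action API, as documented for current CKAN releases. The base URL, the helper name build_search_url and the facet field names are assumptions made for the example; the API version and fields in use on data.gov.uk at the time may have differed.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse


def build_search_url(base_url, query, facet_fields, filters=None, rows=10):
    """Build a CKAN package_search URL with facet counts requested.

    The helper itself is hypothetical; only the endpoint path and the
    parameter names (q, fq, rows, facet.field) come from the CKAN API.
    """
    params = {
        "q": query,
        "rows": rows,
        # CKAN expects facet.field as a JSON-encoded list of field names.
        "facet.field": json.dumps(facet_fields),
    }
    if filters:
        # fq narrows the result set, e.g. to one tag or one publisher.
        params["fq"] = " ".join(f'{k}:"{v}"' for k, v in filters.items())
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{urlencode(params)}"


# Example with a placeholder host (not a real catalogue address).
url = build_search_url(
    "https://catalogue.example.org/",
    "transport",
    ["tags", "organization"],
    filters={"tags": "transport"},
    rows=5,
)
parsed = parse_qs(urlparse(url).query)
```

Fetching the resulting URL from a CKAN instance returns JSON containing both the matching datasets and, per facet field, counts that a front end can present as filter links.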

The data.gov.uk site, including its CKAN component, has also undergone penetration testing by UK government security testing consultants.

The UK government continues to use and develop data.gov.uk, and the site has a global reputation as a leading exemplar of a government data portal. The system has successfully handled growth from a few dozen to many thousands of datasets, and a concomitant growth in site traffic. It has played a significant enabling role in the development of the UK government’s transparency and open data agenda.

[IMG: data.gov.uk resource page]

A data resource on data.gov.uk

Andrew Stott, responsible for the launch of data.gov.uk while UK Government Director of Transparency & Digital Engagement, had this to say: “Using Open Source was the best decision we ever made. A big thumbs up to the Open Knowledge Foundation and CKAN.”