Full-population web crawl of .gov.uk web domain, 2014

Nicholls, Tom (2019). Full-population web crawl of .gov.uk web domain, 2014. [Data Collection]. Colchester, Essex: UK Data Archive. 10.5255/UKDA-SN-852205

This project engages with the Digital Era Governance (DEG) work of Dunleavy et al. and draws upon new empirical methods to explore local government and its use of Internet-related technology. It challenges the existing literature, arguing that e-government benefits have been oversold, particularly for transactional services, and it updates DEG with insights from local government. The distinctive methodological approach is to use full-population datasets and large-scale web data to provide an empirical foundation for theoretical development, and to test theorists’ existing claims. A new full-population web crawl of .gov.uk is used to analyse the shape and structure of online government using webometrics. Tools from computer science, such as automated classification, are used to enrich our understanding of the dataset. A new full-population panel dataset is constructed covering council performance, cost, web quality, and satisfaction. The local government web shows a wide scope of provision but only limited evidence in support of the existing rhetorics of Internet-enabled service delivery. In addition, no evidence is found of a link between web development and performance, cost, or satisfaction. DEG is challenged and developed in light of these findings. The project adds value by developing new methods for the use of big data in public administration, by empirically challenging long-held assumptions on the value of the web for government, and by building a foundation of knowledge about local government online for further research to build on. This is an ESRC-funded DPhil research project.

Data description (abstract)

This dataset is the result of a full-population crawl of the .gov.uk web domain, aiming to capture a full picture of the scope of public-facing government activity online and the links between different government bodies. Local governments have been developing online services, aiming to better serve the public and reduce administrative costs. However, the impact of this work, and the links between governments’ online and offline activities, remain uncertain. The overall research question is whether local e-government has met the expectations of Digital Era Governance and of its practitioners. The aim was to analyse directly the structure and content of government online. The research shows that recent digital-centric public administration theories, typified by the Digital Era Governance quasi-paradigm, are not empirically supported by the UK local government experience. The data consist of a file of individual Uniform Resource Locators (URLs) fetched during the crawl, and a further file containing pairs of URLs reflecting the Hypertext Markup Language (HTML) links between them. In addition, a GraphML format file is presented for a version of the data reduced to third-level domains, with accompanying attribute data for the publishing government organisations and calculated webometric statistics based on the third-level-domain link network.
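The reduction from URL pairs to a third-level-domain link network described above can be sketched as follows. This is a minimal illustration only, not the procedure used to build the deposited GraphML file: the function names, the sample URLs (council-a.gov.uk and council-b.gov.uk are invented), the decision to drop self-links and links leaving .gov.uk, and the use of in-link counts as an example statistic are all assumptions. The layout of the deposited files is not stated here, so the sketch works on an in-memory list of (source, target) pairs.

```python
from collections import Counter
from urllib.parse import urlsplit


def third_level_domain(url):
    """Return the last three host labels for a .gov.uk URL, else None.

    Example: 'http://www.council-a.gov.uk/x' -> 'council-a.gov.uk'.
    """
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    if len(labels) < 3 or labels[-2:] != ["gov", "uk"]:
        return None  # host is not under a third-level .gov.uk domain
    return ".".join(labels[-3:])


def domain_link_counts(url_pairs):
    """Collapse (source URL, target URL) pairs into weighted domain edges.

    Assumption: self-links and links to hosts outside .gov.uk are dropped.
    """
    counts = Counter()
    for source, target in url_pairs:
        src = third_level_domain(source)
        dst = third_level_domain(target)
        if src and dst and src != dst:
            counts[(src, dst)] += 1
    return counts


# Hypothetical sample pairs, standing in for rows of the URL-pairs file.
sample_pairs = [
    ("http://www.council-a.gov.uk/services", "http://www.council-b.gov.uk/"),
    ("http://council-a.gov.uk/contact", "http://maps.council-b.gov.uk/roads"),
    ("http://www.council-a.gov.uk/", "http://www.council-a.gov.uk/contact"),
    ("http://www.council-b.gov.uk/", "http://www.example.com/"),
]

edges = domain_link_counts(sample_pairs)

# One simple webometric statistic: total in-links received per domain.
in_links = Counter()
for (src, dst), weight in edges.items():
    in_links[dst] += weight
```

Taking the last three host labels treats every subdomain of an organisation as the same node, which is one plausible reading of "reduced to third-level domains"; the documentation bundle should be consulted for the rule actually applied.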

Data creators:
Creator name: Nicholls, Tom
Affiliation: Oxford Internet Institute, University of Oxford
ORCID (as URL): http://orcid.org/0000-0002-6971-8614
Sponsors: Economic and Social Research Council
Topic classification: Media, communication and language
Keywords: web crawl, central government, local government, decentralized government, websites, webometrics
Project title: Digital Era Local Governance in England
Grant holders: Tom Nicholls
Project dates: 1 October 2011 to 22 April 2016
Date published: 24 Jul 2019 15:27
Last modified: 24 Jul 2019 15:27

Available Files

Data and documentation bundle

Read me





Digital era local government in England
