Geography of digital inequality

Blank, Grant (2017). Geography of digital inequality. [Data Collection]. Colchester, Essex: UK Data Archive. 10.5255/UKDA-SN-851760

The objective of the Geography of Digital Inequality project was to explore the geographical contours of Internet use and penetration in Britain. The project assembled from existing datasets a new dataset which contains Internet information at a fine-grained geographic level: census output areas (OAs). From OAs we were able to aggregate to higher geographic levels such as counties, Welsh and Scottish Councils, metropolitan areas, or others. Through this unique dataset we explored digital divides and the geography of the Internet, a capability possessed by no other dataset. Specifically, we explored the extent of use versus non-use of the Internet.

Data description (abstract)

These data consist of measures of Internet use estimated using small area estimation. The small area estimation is based on census Output Areas (OAs) using the 2013 Oxford Internet Survey (OxIS) and the 2011 British census. There is an estimate for each OA in Great Britain.

By combining the 2013 OxIS survey data with the comprehensive small area coverage of the 2011 British census we can use the strengths of one to offset the gaps in the other. Specifically, we follow a two-step process. First, we use the information that is reliably available in OxIS to create a model that estimates the proportion of Internet users in OAs. Second, we use the parameters from this model combined with census data to estimate the proportion of Internet users in each OA in Britain. Once these estimates are available, we aggregate the estimates up to higher levels of geography. In this way we can estimate Internet use in Glasgow, Manchester and Cardiff as well as other small areas in Britain. This procedure is referred to as indirect, model-based or synthetic estimation. In recent years such small area estimation (SAE) techniques have been widely used throughout Europe and North America. See the project website for more details.
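
The two-step process and the final aggregation can be sketched in a few lines of Python. This is a minimal illustration only, not the project's code: it swaps the project's mixed model for ordinary least squares with a single covariate, all data values and names (fit_ols, predict, aggregate, the OA and city labels) are hypothetical, and the population-weighted aggregation is an assumption, since the record does not say how OA estimates were combined.

```python
# Sketch of two-step synthetic (model-based) small area estimation.
# Assumptions: one covariate, plain OLS instead of a mixed model,
# invented data, and population-weighted aggregation.

def fit_ols(x, y):
    """Step 1: fit y = intercept + slope * x on the surveyed areas."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, x):
    """Step 2: apply the fitted parameters to a census covariate.
    The result is clipped to [0, 1] because it is a proportion."""
    return min(1.0, max(0.0, intercept + slope * x))

def aggregate(oa_estimates, oa_population, oa_to_area):
    """Population-weighted mean of OA estimates within each larger area."""
    totals, pops = {}, {}
    for oa, estimate in oa_estimates.items():
        area = oa_to_area[oa]
        pop = oa_population[oa]
        totals[area] = totals.get(area, 0.0) + estimate * pop
        pops[area] = pops.get(area, 0) + pop
    return {area: totals[area] / pops[area] for area in totals}

# Hypothetical surveyed OAs: share with a degree, share of Internet users.
survey_degree = [0.10, 0.20, 0.30, 0.40]
survey_users = [0.60, 0.70, 0.80, 0.90]
intercept, slope = fit_ols(survey_degree, survey_users)

# Hypothetical census covariate and population for every OA.
census_degree = {"OA1": 0.10, "OA2": 0.30, "OA3": 0.20}
census_population = {"OA1": 100, "OA2": 300, "OA3": 200}
oa_to_city = {"OA1": "CityA", "OA2": "CityA", "OA3": "CityB"}

oa_estimates = {oa: predict(intercept, slope, x) for oa, x in census_degree.items()}
city_estimates = aggregate(oa_estimates, census_population, oa_to_city)
```

The key point of the design is that the model is fitted only where survey data exist, while prediction uses covariates that the census supplies for every OA, which is why only variables present in both sources can enter the model.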

Data creators:

Creator Name	Affiliation	ORCID (as URL)
Blank Grant	University of Oxford	http://orcid.org/0000-0002-6821-0958

Sponsors:

Economic and Social Research Council

Grant reference:

ES/K00283X/1

Topic classification:

Media, communication and language
Social stratification and groupings
Society and culture

Keywords:

Inequality of Internet use, geographically referenced data, small area estimation, Internet, Digital divide

Project title:

Geography of Digital Inequality

Grant holders:

Grant Blank, PI, Mark Graham, Co-I

Project dates:

From	To
16 June 2013	16 December 2014

Date published:

21 May 2015 15:34

Last modified:

14 Jul 2017 10:05

Coverage and Methodology

Temporal coverage:

From	To
14 April 2013	14 April 2013

Geographical area:

England, Scotland and Wales

Country:

United Kingdom

Spatial unit:

Census Geography > Output Areas

Data collection method:

Two datasets were used to assemble this dataset. First, the 2013 Oxford Internet Survey (OxIS) is a random sample of 2,657 people aged 14 and over from the British population (England, Scotland & Wales). Interviews were conducted face-to-face by an independent survey research company. The response rate for 2013 was 51%. The sample was drawn in two stages: a random sample of census output areas (OAs) was selected, and respondents were randomly sampled within each selected OA. For details, see "Data collection technical report.pdf", which has been uploaded. We use six variables from OxIS: Internet use, region, age, lifestage, gender and education. The questionnaire for OxIS contains about 300 variables and is available from the OxIS website; see the URL in the "related resources" section.

Second, the 2011 British Census. For information on how the census was conducted, see the census website. The URL for the 2011 census is given below in "related resources".

Observation unit:

Geographic unit

Kind of data:

Numeric

Type of data:

Geospatial data

Resource language:

English

Access and Administration

Data sourcing, processing and preparation:

We estimated the parameters of our model using a linear two-level mixed model. We chose this procedure because it allows for the inclusion of demographic variables while taking into account the fact that the respondents in each OA will be more alike than a random sample of the population as a whole. Taking into account the clustering of respondents in sampled OAs allows more accurate confidence intervals and significance tests. These confidence intervals will generally be wider and more conservative than confidence intervals that ignore the clustered data (Goldstein, 2003). This procedure is essentially identical to the approach used by the Small Area Estimation Programme of ONS. ONS describe their work as “regression synthetic estimation fitted using area-level covariates” (Heady et al., 2003). Although OxIS has hundreds of variables available, we are constrained to use only those variables that exist in both the census and OxIS.

The variables in mixed models are divided into fixed effects and random effects. Fixed effects variables are those where we have all possible categories. The fixed effects that we used in the model are region, age, lifestage, gender and education. Random effects variables are those where we have a random sample from a wider population of possible categories; in our study the OxIS OAs are such a sample. The OxIS sample has 260 OAs, but to reduce data collection costs they were chosen in adjacent pairs, where the adjacent OA that is most similar in terms of ACORN type was selected. Because not every OA in a pair contained respondents, there are actually 134 “pairs” in the random effects part of the model. The model treats the random effects as random intercepts; that is, the intercepts for each OA pair are allowed to vary. We weighted each OA pair according to the number of respondents, so that pairs with more respondents (which have smaller standard errors) are weighted more heavily. Respondents themselves are counted based on post-stratification weights that were created so that the sample matched the British population on gender, age, region, rurality, ACORN code and household size (see Dutton & Blank, 2013, for details).

We aggregated the OxIS data to the OA level; in the jargon of mixed models this is an area-level model. We cannot use individual-level covariates because the census does not provide individual-level microdata. The dependent variable is the proportion of Internet users in each of the 260 OAs. The independent variables are region, age, lifestage, gender and education, also expressed as proportions. The random effects are the OA pairs. These are random intercept models, which account for the fact that different OA pairs will have different overall levels of Internet use. We used the coefficients from this model to create the small area estimates of the proportion of Internet users for all the OAs in Britain.

An identical process was used to create the other three variables: the proportion of reader users in each OA, the proportion who access email on their mobile phone, and the proportion who access the Internet while traveling. The final dataset contains five variables: the 2011 Census OAs, which come directly from the census, plus four variables giving the estimated proportions of the four kinds of Internet use described above for each OA in Britain.
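
For readers who prefer notation, the area-level random intercept model described above might be written as follows. The symbols are illustrative and do not come from the project documentation: i indexes sampled OAs, j indexes OA pairs, k indexes any OA in Britain, and the normality of the two error terms is the conventional assumption for a linear mixed model rather than a statement from the record. The weighting by number of respondents is not shown.

```latex
% Illustrative notation only; symbols are assumptions, not the project's own.
\begin{align}
  y_{ij} &= \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j + \varepsilon_{ij},
  \qquad u_j \sim N(0, \sigma_u^2), \quad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2) \\
  \hat{y}_k &= \hat{\beta}_0 + \mathbf{x}_k^{\top}\hat{\boldsymbol{\beta}}
\end{align}
```

Here y_ij is the proportion of Internet users in sampled OA i of pair j, x_ij holds the OA-level proportions for region, age, lifestage, gender and education, and u_j is the random intercept for pair j. The second line gives the synthetic estimate for OA k from its census proportions x_k, assuming the random intercept is set to its mean of zero for OAs outside the sample.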

Rights owners: