Edwards, Peter and Markovic, Milan and Petrunova, Nikol and Chenghua, Lin and Corsar, David
(2018).
Tweets used to study reports of food fraud related to fish products 2018.
[Data Collection]. Colchester, Essex:
UK Data Service.
10.5255/UKDA-SN-853378
Social media and other forms of online content have enormous potential as a way to understand people's opinions and attitudes, and as a means to observe emerging phenomena - such as disease outbreaks. How might policy makers use such new forms of data to better assess existing policies and help formulate new ones?
This one-year demonstrator project is a partnership between computer science academics at the University of Aberdeen and officers from Food Standards Scotland which aims to answer this question. Food Standards Scotland is the public-sector food body for Scotland created by the Food (Scotland) Act 2015. It regularly provides policy guidance to ministers in areas such as food hygiene monitoring and reporting, food-related health risks, and food fraud.
The project will develop a software tool (the Food Sentiment Observatory) that will be used to explore the role of data from sources such as Twitter, Facebook, and TripAdvisor in three policy areas selected by Food Standards Scotland:
- attitudes to the differing food hygiene information systems used in Scotland and the other UK nations;
- study of an historical E. coli outbreak to understand the effectiveness of monitoring and decision-making protocols;
- understanding the potential role of social media data in responding to new and emerging forms of food fraud.
The Observatory will integrate a number of existing software tools (developed in our recent research) to allow us to mine large volumes of data to identify important textual signals, extract opinions held by individuals or groups, and crucially, to document these data processing operations - to aid transparency of policy decision-making. Given the amount of noise appearing in user-generated online content (such as fake restaurant reviews) it is our intention to investigate methods to extract meaningful and reliable knowledge, to better support policy making.
Data description (abstract)
Data collected from the Twitter social media platform (8 June 2018 - 22 June 2018) to study reports of food fraud related to fish products in posts originating in the UK. The dataset contains Tweet IDs and the keywords used to search for Tweets through programmatic access to the public Twitter API. The keywords used in this search were generated using a machine learning tool and consisted of combinations of terms related to fish and to fakery.
Data creators: |
Creator Name | Affiliation | ORCID (as URL) |
---|
Edwards Peter | University of Aberdeen | |
Markovic Milan | University of Aberdeen | |
Petrunova Nikol | University of Aberdeen | |
Chenghua Lin | University of Aberdeen | |
Corsar David | University of Aberdeen | |
|
Sponsors: |
Economic and Social Research Council
|
Grant reference: |
ES/P011004/1
|
Topic classification: |
Law, crime and legal systems
|
Keywords: |
social media, policy making, fish (as food)
|
Project title: |
The Food Sentiment Observatory: Exploiting New Forms of Data to Help Inform Policy on Food Safety and Food Crime Risks
|
Grant holders: |
Peter Edwards, Bryan Campbell, Jacqui Mcelhiney, Chenghua Lin, Susan Pryde, Tigan Daspan, Ross Clark, Robin White
|
Project dates: |
From | To |
---|
14 February 2017 | 31 July 2018 |
|
Date published: |
16 Nov 2018 13:49
|
Last modified: |
16 Nov 2018 13:49
|
Temporal coverage: |
From | To |
---|
8 June 2018 | 22 June 2018 |
|
Collection period: |
Date from: | Date to: |
---|
18 June 2018 | 22 June 2018 |
|
Country: |
United Kingdom |
Spatial unit: |
Other |
Data collection method: |
The search for relevant data content was performed using a custom-built data collection module within the Observatory platform (see Related Resources). A public API provided by Twitter was used to gather all social media messages (Tweets) matching a specific set of keywords. Each line in the fish-keywords.txt file (group 1) and in the fake-keywords.txt file (group 2) contains a search keyword/phrase. A list of search keywords was then created from all possible combinations of individual keywords/phrases from group 1 and group 2. A matching Tweet returned by the search had to include at least one such combination of search keywords/phrases. Therefore, the search string used by the API was constructed as follows: (<keyword1 from group 1> <keyword1 from group 2>) OR (<keyword1 from group 1> <keyword2 from group 2>) OR ... *Note: the space between <> <> represents a logical AND in terms of the Twitter API service. The Twitter API allows historical searches to be restricted to Tweets associated with a specific location; however, the location can only be specified as a radius from a given latitude and longitude geo-point. We used Twitter's geo-restricted search by defining a Lat/Long point and a radius (in kilometres). In order to cover major areas in the UK we used the following four geo-restrictions: Latitude = 57.334942, Longitude = -4.395858, Radius = 253 km; Latitude = 55.288000, Longitude = -2.374374, Radius = 282 km; Latitude = 52.250808, Longitude = -0.660507, Radius = 198 km; Latitude = 51.953880, Longitude = -2.989608, Radius = 198 km. |
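The construction of the search string described above can be illustrated with a minimal Python sketch. This is not the Observatory's own code, which is not reproduced in this record: the function names `build_query` and `geocode_param` and the example keywords are hypothetical, and the `latitude,longitude,radiuskm` string assumes the format accepted by the `geocode` parameter of Twitter's standard search API. Multi-word phrases may need quoting, which the sketch does not attempt.

```python
from itertools import product


def build_query(group1, group2):
    """Pair every group 1 keyword with every group 2 keyword.

    Within a pair the space acts as a logical AND; pairs are joined with OR.
    """
    return " OR ".join(f"({a} {b})" for a, b in product(group1, group2))


def geocode_param(lat, lon, radius_km):
    """Format one geo-restriction as 'lat,lon,<radius>km' (assumed format)."""
    return f"{lat},{lon},{radius_km}km"


# Hypothetical keywords; the real lists are in fish-keywords.txt (group 1)
# and fake-keywords.txt (group 2), one keyword/phrase per line.
fish_keywords = ["cod", "salmon"]
fake_keywords = ["fake", "mislabelled"]
query = build_query(fish_keywords, fake_keywords)

# The four geo-restrictions listed in this record (latitude, longitude, km).
GEO_RESTRICTIONS = [
    (57.334942, -4.395858, 253),
    (55.288000, -2.374374, 282),
    (52.250808, -0.660507, 198),
    (51.953880, -2.989608, 198),
]
geocodes = [geocode_param(*g) for g in GEO_RESTRICTIONS]
```

With the two example lists, `query` is `(cod fake) OR (cod mislabelled) OR (salmon fake) OR (salmon mislabelled)`. The same query would then be issued once per geo-restriction.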
Observation unit: |
Individual, Organization, Event/Process, Geographic unit |
Kind of data: |
Text |
Type of data: |
Other surveys |
Resource language: |
English |
|
Data sourcing, processing and preparation: |
Duplicate Tweet IDs returned as a result of overlapping georadius areas were filtered out in the attached file.
|
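Because the search areas overlap, the same Tweet can be returned by more than one geo-restricted search. A minimal sketch of the kind of filtering described above follows; the function name `dedupe_ids` and the sample IDs are hypothetical, and the record does not state how the filtering was actually implemented.

```python
def dedupe_ids(batches):
    """Merge per-area lists of Tweet IDs, keeping the first occurrence of each ID."""
    seen = set()
    unique = []
    for batch in batches:
        for tweet_id in batch:
            if tweet_id not in seen:
                seen.add(tweet_id)
                unique.append(tweet_id)
    return unique


# Hypothetical IDs returned by two overlapping search areas.
area_a = ["1001", "1002"]
area_b = ["1002", "1003"]
merged = dedupe_ids([area_a, area_b])
```

Tweet IDs are kept as strings here so that large values are not altered by numeric conversion.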
Rights owners: |
Name | Affiliation | ORCID (as URL) |
---|
Edwards Peter | University of Aberdeen | |
|
Contact: |
Name | Email | Affiliation | ORCID (as URL) |
---|
Markovic, Milan | milan.markovic@abdn.ac.uk | University of Aberdeen | Unspecified |
|
Notes on access: |
The Data Collection is available to any user without the requirement for registration for download/access.
|
Publisher: |
UK Data Service
|