Intonational variation in Arabic Corpus 2011-2017

Hellmuth, Sam and Almbark, Rana (2019). Intonational variation in Arabic Corpus 2011-2017. [Data Collection]. Colchester, Essex: UK Data Archive. 10.5255/UKDA-SN-852878

Twenty-five countries have Arabic as an official language, but the dialects spoken vary greatly, and even within one country different accents are heard. Many features create the impression of 'a different accent', including how particular sounds are pronounced, where stress falls in a word, and what intonation pattern is used. There is extensive prior research on the first two of these for Arabic, but there are few descriptions of the intonation of individual dialects, and what is known is based on different data types, so direct comparisons cannot be made.

The Intonational Variation in Arabic project is hosted by the Department of Language and Linguistic Science at the University of York, a leading centre for sociophonetic research. Adapting methodology from earlier ESRC-funded work on English (see Related Resources), the project will generate a public-access corpus of Arabic speech, using a parallel set of sentences, stories and conversations, recorded with 18-30 year olds in eight regions of the Arab world. Additional data from older speakers (aged 40-60) will reveal changes in progress and local variation. Detailed prosodic analysis will yield intonational descriptions of individual dialects and cross-dialectal comparisons, for use by linguists, learners and teachers of Arabic, and other users.

Data description (abstract)

The Intonational Variation in Arabic corpus was collected using a multi-layered set of data collection instruments, following in the footsteps of the Intonational Variation in English (IViE) project. A range of tools was used to collect speech recordings that systematically vary certain variables of interest while controlling others, in a range of styles from scripted to spontaneous speech.

Data creators:

Creator Name	Affiliation	ORCID (as URL)
Hellmuth, Sam	University of York	http://orcid.org/0000-0002-0062-904X
Almbark, Rana	University of York	http://orcid.org/0000-0002-4784-2497

Sponsors:

Economic and Social Research Council

Grant reference:

ES/I010106/1

Topic classification:

Media, communication and language
Society and culture

Keywords:

Arabic, intonation, phonology, phonetics, dialects, speech, variation

Project title:

Intonational Variation in Arabic

Grant holders:

Sam Hellmuth

Project dates:

From	To
2 April 2011	30 June 2017

Date published:

24 Nov 2017 13:37

Last modified:

02 Jan 2019 11:47

Coverage and Methodology

Collection period:

Date from:	Date to:
2 April 2011	30 June 2017

Geographical area:

Middle East and North Africa

Country:

Morocco, Tunisia, Egypt, Jordan, Syria, Iraq, Kuwait, Oman

Data collection method:

The Intonational Variation in Arabic (IVAr) corpus data was collected using a multi-layered set of elicitation instruments, following in the footsteps of the Intonational Variation in English (IViE) project (http://www.phon.ox.ac.uk/IViE/). The data ranges from fully scripted read speech (scripted dialogue and read narrative) to (semi-)spontaneous unscripted speech (narrative retold from memory, map tasks and free conversation). Copies of all elicitation instruments are provided as part of the corpus.

The corpus comprises data from 12 speakers (6 female, 6 male) in each of ten datasets, covering eight regionally defined varieties of Arabic. We worked with a local fieldwork assistant or host in each recording location, whose role included recruitment of an opportunity sample of participants, controlling for age, gender and first language dialect of Arabic. All participants were aged 18 or over and provided informed consent for use and distribution of their speech data as part of the IVAr corpus. The research was approved by the University of York Health and Social Science Ethics Committee.

Speech recordings were made on location in the Middle East and North Africa. The datasets for speakers originally from Damascus and Baghdad had to be collected in Amman, Jordan, due to the prevailing security situation at the time. All other datasets were collected in the field, in the town or city where the speakers lived. Each recording session was run by a paid local fieldwork assistant who was a first language speaker of the dialect in question. Recordings were made with a Marantz PMD661 solid-state recorder directly to digital format (.wav) at 44.1 kHz, 16-bit, using Shure SM10A-CN headworn dynamic cardioid microphones. Tasks performed by participants in pairs were recorded on separate tracks in a stereo audio file, so that each individual’s speech can later be analysed separately.
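
Because paired tasks place each participant on a separate track of one stereo file, a first processing step is often to separate the two tracks. The following is a minimal sketch of one way to do this with the Python standard library, not a tool distributed with the corpus. It assumes a 16-bit stereo .wav file as described above; the function name and file paths are hypothetical, and which track holds which speaker is not stated here and should be checked against the corpus documentation.

```python
import array
import wave


def split_stereo(stereo_path, first_path, second_path):
    """Write each track of a 16-bit stereo WAV file to its own mono file."""
    with wave.open(stereo_path, "rb") as src:
        if src.getnchannels() != 2 or src.getsampwidth() != 2:
            raise ValueError("expected a 16-bit stereo WAV file")
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    # Samples are interleaved: track 1, track 2, track 1, track 2, ...
    samples = array.array("h")
    samples.frombytes(frames)
    tracks = (samples[0::2], samples[1::2])

    for path, track in zip((first_path, second_path), tracks):
        with wave.open(path, "wb") as dst:
            dst.setnchannels(1)       # mono output
            dst.setsampwidth(2)       # 16-bit samples
            dst.setframerate(rate)    # keep the original sampling rate
            dst.writeframes(track.tobytes())
```

A call such as `split_stereo("pair.wav", "speaker_a.wav", "speaker_b.wav")` (all file names illustrative) would then produce one mono file per participant at the original sampling rate.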

Observation unit:

Individual

Kind of data:

Text, Audio

Type of data:

Qualitative and mixed methods data

Resource language:

Arabic

Access and Administration

Data sourcing, processing and preparation:

Data were collected on location in each respective country, except for speakers of Syrian and Iraqi dialects, who were recorded in Jordan. All identifying information (e.g. mentions of participants' real names) has been redacted from audio files and transcripts. The text of the well-known Arabic folktale used to elicit read and retold speech (sto + ret) with each speaker group (and thus in each dialect) was adapted from and inspired by a version of the story published as 'Guha and the banana seller' in Abdel-Massih, E.T. 2011. An introduction to Egyptian Arabic. Ann Arbor, University of Michigan, pp. 269-270. It is used by permission of the University of Michigan Center for Middle Eastern & North African Studies.

Rights owners:

Name	Affiliation	ORCID (as URL)
Hellmuth, Sam	University of York	http://orcid.org/0000-0002-0062-904X

Contact:

Name	Email	Affiliation	ORCID (as URL)
Hellmuth, Sam	sam.hellmuth@york.ac.uk	University of York	http://orcid.org/0000-0002-0062-904X

Notes on access:

Some files can be downloaded by any user without registration; others require registration.

Publisher:

UK Data Archive

Related Resources

Website

Intonational Variation in Arabic on RCUK Gateway

Intonational Variation in Arabic project website

The IViE Corpus: English Intonation in the British Isles

'Guha and the banana seller' in Abdel-Massih, E.T. 2011. An introduction to Egyptian Arabic. Ann Arbor, University of Michigan, pp. 269-270
