Hellmuth, Sam and Almbark, Rana (2019). Intonational variation in Arabic Corpus 2011-2017. [Data Collection]. Colchester, Essex: UK Data Archive. 10.5255/UKDA-SN-852878
Twenty-five countries have Arabic as an official language, but the dialects spoken vary greatly, and even within one country different accents are heard. Many features create the impression of 'a different accent', including how particular sounds are pronounced, where stress falls in a word, and what intonation pattern is used. There is extensive prior research on the first two of these for Arabic, but there are few descriptions of the intonation of individual dialects, and what is known is based on different data types, so direct comparisons cannot be made.
The Intonational Variation in Arabic project is hosted by the Department of Language and Linguistic Science at the University of York, a leading centre for sociophonetic research. Adapting methodology from earlier ESRC-funded work on English (see Related Resources), the project will generate a public-access corpus of Arabic speech, using a parallel set of sentences, stories and conversations, recorded with speakers aged 18-30 in eight regions of the Arab world. Additional data from older speakers (aged 40-60) will reveal changes in progress and local variation. Detailed prosodic analysis will yield intonational descriptions of individual dialects and cross-dialectal comparisons, for use by linguists, learners and teachers of Arabic and other users.
Data description (abstract)
The Intonational Variation in Arabic corpus was collected using a multi-layered set of data collection instruments, following the approach of the Intonational Variation in English (IViE) project. A range of tools was used to collect speech recordings in styles ranging from scripted to spontaneous speech, systematically varying certain variables of interest while controlling others.
Data creators: |
|
Sponsors: |
Economic and Social Research Council
|
Grant reference: |
ES/I010106/1
|
Topic classification: |
Media, communication and language Society and culture
|
Keywords: |
Arabic, intonation, phonology, phonetics, dialects, speech, variation
|
Project title: |
Intonational Variation in Arabic
|
Grant holders: |
Sam Hellmuth
|
Project dates: |
From | To |
---|
2 April 2011 | 30 June 2017 |
|
Date published: |
24 Nov 2017 13:37
|
Last modified: |
02 Jan 2019 11:47
|
Collection period: |
Date from: | Date to: |
---|
2 April 2011 | 30 June 2017 |
|
Geographical area: |
Middle East and North Africa |
Country: |
Morocco, Tunisia, Egypt, Jordan, Syria, Iraq, Kuwait, Oman |
Data collection method: |
The Intonational Variation in Arabic (IVAr) corpus data was collected using a multi-layered set of elicitation instruments, following in the footsteps of the Intonational Variation in English (IViE) project (http://www.phon.ox.ac.uk/IViE/). The data ranges from fully scripted read speech (scripted dialogue and read narrative) to (semi-)spontaneous unscripted speech (narrative retold from memory, map tasks and free conversation). Copies of all elicitation instruments are provided as part of the corpus.
The corpus comprises data collected from 12 speakers (6 female, 6 male) in each of ten datasets across eight regionally defined varieties of Arabic. We worked with a local fieldwork assistant or host in each recording location, whose role included recruitment of an opportunity sample of participants, controlling for age, gender and first language dialect of Arabic. All participants were aged 18 or over and provided informed consent for use and distribution of their speech data as part of the IVAr corpus. The research was approved by the University of York Health and Social Science Ethics Committee.
Speech recordings were made on location in the Middle East and North Africa. It was necessary to collect the datasets for speakers originally from Damascus and Baghdad in Amman, Jordan, due to the prevailing security situation at the time. All other datasets were collected in the field, in the speakers' town or city of residence. Each recording session was run by a paid local fieldwork assistant who was a first language speaker of the dialect in question. Recordings were made directly to digital format (.wav) at 44.1 kHz, 16-bit, using a Marantz PMD661 solid state recorder and Shure SM10A-CN headworn dynamic cardioid microphones. Tasks performed by participants in pairs were recorded on separate tracks in a stereo audio file, so that each individual’s speech can later be analysed separately. |
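Because paired tasks are stored with one speaker per track of a stereo file, a first processing step is often to write each track to its own mono file. The following is a minimal sketch of that step using Python's standard `wave` module, assuming uncompressed PCM stereo input such as the 16-bit .wav format described above. The function names and file paths are hypothetical and are not part of the corpus; this record does not state which channel corresponds to which speaker, so check the corpus documentation before relying on that mapping.

```python
import wave


def split_stereo_frames(frames: bytes, sample_width: int = 2) -> tuple:
    """De-interleave raw stereo PCM frames into (left, right) mono byte strings."""
    frame_size = 2 * sample_width  # one sample per channel in every frame
    if len(frames) % frame_size != 0:
        raise ValueError("frame data is not a whole number of stereo frames")
    left = bytearray()
    right = bytearray()
    for i in range(0, len(frames), frame_size):
        left += frames[i:i + sample_width]
        right += frames[i + sample_width:i + frame_size]
    return bytes(left), bytes(right)


def split_stereo_wav(src_path: str, left_path: str, right_path: str) -> None:
    """Write each channel of a stereo WAV file to its own mono WAV file."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 2:
            raise ValueError("expected a two-channel (stereo) file")
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    left, right = split_stereo_frames(frames, sample_width)
    # Both output files keep the source sample width and sampling rate.
    for path, data in ((left_path, left), (right_path, right)):
        with wave.open(path, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(sample_width)
            dst.setframerate(frame_rate)
            dst.writeframes(data)
```

A call such as `split_stereo_wav("pair.wav", "speaker_a.wav", "speaker_b.wav")` (hypothetical file names) would produce two mono files that can be annotated or analysed independently. The whole file is read into memory, which is simple but may need to be replaced by chunked reading for very long recordings.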
Observation unit: |
Individual |
Kind of data: |
Text, Audio |
Type of data: |
Qualitative and mixed methods data |
Resource language: |
Arabic |
|
Data sourcing, processing and preparation: |
Data were collected on location in each respective country, except for speakers of Syrian and Iraqi dialects, who were recorded in Jordan.
All identifying information (e.g. mentions of participants' real names) has been redacted from audio files and transcripts.
The text of the well-known Arabic folktale used to elicit read and retold speech (sto + ret) with each speaker group (and thus in each dialect) was adapted from and inspired by a version of the story published as 'Guha and the banana seller' in Abdel-Massih, E.T. 2011. An introduction to Egyptian Arabic. Ann Arbor, University of Michigan. pp. 269-270. It is used by permission of the University of Michigan Center for Middle Eastern & North African Studies.
|
Rights owners: |
|
Contact: |
|
Notes on access: |
Some files can be downloaded or accessed by any user without registration; others require registration.
|
Publisher: |
UK Data Archive
|
|
Available Files
Data
Documentation
Read me