Freudenthal, Daniel and Pine, Julian and Jones, Gary and Gobet, Fernand (2021). International Centre for Language and Communicative Development: Using a Developmentally Realistic Model of Word Class Acquisition to Simulate Developmental Changes in the Noun-richness of Children's Early Language Across English, Dutch and German, 2014-2020. [Data Collection]. Colchester, Essex: UK Data Service. DOI: 10.5255/UKDA-SN-853923
The International Centre for Language and Communicative Development (LuCiD) will bring about a transformation in our understanding of how children learn to communicate, and deliver the crucial information needed to design effective interventions in child healthcare, communicative development and early years education.
Learning to use language to communicate is hugely important for society. Failure to develop language and communication skills at the right age is a major predictor of educational and social inequality in later life. To tackle this problem, we need to know the answers to a number of questions: How do children learn language from what they see and hear? What do measures of children's brain activity tell us about what they know? And how do differences between children and differences in their environments affect how children learn to talk? Answering these questions is a major challenge for researchers. LuCiD will bring together researchers from a wide range of different backgrounds to address this challenge.
The LuCiD Centre will be based in the North West of England and will coordinate five streams of research in the UK and abroad. It will use multiple methods to address central issues, create new technology products, and communicate evidence-based information directly to other researchers and to parents, practitioners and policy-makers.
LuCiD's RESEARCH AGENDA will address four key questions in language and communicative development:
1. ENVIRONMENT: How do children combine the different kinds of information that they see and hear to learn language?
2. KNOWLEDGE: How do children learn the word meanings and grammatical categories of their language?
3. COMMUNICATION: How do children learn to use their language to communicate effectively?
4. VARIATION: How do children learn languages with different structures and in different cultural environments?
The fifth stream, the LANGUAGE 0-5 PROJECT, will connect the other four streams. It will follow 80 English learning children from 6 months to 5 years, studying how and why some children's language development is different from others. A key feature of this project is that the children will take part in studies within the other four streams. This will enable us to build a complete picture of language development from the very beginning through to school readiness.
Applying different methods to study children's language development will constrain the types of explanations that can be proposed, helping us create much more accurate theories of language development. We will observe and record children in natural interaction as well as studying their language in more controlled experiments, using behavioural measures and correlations with brain activity (EEG). Transcripts of children's language and interaction will be analysed and used to model how these two are related using powerful computer algorithms.
LuCiD's TECHNOLOGY AGENDA will develop new multi-method approaches and create new technology products for researchers, healthcare and education professionals. We will build a 'big data' management and sharing system to make all our data freely available; create a toolkit of software (LANGUAGE RESEARCHER'S TOOLKIT) so that researchers can analyse speech more easily and more accurately; and develop a smartphone app (the BABYTALK APP) that will allow parents, researchers and practitioners to monitor, assess and promote children's language development.
With the help of six IMPACT CHAMPIONS, LuCiD's COMMUNICATIONS AGENDA will ensure that parents know how they can best help their children learn to talk, and give healthcare and education professionals and policy-makers the information they need to create intervention programmes that are firmly rooted in the latest research findings.
Data description (abstract)
We examine the success of developmental distributional analysis in English, German and Dutch. We embed the mechanism for distributional analysis within an existing model of language acquisition (MOSAIC) that encodes increasingly long utterances, and compare results against a measure of ‘noun richness’ in child speech. We show that, cross-linguistically, the mechanism’s success in building an early noun class is inversely related to the complexity of the determiner and noun gender system, and that merging of determiners gives very similar results across languages. These results suggest that children may represent grammatical categories at multiple levels of abstraction that reflect both the larger category as well as its finer structure.
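As a rough illustration of why merging determiners could matter for distributional analysis, the toy sketch below compares the preceding-word contexts of three German nouns with and without treating the determiners as one class. It is not the MOSAIC mechanism: the function names (`context_sets`, `overlap`), the Jaccard overlap measure, the `DET` label and the three-utterance sample are all assumptions made for this example.

```python
from collections import defaultdict


def context_sets(utterances, merge=None):
    """Map each word to the set of words that immediately precede it.

    `merge` is a hypothetical mapping used to collapse several word forms
    (here, determiners) into a single token before contexts are collected.
    """
    merge = merge or {}
    preceding = defaultdict(set)
    for utterance in utterances:
        tokens = [merge.get(t, t) for t in utterance.lower().split()]
        for prev, word in zip(tokens, tokens[1:]):
            preceding[word].add(prev)
    return preceding


def overlap(a, b):
    """Jaccard overlap between two context sets (an assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Invented toy input: one noun per grammatical gender.
utterances = ["der Hund", "die Katze", "das Haus"]

plain = context_sets(utterances)
merged = context_sets(
    utterances, merge={"der": "DET", "die": "DET", "das": "DET"}
)

print(overlap(plain["hund"], plain["katze"]))
print(overlap(merged["hund"], merged["katze"]))
```

With separate determiners the nouns share no preceding context, so a context-overlap criterion has nothing to link them; once the determiners are merged, the nouns occur in the same context and can be grouped.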
We also examine how a mechanism that learns word classes from distributional information can contribute to the simulation of child language. Using a novel measure of noun richness, it is shown that the ratio of nouns to verbs in young children’s speech is considerably higher than in adult speech. Simulations with MOSAIC show that this effect can be partially (but not completely) explained by an utterance-final bias in learning. The remainder of the effect is explained by the early emergence of a productive noun category, which can be learned through distributional analysis.
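The record does not define the noun richness measure itself. Purely as an illustration, the sketch below assumes it can be approximated as the proportion of nouns among nouns plus main verbs in part-of-speech tagged utterances. The function name `noun_richness`, the tag labels `N` and `V`, and the sample utterances are invented for this example and are not taken from the data collection.

```python
def noun_richness(tagged_utterances):
    """Proportion of nouns among nouns plus main verbs.

    Assumes each utterance is a list of (word, tag) pairs, with the
    hypothetical tags "N" for nouns and "V" for main verbs. All other
    tags are ignored.
    """
    nouns = sum(1 for utt in tagged_utterances for _, tag in utt if tag == "N")
    verbs = sum(1 for utt in tagged_utterances for _, tag in utt if tag == "V")
    total = nouns + verbs
    return nouns / total if total else 0.0


# Invented toy samples, not real corpus data.
child = [
    [("doggie", "N")],
    [("want", "V"), ("ball", "N")],
    [("ball", "N")],
]
adult = [
    [("do", "AUX"), ("you", "PRO"), ("want", "V"), ("the", "DET"), ("ball", "N")],
    [("look", "V"), ("at", "P"), ("that", "DET")],
]

print(noun_richness(child))
print(noun_richness(adult))
```

A ratio bounded between 0 and 1 is used here only because it stays defined when a sample contains no verbs; the published measure may be computed differently.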
Data creators: |
Creator Name | Affiliation | ORCID (as URL) |
---|
Freudenthal Daniel | University of Liverpool | |
Pine Julian | University of Liverpool | |
Jones Gary | Nottingham Trent University | |
Gobet Fernand | University of Liverpool | |
|
Sponsors: |
Economic and Social Research Council
|
Grant reference: |
ES/L008955/1
|
Topic classification: |
Psychology
|
Keywords: |
LANGUAGE, ENGLISH (LANGUAGE), DUTCH (LANGUAGE), GERMAN (LANGUAGE), CHILD DEVELOPMENT, LANGUAGE DEVELOPMENT, LINGUISTICS, LINGUISTIC ANALYSIS, MODELLING, CHILDREN
|
Project title: |
The International Centre for Language and Communicative Development
|
Grant holders: |
Elena Lieven, Bob McMurray, Jeffrey Elman, Gert Westermann, Morten H Christiansen, Thea Cameron-Faulkner, Fernand Gobet, Ludovica Serratrice, Sabine Stoll, Meredith Rowe, Padraic Monaghan, Michael Tomasello, Ben Ambridge, Silke Brandt, Anna Theakston, Eugenio Parise, Caroline Frances Rowland, Colin James Bannard, Grzegorz Krajewski, Franklin Chang, Floriana Grasso, Evan James Kidd, Julian Mark Pine, Arielle Borovsky, Vincent Michael Reid, Katherine Alcock, Daniel Freudenthal
|
Project dates: |
From | To |
---|
1 September 2014 | 31 May 2020 |
|
Date published: |
31 May 2021 11:53
|
Last modified: |
01 Jul 2021 11:11
|
Collection period: |
Date from: | Date to: |
---|
1 September 2014 | 31 May 2020 |
|
Country: |
United Kingdom |
Data collection method: |
For English we selected the 6 largest sub-corpora from the Manchester corpus (Theakston et al., 2001). For German we selected the Rigol corpus, consisting of 4 children with roughly 45,000 child-directed utterances per child. For Dutch, we selected the two children from the Van Kampen corpus; these corpora contain 65,000 and 25,000 maternal utterances. Analyses concerned children's and their mothers' cross-linguistic use of nouns and main verbs; whether early productivity around nouns can provide an additional source of noun richness; distributional analysis at several points in MOSAIC model development; and whether distributional analysis is sufficient to explain the pattern of noun richness displayed by children. |
Observation unit: |
Other |
Kind of data: |
Other |
Type of data: |
Experimental data
|
Resource language: |
English |
|
Data sourcing, processing and preparation: |
Questions regarding these materials should be addressed to Daniel Freudenthal (d.freudenthal@liverpool.ac.uk)
|
Rights owners: |
Name | Affiliation | ORCID (as URL) |
---|
Freudenthal Daniel | University of Liverpool | |
Pine Julian | University of Liverpool | |
|
Contact: |
Name | Email | Affiliation | ORCID (as URL) |
---|
Freudenthal, Daniel | d.freudenthal@liverpool.ac.uk | University of Liverpool | Unspecified |
|
Notes on access: |
The Data Collection is available to any user without the requirement for registration for download/access.
|
Publisher: |
UK Data Service
|
|
Available Files
Data and documentation bundle
Read me
Data collections
Publications
Website