Syntactic Category

The Experts below are selected from a list of 11067 Experts worldwide ranked by ideXlab platform

Anne Christophe - One of the best experts on this subject based on the ideXlab platform.

  • 18‐month‐olds fail to use recent experience to infer the Syntactic Category of novel words
    Developmental Science, 2021
    Co-Authors: Naomi Havron, Mireille Babineau, Anne Christophe
    Abstract:

    Infants are able to use the contexts in which familiar words appear to guide their inferences about the Syntactic Category of novel words (e.g., “This is a” + “dax” -> dax = object). The current study examined whether 18-month-old infants can rapidly adapt these expectations by tracking the distribution of Syntactic structures in their input. In French, la petite can be followed by both nouns (la petite balle, “the little ball”) and verbs (la petite mange, “the little one is eating”). Infants were habituated to a novel word, as well as to familiar nouns or verbs (depending on the experimental group), all appearing after la petite. The familiar words served to create an expectation that la petite would be followed by either nouns or verbs. If infants can utilize their knowledge of a few frequent words to adjust their expectations, then they could use this information to infer the Syntactic Category of a novel word – and be surprised when the novel word is used in a context that is incongruent with their expectations. However, infants in both groups did not show a difference between noun and verb test trials. Thus, no evidence for adaptation-based learning was found. We propose that infants have to entertain strong expectations about Syntactic contexts before they can adapt these expectations based on recent input.

  • 18-month-olds fail to use recent experience to infer the Syntactic Category of novel words
    2020
    Co-Authors: Naomi Havron, Mireille Babineau, Anne Christophe
    Abstract:

    Infants are able to use the contexts in which familiar words appear to guide their inferences about the Syntactic Category of novel words (e.g., “This is a” + “dax” -> dax = object). The current study examined whether 18-month-old infants can rapidly adapt these expectations by tracking the distribution of Syntactic structures in their input. In French, la petite can be followed by both nouns (la petite balle, “the little ball”) and verbs (la petite mange, “the little one is eating”). Infants were habituated to a novel word, as well as to familiar nouns or verbs (depending on the experimental group), all appearing after la petite. The familiar words served to create an expectation that la petite would be followed by either nouns or verbs. If infants can utilize their knowledge of a few frequent words to adjust their expectations, then they could use this information to infer the Syntactic Category of a novel word – and be surprised when the novel word is used in a context that is incongruent with their expectations. However, infants in both groups did not show a difference between noun and verb test trials. Thus, no evidence for adaptation-based learning was found. We propose that infants have to entertain strong expectations about Syntactic contexts before they can adapt these expectations based on recent input.

  • Three- to Four-Year-Old Children Rapidly Adapt Their Predictions and Use Them to Learn Novel Word Meanings
    Child Development, 2019
    Co-Authors: Naomi Havron, Alex De Carvalho, Anne-Caroline Fiévet, Anne Christophe
    Abstract:

    Adults create and update predictions about what speakers will say next. The current study asks whether prediction can drive language acquisition, by testing whether 3-4-year-old children (n=45) adapt to recent information when learning novel words. The study used a Syntactic context which can precede both nouns and verbs to manipulate children’s predictions about what Syntactic Category will follow. Children for whom the Syntactic context predicted verbs were more likely to infer that a novel word appearing in this context referred to an action, than children for whom it predicted nouns. This suggests that children make rapid changes to their predictions, and use this information to learn novel information, supporting the role of prediction in language acquisition.

  • Phrasal prosody constrains syntactic analysis in toddlers
    Cognition, 2017
    Co-Authors: Alex De Carvalho, Isabelle Dautriche, Isabelle Lin, Anne Christophe
    Abstract:

    This study examined whether phrasal prosody can impact toddlers' syntactic analysis. French noun-verb homophones were used to create locally ambiguous test sentences (e.g., using the homophone as a noun: [le bébé souris] [a bien mangé] - [the baby mouse] [ate well] or using it as a verb: [le bébé] [sourit à sa maman] - [the baby] [smiles to his mother], where brackets indicate prosodic phrase boundaries). Although both sentences start with the same words (le-bébé-/suʁi/), they can be disambiguated by the prosodic boundary that either directly precedes the critical word /suʁi/ when it is a verb, or directly follows it when it is a noun. Across two experiments using an intermodal preferential looking procedure, 28-month-olds (Exp. 1 and 2) and 20-month-olds (Exp. 2) listened to the beginnings of these test sentences while watching two images displayed side-by-side on a TV screen: one associated with the noun interpretation of the ambiguous word (e.g., a mouse) and the other with the verb interpretation (e.g., a baby smiling). The results show that upon hearing the first words of these sentences, toddlers were able to correctly exploit prosodic information to access the syntactic structure of sentences, which in turn helped them to determine the syntactic category of the ambiguous word and to correctly identify its intended meaning: participants switched their eye-gaze toward the correct image based on the prosodic condition in which they heard the ambiguous target word. This provides evidence that during the first steps of language acquisition, toddlers are already able to exploit the prosodic structure of sentences to recover their syntactic structure and predict the syntactic category of upcoming words, an ability which would be extremely useful to discover the meaning of novel words.

  • Ambiguous function words do not prevent 18-month-olds from building accurate syntactic category expectations: An ERP study
    Neuropsychologia, 2017
    Co-Authors: Alex De Carvalho, Perrine Brusini, Ghislaine Dehaene-Lambertz, Marieke Van Heugten, François Goffinet, Anne-Caroline Fiévet, Anne Christophe
    Abstract:

    To comprehend language, listeners need to encode the relationship between words within sentences. This entails categorizing words into their appropriate word classes. Function words, consistently preceding words from specific categories (e.g., the ball_NOUN, I speak_VERB), provide invaluable information for this task, and children's sensitivity to such adjacent relationships develops early on in life. However, neighboring words are not the sole source of information regarding an item's word class. Here we examine whether young children also take into account preceding sentence context online during syntactic categorization. To address this question, we use the ambiguous French function word la which, depending on sentence context, can either be used as determiner (the, preceding nouns) or as object clitic (it, preceding verbs). French-learning 18-month-olds' evoked potentials (ERPs) were recorded while they listened to sentences featuring this ambiguous function word followed by either a noun or a verb (thus yielding a locally felicitous co-occurrence of la + noun or la + verb). Crucially, preceding sentence context rendered the sentence either grammatical or ungrammatical. Ungrammatical sentences elicited a late positivity (resembling a P600) that was not observed for grammatical sentences. Toddlers' analysis of the unfolding sentence was thus not limited to local co-occurrences, but rather took into account non-adjacent sentence context. These findings suggest that by 18 months of age, online word categorization is already surprisingly robust. This could be greatly beneficial for the acquisition of novel words.

Liina Pylkkänen - One of the best experts on this subject based on the ideXlab platform.

  • Left occipital and right frontal involvement in Syntactic Category prediction: MEG evidence from Standard Arabic.
    Neuropsychologia, 2019
    Co-Authors: Suhail Matar, Liina Pylkkänen, Alec Marantz
    Abstract:

    Though recent years have seen a growth in research on predictive processes in language comprehension, their scope and mechanisms remain partially elusive. While mechanisms involved in predicting specific words are relatively well understood, those underlying syntactic prediction are still unclear. In part, this is because of the difficulty in designing experiments that manipulate syntactic predictability while controlling other variables. In this MEG study, we achieved this with a manipulation of syntactic category predictability within fully well-formed expressions of Standard Arabic. Participants read sentences beginning with a subject-adjective context, in which the presence of at least one of two possible cues (gender-incongruity and/or an intervening relative pronoun) was sufficient for predicting a target word's syntactic category. Absence of both cues (i.e., congruent subject-adjective context with no relative pronoun) increased uncertainty about the target's syntactic category. Using source analysis, we compared activity evoked by targets with predictable and unpredictable categories in the occipital lobe. We found an interaction effect consistent with previous findings: in the primary visual cortex, an early evoked component (visual M100) is enhanced only when the syntactic category was unpredictable. We also compared responses to pre-target predictive and unpredictive contexts across five bilateral frontal and temporal regions. In the right-hemispheric frontal region, we found a temporal cluster (~230 ms after adjective onset), where unpredictive contexts elicited more activation than predictive contexts. By hypothesis elimination, we conclude that the most likely variable driving this effect is syntactic entropy. Our results show that predictive mechanisms recruited during reading also involve predicting upcoming syntactic categories, implicating at least two cortical regions: the left visual cortex and the right frontal cortex.

  • Before the N400: effects of lexical-semantic violations in visual cortex.
    Brain and Language, 2011
    Co-Authors: Suzanne Dikker, Liina Pylkkänen
    Abstract:

    There exists an increasing body of research demonstrating that language processing is aided by context-based predictions. Recent findings suggest that the brain generates estimates about the likely physical appearance of upcoming words based on Syntactic predictions: words that do not physically look like the expected Syntactic Category show increased amplitudes in the visual M100 component, the first salient MEG response to visual stimulation. This research asks whether violations of predictions based on lexical-semantic information might similarly generate early visual effects. In a picture-noun matching task, we found early visual effects for words that did not accurately describe the preceding pictures. These results demonstrate that, just like Syntactic predictions, lexical-semantic predictions can affect early visual processing around ∼100ms, suggesting that the M100 response is not exclusively tuned to recognizing visual features relevant to Syntactic Category analysis. Rather, the brain might generate predictions about upcoming visual input whenever it can. However, visual effects of lexical-semantic violations only occurred when a single lexical item could be predicted. We argue that this may be due to the fact that in natural language processing, there is typically no straightforward mapping between lexical-semantic fields (e.g., flowers) and visual or auditory forms (e.g., tulip, rose, magnolia). For Syntactic categories, in contrast, certain form features do reliably correlate with Category membership. This difference may, in part, explain why certain Syntactic effects typically occur much earlier than lexical-semantic effects.

  • Early Occipital Sensitivity to Syntactic Category Is Based on Form Typicality
    Psychological Science, 2010
    Co-Authors: Suzanne Dikker, Hugh Rabagliati, Thomas A. Farmer, Liina Pylkkänen
    Abstract:

    Syntactic factors can rapidly affect behavioral and neural responses during language processing; however, the mechanisms that allow this rapid extraction of Syntactically relevant information remain poorly understood. We addressed this issue using magnetoencephalography and found that an unexpected word Category (e.g., “The recently princess . . . ”) elicits enhanced activity in visual cortex as early as 120 ms after exposure, and that this activity occurs as a function of the compatibility of a word’s form with the form properties associated with a predicted word Category. Because no sensitivity to linguistic factors has been previously reported for words in isolation at this stage of visual analysis, we propose that predictions about upcoming Syntactic categories are translated into form-based estimates, which are made available to sensory cortices. This finding may be a key component to elucidating the mechanisms that allow the extreme rapidity and efficiency of language comprehension.

Frank Keller - One of the best experts on this subject based on the ideXlab platform.

  • Adding sentence types to a model of Syntactic Category acquisition.
    Topics in Cognitive Science, 2013
    Co-Authors: Stella Frank, Sharon Goldwater, Frank Keller
    Abstract:

    The acquisition of Syntactic categories is a crucial step in the process of acquiring syntax. At this stage, before a full grammar is available, only surface cues are available to the learner. Previous computational models have demonstrated that local contexts are informative for Syntactic categorization. However, local contexts are affected by sentence-level structure. In this paper, we add sentence type as an observed feature to a model of Syntactic Category acquisition, based on experimental evidence showing that pre-Syntactic children are able to distinguish sentence type using prosody and other cues. The model, a Bayesian Hidden Markov Model, allows for adding sentence type in a few different ways; we find that sentence type can aid Syntactic Category acquisition if it is used to characterize the differences in word order between sentence types. In these models, knowledge of sentence type permits similar gains to those found by extending the local context.

  • Using Sentence Type Information for Syntactic Category Acquisition
    Meeting of the Association for Computational Linguistics, 2010
    Co-Authors: Stella Frank, Sharon Goldwater, Frank Keller
    Abstract:

    In this paper we investigate a new source of information for Syntactic Category acquisition: sentence type (question, declarative, imperative). Sentence type correlates strongly with intonation patterns in most languages; we hypothesize that these intonation patterns are a valuable signal to a language learner, indicating different Syntactic patterns. To test this hypothesis, we train a Bayesian Hidden Markov Model (and variants) on child-directed speech. We first show that simply training a separate model for each sentence type decreases performance due to sparse data. As an alternative, we propose two new models based on the BHMM in which sentence type is an observed variable which influences either emission or transition probabilities. Both models outperform a standard BHMM on data from English, Cantonese, and Dutch. This suggests that sentence type information available from intonational cues may be helpful for Syntactic acquisition cross-linguistically.

  • CMCL@ACL - Using Sentence Type Information for Syntactic Category Acquisition
    2010
    Co-Authors: Stella Frank, Sharon Goldwater, Frank Keller
    Abstract:

    In this paper we investigate a new source of information for Syntactic Category acquisition: sentence type (question, declarative, imperative). Sentence type correlates strongly with intonation patterns in most languages; we hypothesize that these intonation patterns are a valuable signal to a language learner, indicating different Syntactic patterns. To test this hypothesis, we train a Bayesian Hidden Markov Model (and variants) on child-directed speech. We first show that simply training a separate model for each sentence type decreases performance due to sparse data. As an alternative, we propose two new models based on the BHMM in which sentence type is an observed variable which influences either emission or transition probabilities. Both models outperform a standard BHMM on data from English, Cantonese, and Dutch. This suggests that sentence type information available from intonational cues may be helpful for Syntactic acquisition cross-linguistically.

  • Evaluating Models of Syntactic Category Acquisition without Using a Gold Standard
    2009
    Co-Authors: Stella Frank, Sharon Goldwater, Frank Keller
    Abstract:

    Stella Frank (s.c.frank@sms.ed.ac.uk), Sharon Goldwater (sgwater@inf.ed.ac.uk), and Frank Keller (keller@inf.ed.ac.uk). School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, UK.

    Abstract

    A number of different measures have been proposed for evaluating computational models of human syntactic category acquisition. They all rely on a gold standard set of manually determined categories. However, children's syntactic categories change during language development, so evaluating against a fixed and final set of adult categories is not appropriate. In this paper, we propose a new measure, substitutable precision and recall, based on the idea that words which occur in similar syntactic environments share the same category. We use this measure to evaluate three standard category acquisition models (hierarchical clustering, frequent frames, Bayesian HMM) and show that the results correlate well with those obtained using two gold-standard-based measures.

    Introduction

    By the time children reach school age, they have achieved the remarkable feat of acquiring most of their native language, typically without explicit instruction. This includes the acquisition of syntactic categories (noun, verb, adjective, etc.). A number of computational models of category learning have been developed, most of which conceptualize the problem as one of grouping together words whose syntactic behavior is similar. Typically, the input for the model is taken from a corpus of child-directed speech, and clusters are created based on distributional information (Redington et al., 1998; Mintz, 2003; Parisien et al., 2008).

    A problem common to all existing models is the evaluation of the model clusters. Often researchers have tested the output of their models against gold-standard category assignments, such as that available in the CHILDES database (MacWhinney, 2000). These gold-standard categories are based on the intuition of human annotators and are representative of adult morphosyntactic knowledge. Therefore, this type of evaluation is not ideal for assessing the syntactic categories of children, as these may include linguistically valid distinctions not recognized by the gold standard. Conversely, the gold standard may make distinctions that children do not have, or only acquire during language development. For example, at the age of two, English-learning children have not fully acquired the verb category (Olguin & Tomasello, 1993), and functional categories such as determiners are acquired even later (Kemp et al., 2005).

    It is therefore highly desirable to develop an evaluation measure that does not make reference to an (adult) gold standard. On the other hand, the measure should give results that correlate with gold-standard-based measures, indicating that it is capable of capturing the linguistic distinctions inherent in the gold standard. Finally, the ideal measure needs to be applicable to a wide range of different acquisition models (e.g., it should not be limited to probabilistic models).

    This paper proposes a new evaluation measure which meets these criteria: substitutable precision and recall. It relies on a classical idea from linguistics, viz., that words which share the same syntactic category occur in similar syntactic environments. It does not require a gold standard, and therefore is suitable for evaluating pre-adult categories. At the same time, it yields results that correlate with gold-standard-based measures. We will show this by applying our new measure, as well as existing measures, to three standard models that discover syntactic categories in child-directed speech. This is the first time these models have been systematically compared; previous authors have used their own evaluation measures and only applied them to their own data sets, thus making a comparison across models difficult.

    Gold-standard-based Evaluation Measures

    In the following section we describe two evaluation measures that have been used to evaluate category acquisition models. Both require gold-standard labeled data, which is problematic from an acquisition standpoint for the reasons previously discussed. Hand-labeled data is also scarce, particularly for languages other than English. Some of the models we investigate categorize word types (a type being a word such as duck), whereas others categorize tokens (particular instances of duck). In order to compare both kinds of models, the measures we describe are used to score tokens, not types.

    Matched Accuracy. This measure is widely used in the field of Natural Language Processing for unsupervised part-of-speech tagging, in which the tokens of a text are automatically annotated ("tagged") with cluster numbers. To obtain the matched accuracy MA, the clusters induced by the model are mapped onto the gold-standard categories in order to provide a gold-standard part-of-speech label for each cluster. MA is then defined as the percentage of word tokens with correct category labels. The crucial aspect is the mapping between the clusters and the gold-standard categories. In this paper, we use many-to-one accuracy, where each model cluster is matched onto the gold-standard category with which it shares the most tokens. This can result in a situation where multiple clusters are mapped onto the same gold-standard category. This means the model is not penalized for creating more fine-grained clusters than the gold standard.
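
The many-to-one matched accuracy described in this excerpt can be illustrated with a short Python sketch. This is not the authors' code: the function name `many_to_one_accuracy`, the toy cluster numbers, and the toy gold tags are all assumptions made for illustration. The sketch maps each cluster to the gold category it shares the most tokens with, then reports the fraction of tokens whose mapped label matches the gold label (multiply by 100 for a percentage).

```python
from collections import Counter, defaultdict


def many_to_one_accuracy(cluster_ids, gold_tags):
    """Return (cluster -> gold tag mapping, token-level accuracy).

    Hypothetical helper: each induced cluster is mapped to the gold tag
    it co-occurs with most often; several clusters may share one tag.
    """
    if len(cluster_ids) != len(gold_tags):
        raise ValueError("cluster_ids and gold_tags must have the same length")
    if not cluster_ids:
        return {}, 0.0

    # Count, for every cluster, how many of its tokens carry each gold tag.
    tag_counts = defaultdict(Counter)
    for cluster, tag in zip(cluster_ids, gold_tags):
        tag_counts[cluster][tag] += 1

    # Many-to-one mapping: pick the most frequent gold tag per cluster.
    mapping = {
        cluster: counts.most_common(1)[0][0]
        for cluster, counts in tag_counts.items()
    }

    # A token is correct when its cluster's mapped tag equals its gold tag.
    correct = sum(mapping[c] == t for c, t in zip(cluster_ids, gold_tags))
    return mapping, correct / len(gold_tags)


if __name__ == "__main__":
    # Toy data (assumed): nine tokens, four induced clusters, three gold tags.
    clusters = [0, 1, 2, 0, 1, 2, 0, 3, 1]
    gold = ["DET", "NOUN", "VERB", "DET", "NOUN", "VERB", "DET", "NOUN", "VERB"]
    mapping, accuracy = many_to_one_accuracy(clusters, gold)
    print(mapping)
    print(accuracy)
```

In the toy data, clusters 1 and 3 both map to NOUN, so the finer-grained clustering is not penalized; the only error is the last token, a VERB that falls in a cluster mapped to NOUN, giving 8 correct tokens out of 9.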

Stella Frank - One of the best experts on this subject based on the ideXlab platform.

  • Bayesian models of Syntactic Category acquisition
    2013
    Co-Authors: Stella Frank
    Abstract:

    Discovering a word’s part of speech is an essential step in acquiring the grammar of a language. In this thesis we examine a variety of computational Bayesian models that use linguistic input available to children, in the form of transcribed child directed speech, to learn part of speech categories. Part of speech categories are characterised by contextual (distributional/Syntactic) and word-internal (morphological) similarity. In this thesis, we assume language learners will be aware of these types of cues, and investigate exactly how they can make use of them. Firstly, we enrich the context of a standard model (the Bayesian Hidden Markov Model) by adding sentence type to the wider distributional context. We show that children are exposed to a much more diverse set of sentence types than evident in standard corpora used for NLP tasks, and previous work suggests that they are aware of the differences between sentence type as signalled by prosody and pragmatics. Sentence type affects local context distributions, and as such can be informative when relying on local context for categorisation. Adding sentence types to the model improves performance, depending on how it is integrated into our models. We discuss how to incorporate novel features into the model structure we use in a flexible manner, and present a second model type that learns to use sentence type as a distinguishing cue only when it is informative. Secondly, we add a model of morphological segmentation to the part of speech categorisation model, in order to model joint learning of Syntactic categories and morphology. These two tasks are closely linked: categorising words into Syntactic categories is aided by morphological information, and finding morphological patterns in words is aided by knowing the Syntactic categories of those words. 
In our joint model, we find improved performance vis-a-vis single-task baselines, but the nature of the improvement depends on the morphological typology of the language being modelled. This is the first token-based joint model of unsupervised morphology and part of speech Category learning of which we are aware.

  • Adding sentence types to a model of Syntactic Category acquisition.
    Topics in Cognitive Science, 2013
    Co-Authors: Stella Frank, Sharon Goldwater, Frank Keller
    Abstract:

    The acquisition of Syntactic categories is a crucial step in the process of acquiring syntax. At this stage, before a full grammar is available, only surface cues are available to the learner. Previous computational models have demonstrated that local contexts are informative for Syntactic categorization. However, local contexts are affected by sentence-level structure. In this paper, we add sentence type as an observed feature to a model of Syntactic Category acquisition, based on experimental evidence showing that pre-Syntactic children are able to distinguish sentence type using prosody and other cues. The model, a Bayesian Hidden Markov Model, allows for adding sentence type in a few different ways; we find that sentence type can aid Syntactic Category acquisition if it is used to characterize the differences in word order between sentence types. In these models, knowledge of sentence type permits similar gains to those found by extending the local context.

  • Using Sentence Type Information for Syntactic Category Acquisition
    Meeting of the Association for Computational Linguistics, 2010
    Co-Authors: Stella Frank, Sharon Goldwater, Frank Keller
    Abstract:

    In this paper we investigate a new source of information for Syntactic Category acquisition: sentence type (question, declarative, imperative). Sentence type correlates strongly with intonation patterns in most languages; we hypothesize that these intonation patterns are a valuable signal to a language learner, indicating different Syntactic patterns. To test this hypothesis, we train a Bayesian Hidden Markov Model (and variants) on child-directed speech. We first show that simply training a separate model for each sentence type decreases performance due to sparse data. As an alternative, we propose two new models based on the BHMM in which sentence type is an observed variable which influences either emission or transition probabilities. Both models outperform a standard BHMM on data from English, Cantonese, and Dutch. This suggests that sentence type information available from intonational cues may be helpful for Syntactic acquisition cross-linguistically.

  • CMCL@ACL - Using Sentence Type Information for Syntactic Category Acquisition
    2010
    Co-Authors: Stella Frank, Sharon Goldwater, Frank Keller
    Abstract:

    In this paper we investigate a new source of information for Syntactic Category acquisition: sentence type (question, declarative, imperative). Sentence type correlates strongly with intonation patterns in most languages; we hypothesize that these intonation patterns are a valuable signal to a language learner, indicating different Syntactic patterns. To test this hypothesis, we train a Bayesian Hidden Markov Model (and variants) on child-directed speech. We first show that simply training a separate model for each sentence type decreases performance due to sparse data. As an alternative, we propose two new models based on the BHMM in which sentence type is an observed variable which influences either emission or transition probabilities. Both models outperform a standard BHMM on data from English, Cantonese, and Dutch. This suggests that sentence type information available from intonational cues may be helpful for Syntactic acquisition cross-linguistically.

  • Evaluating Models of Syntactic Category Acquisition without Using a Gold Standard
    2009
    Co-Authors: Stella Frank, Sharon Goldwater, Frank Keller
    Abstract:

    Evaluating Models of Syntactic Category Acquisition without Using a Gold Standard Stella Frank (s.c.frank@sms.ed.ac.uk) and Sharon Goldwater (sgwater@inf.ed.ac.uk) and Frank Keller (keller@inf.ed.ac.uk) School of Informatics, University of Edinburgh 10 Crichton Street, Edinburgh EH8 9AB, UK Abstract A number of different measures have been proposed for eval- uating computational models of human Syntactic Category ac- quisition. They all rely on a gold standard set of manually de- termined categories. However, children’s Syntactic categories change during language development, so evaluating against a fixed and final set of adult categories is not appropriate. In this paper, we propose a new measure, substitutable precision and recall, based on the idea that words which occur in similar Syntactic environments share the same Category. We use this measure to evaluate three standard Category acquisition mod- els (hierarchical clustering, frequent frames, Bayesian HMM) and show that the results correlate well with those obtained using two gold-standard-based measures. Introduction By the time children reach school age, they have achieved the remarkable feat of acquiring most of their native language, typically without explicit instruction. This includes the ac- quisition of Syntactic categories (noun, verb, adjective, etc.). A number of computational models of Category learning have been developed, most of which conceptualize the problem as one of grouping together words whose Syntactic behavior is similar. Typically, the input for the model is taken from a cor- pus of child-directed speech, and clusters are created based on distributional information (Redington et al., 1998; Mintz, 2003; Parisien et al., 2008). A problem common to all existing models is the evaluation of the model clusters. Often researchers have tested the output of their models against gold-standard Category assignments, such as that available in the CHILDES database (MacWhin- ney, 2000). 
    These gold-standard categories are based on the intuition of human annotators and are representative of adult morphosyntactic knowledge. Therefore, this type of evaluation is not ideal for assessing the Syntactic categories of children, as these may include linguistically valid distinctions not recognized by the gold standard. Conversely, the gold standard may make distinctions that children do not have, or only acquire during language development. For example, at the age of two, English-learning children have not fully acquired the verb Category (Olguin & Tomasello, 1993), and functional categories such as determiners are acquired even later (Kemp et al., 2005).

    It is therefore highly desirable to develop an evaluation measure that does not make reference to an (adult) gold standard. On the other hand, the measure should give results that correlate with gold-standard-based measures, indicating that it is capable of capturing the linguistic distinctions inherent in the gold standard. Finally, the ideal measure needs to be applicable to a wide range of different acquisition models (e.g., it should not be limited to probabilistic models).

    This paper proposes a new evaluation measure which meets these criteria: substitutable precision and recall. It relies on a classical idea from linguistics, viz., that words which share the same Syntactic Category occur in similar Syntactic environments. It does not require a gold standard, and therefore is suitable for evaluating pre-adult categories. At the same time, it yields results that correlate with gold-standard-based measures. We will show this by applying our new measure, as well as existing measures, to three standard models that discover Syntactic categories in child-directed speech. This is the first time these models have been systematically compared; previous authors have used their own evaluation measures and only applied them to their own data sets, thus making a comparison across models difficult.
    Gold-standard-based Evaluation Measures

    In the following section we describe two evaluation measures that have been used to evaluate Category acquisition models. Both require gold-standard labeled data, which is problematic from an acquisition standpoint for the reasons previously discussed. Hand-labeled data is also scarce, particularly for languages other than English. Some of the models we investigate categorize word types (a type being a word such as duck), whereas others categorize tokens (particular instances of duck). In order to compare both kinds of models, the measures we describe are used to score tokens, not types.

    Matched Accuracy. This measure is widely used in the field of Natural Language Processing for unsupervised part-of-speech tagging, in which the tokens of a text are automatically annotated (“tagged”) with cluster numbers. To obtain the matched accuracy MA, the clusters induced by the model are mapped onto the gold-standard categories in order to provide a gold-standard part-of-speech label for each cluster. MA is then defined as the percentage of word tokens with correct Category labels. The crucial aspect is the mapping between the clusters and the gold-standard categories. In this paper, we use many-to-one accuracy, where each model cluster is matched onto the gold-standard Category with which it shares the most tokens. This can result in a situation where multiple clusters are mapped onto the same gold-standard Category. This means the model is not penalized for creating more fine-grained clusters than the gold standard.
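
    The many-to-one matched accuracy described above can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `many_to_one_accuracy`, its arguments, and the toy data are assumptions made for illustration, and the score is returned as a fraction between 0 and 1 rather than as a percentage.

```python
from collections import Counter, defaultdict


def many_to_one_accuracy(cluster_ids, gold_tags):
    """Token-level many-to-one matched accuracy (illustrative sketch).

    cluster_ids: one model cluster label per token (hypothetical input format).
    gold_tags:   one gold-standard category per token, in the same order.
    Returns (accuracy as a fraction, cluster -> gold category mapping).
    """
    if len(cluster_ids) != len(gold_tags):
        raise ValueError("cluster_ids and gold_tags must have the same length")
    if not cluster_ids:
        raise ValueError("need at least one token")

    # Count how many tokens each cluster shares with each gold category.
    overlap = defaultdict(Counter)
    for cluster, tag in zip(cluster_ids, gold_tags):
        overlap[cluster][tag] += 1

    # Map every cluster onto the gold category it shares the most tokens with.
    # Several clusters may end up mapped onto the same category.
    mapping = {c: counts.most_common(1)[0][0] for c, counts in overlap.items()}

    # A token is correct when its cluster's mapped category equals its gold tag.
    correct = sum(mapping[c] == t for c, t in zip(cluster_ids, gold_tags))
    return correct / len(cluster_ids), mapping


# Toy data: six tokens, three model clusters, two gold categories.
clusters = [0, 0, 0, 1, 1, 2]
gold = ["N", "N", "V", "V", "V", "N"]
accuracy, mapping = many_to_one_accuracy(clusters, gold)
print(accuracy, mapping)
```

    In the toy data, clusters 0 and 2 are both mapped onto the same gold category, which shows why this measure does not penalize a model for producing finer-grained clusters than the gold standard.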

Alex De Carvalho - One of the best experts on this subject based on the ideXlab platform.

  • Three- to Four-Year-Old Children Rapidly Adapt Their Predictions and Use Them to Learn Novel Word Meanings
    Child Development, 2019
    Co-Authors: Naomi Havron, Alex De Carvalho, Anne-Caroline Fiévet, Anne Christophe
    Abstract:

    Adults create and update predictions about what speakers will say next. The current study asks whether prediction can drive language acquisition, by testing whether 3-4-year-old children (n=45) adapt to recent information when learning novel words. The study used a Syntactic context which can precede both nouns and verbs to manipulate children’s predictions about what Syntactic Category will follow. Children for whom the Syntactic context predicted verbs were more likely to infer that a novel word appearing in this context referred to an action, than children for whom it predicted nouns. This suggests that children make rapid changes to their predictions, and use this information to learn novel information, supporting the role of prediction in language acquisition.

  • Phrasal prosody constrains Syntactic analysis in toddlers
    Cognition, 2017
    Co-Authors: Alex De Carvalho, Isabelle Dautriche, Isabelle Lin, Anne Christophe
    Abstract:

    This study examined whether phrasal prosody can impact toddlers' Syntactic analysis. French noun-verb homophones were used to create locally ambiguous test sentences (e.g., using the homophone as a noun: [le bébé souris] [a bien mangé] - [the baby mouse] [ate well] or using it as a verb: [le bébé] [sourit à sa maman] - [the baby] [smiles to his mother], where brackets indicate prosodic phrase boundaries). Although both sentences start with the same words (le-bébé-/suʁi/), they can be disambiguated by the prosodic boundary that either directly precedes the critical word /suʁi/ when it is a verb, or directly follows it when it is a noun. Across two experiments using an intermodal preferential looking procedure, 28-month-olds (Exp. 1 and 2) and 20-month-olds (Exp. 2) listened to the beginnings of these test sentences while watching two images displayed side-by-side on a TV-screen: one associated with the noun interpretation of the ambiguous word (e.g., a mouse) and the other with the verb interpretation (e.g., a baby smiling). The results show that upon hearing the first words of these sentences, toddlers were able to correctly exploit prosodic information to access the Syntactic structure of sentences, which in turn helped them to determine the Syntactic Category of the ambiguous word and to correctly identify its intended meaning: participants switched their eye-gaze toward the correct image based on the prosodic condition in which they heard the ambiguous target word. This provides evidence that during the first steps of language acquisition, toddlers are already able to exploit the prosodic structure of sentences to recover their Syntactic structure and predict the Syntactic Category of upcoming words, an ability which would be extremely useful to discover the meaning of novel words.

  • Ambiguous function words do not prevent 18-month-olds from building accurate Syntactic Category expectations: An ERP study
    Neuropsychologia, 2017
    Co-Authors: Alex De Carvalho, Perrine Brusini, Ghislaine Dehaene-Lambertz, Marieke Van Heugten, François Goffinet, Anne-Caroline Fiévet, Anne Christophe
    Abstract:

    To comprehend language, listeners need to encode the relationship between words within sentences. This entails categorizing words into their appropriate word classes. Function words, consistently preceding words from specific categories (e.g., the ball [noun], I speak [verb]), provide invaluable information for this task, and children's sensitivity to such adjacent relationships develops early on in life. However, neighboring words are not the sole source of information regarding an item's word class. Here we examine whether young children also take into account preceding sentence context online during Syntactic categorization. To address this question, we use the ambiguous French function word la which, depending on sentence context, can either be used as determiner (the, preceding nouns) or as object clitic (it, preceding verbs). French-learning 18-month-olds' evoked potentials (ERPs) were recorded while they listened to sentences featuring this ambiguous function word followed by either a noun or a verb (thus yielding a locally felicitous co-occurrence of la + noun or la + verb). Crucially, preceding sentence context rendered the sentence either grammatical or ungrammatical. Ungrammatical sentences elicited a late positivity (resembling a P600) that was not observed for grammatical sentences. Toddlers' analysis of the unfolding sentence was thus not limited to local co-occurrences, but rather took into account non-adjacent sentence context. These findings suggest that by 18 months of age, online word categorization is already surprisingly robust. This could be greatly beneficial for the acquisition of novel words.

  • English-speaking preschoolers can use phrasal prosody for Syntactic parsing
    Journal of the Acoustical Society of America, 2016
    Co-Authors: Alex De Carvalho, Jeffrey Lidz, Lyn Tieu, Tonia Bleam, Anne Christophe
    Abstract:

    This study tested American preschoolers' ability to use phrasal prosody to constrain their Syntactic analysis of locally ambiguous sentences containing noun/verb homophones (e.g., [The baby flies] [hide in the shadows] vs [The baby] [flies his kite]; brackets indicate prosodic boundaries). The words following the homophone were masked, such that prosodic cues were the only disambiguating information. In an oral completion task, 4- to 5-year-olds successfully exploited the sentence's prosodic structure to assign the appropriate Syntactic Category to the target word, mirroring previous results in French (but challenging previous English-language results) and providing cross-linguistic evidence for the role of phrasal prosody in children's Syntactic analysis.

  • Preschoolers use phrasal prosody online to constrain Syntactic analysis
    Developmental Science, 2016
    Co-Authors: Alex De Carvalho, Isabelle Dautriche, Anne Christophe
    Abstract:

    Two experiments were conducted to investigate whether young children are able to take into account phrasal prosody when computing the Syntactic structure of a sentence. Pairs of French noun/verb homophones were selected to create locally ambiguous sentences ([la petite ferme] [est très jolie] 'the small farm is very nice' vs. [la petite] [ferme la fenêtre] 'the little girl closes the window'; brackets indicate prosodic boundaries). Although these sentences start with the same three words, ferme is a noun (farm) in the former but a verb (to close) in the latter case. The only difference between these sentence beginnings is the prosodic structure, which reflects the Syntactic structure (with a prosodic boundary just before the critical word when it is a verb, and just after it when it is a noun). Crucially, all words following the homophone were masked, such that prosodic cues were the only disambiguating information. Children successfully exploited prosodic information to assign the appropriate Syntactic Category to the target word, in both an oral completion task (4.5-year-olds, Experiment 1) and in a preferential looking paradigm with an eye-tracker (3.5-year-olds and 4.5-year-olds, Experiment 2). These results show that both groups of children exploit the position of a word within the prosodic structure when computing its Syntactic Category. In other words, even younger children of 3.5 years old exploit phrasal prosody online to constrain their Syntactic analysis. This ability to exploit phrasal prosody to compute Syntactic structure may help children parse sentences containing unknown words, and facilitate the acquisition of word meanings.