Grammatical Relation

The Experts below are selected from a list of 2151 Experts worldwide, ranked by the ideXlab platform.

Stephen Clark - One of the best experts on this subject based on the ideXlab platform.

Arry Akhmad Arman - One of the best experts on this subject based on the ideXlab platform.

  • O-COCOSDA/CASLRE - Extended word similarity based clustering on unsupervised PoS induction to improve English-Indonesian statistical machine translation
    2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA CASLRE), 2013
    Co-Authors: Herry Sujaini, Ayu Purwarianti, Arry Akhmad Arman, Kuspriyanto
    Abstract:

    In this paper, we present an unsupervised Part-of-Speech (PoS) induction algorithm to improve translation quality in statistical machine translation. The proposed algorithm is an extension of the Word-Similarity-Based (WSB) clustering algorithm. In the clustering, the similarity between words is measured by their Grammatical Relation with other words. The Grammatical Relation is represented as the n-gram Relation. We extend WSB clustering by taking the previous words into account when measuring the Grammatical Relation. The clustering results are then used in English-Indonesian statistical machine translation. The experiments were conducted using MOSES as the machine translation decoder and were evaluated by BLEU score. Using 14,000 English-Indonesian sentence pairs, the clustering improved the BLEU score by 2.07%.
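
    The idea of measuring word similarity through n-gram context can be illustrated with a minimal sketch. This is not the authors' WSB algorithm: the cosine measure, the toy corpus, and all function names are assumptions chosen for illustration. Each word is described by counts of the words that precede and follow it, and two words are considered similar when those counts overlap.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences):
    """Count previous-word and next-word neighbours for every word (bigram contexts)."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        # Sentence boundary markers give first and last words a context too.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            word = tokens[i]
            vectors[word][("prev", tokens[i - 1])] += 1
            vectors[word][("next", tokens[i + 1])] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (an assumed measure)."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical toy corpus, not data from the paper.
TOY_CORPUS = ["the cat sleeps", "the dog sleeps", "a cat eats", "a dog eats"]

vectors = context_vectors(TOY_CORPUS)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])  # identical contexts
sim_cat_the = cosine(vectors["cat"], vectors["the"])  # no shared contexts
```

    In this toy corpus "cat" and "dog" occur in the same contexts and would fall into the same cluster, while "cat" and "the" share none. A clustering step over such similarities would then yield the induced word classes.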

Kenji Sagae - One of the best experts on this subject based on the ideXlab platform.

  • LREC - GENIA-GR: a Grammatical Relation Corpus for Parser Evaluation in the Biomedical Domain.
    2020
    Co-Authors: Yuka Tateisi, Yusuke Miyao, Kenji Sagae, Junichi Tsujii
    Abstract:

    We report the construction of a corpus for parser evaluation in the biomedical domain. A 50-abstract subset (492 sentences) of the GENIA corpus (Kim et al., 2003) is annotated with labeled head-dependent Relations using the Grammatical Relations (GR) evaluation scheme (Carroll et al., 1998), which has been used for parser evaluation in the newswire domain.
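
    Parser evaluation against labeled head-dependent relations is commonly scored with precision, recall, and F-score over relation triples. The sketch below assumes that setup; the triple format, the relation labels, and the example sentence are illustrative assumptions and are not taken from the GENIA-GR corpus or from the scheme's actual specification.

```python
def gr_scores(gold, predicted):
    """Precision, recall and F1 over sets of (label, head, dependent) triples."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)  # triples that match exactly
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical annotations for "the protein binds the receptor".
gold = {
    ("ncsubj", "binds", "protein"),
    ("dobj", "binds", "receptor"),
    ("det", "receptor", "the"),
}
predicted = {
    ("ncsubj", "binds", "protein"),
    ("dobj", "binds", "receptor"),
    ("ncmod", "binds", "the"),  # wrong label and head
}

precision, recall, f1 = gr_scores(gold, predicted)
```

    Here two of three predicted triples match the gold annotation, so precision, recall, and F1 all come out at two thirds.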

  • IWPT - Combining Rule-based and Data-driven Techniques for Grammatical Relation Extraction in Spoken Language.
    2020
    Co-Authors: Kenji Sagae, Alon Lavie
    Abstract:

    We investigate an aspect of the Relationship between parsing and corpus-based methods in NLP that has received relatively little attention: coverage augmentation in rule-based parsers. In the specific task of determining Grammatical Relations (such as subjects and objects) in transcribed spoken language, we show that a combination of rule-based and corpus-based approaches, where a rule-based system is used as the teacher (or an automatic data annotator) to a corpus-based system, outperforms either system in isolation.

  • Morphosyntactic annotation of CHILDES transcripts
    Journal of Child Language, 2010
    Co-Authors: Kenji Sagae, Alon Lavie, Eric Davis, Brian MacWhinney, Shuly Wintner
    Abstract:

    Corpora of child language are essential for research in child language acquisition and psycholinguistics. Linguistic annotation of the corpora provides researchers with better means for exploring the development of Grammatical constructions and their usage. We describe a project whose goal is to annotate the English section of the CHILDES database with Grammatical Relations in the form of labeled dependency structures. We have produced a corpus of over 18,800 utterances (approximately 65,000 words) with manually curated gold-standard Grammatical Relation annotations. Using this corpus, we have developed a highly accurate data-driven parser for the English CHILDES data, which we used to automatically annotate the remainder of the English section of CHILDES. We have also extended the parser to Spanish, and are currently working on supporting more languages. The parser and the manually and automatically annotated data are freely available for research purposes.

  • GENIA-GR: a Grammatical Relation corpus for parser evaluation in the biomedical domain
    Language Resources and Evaluation, 2008
    Co-Authors: Yuka Tateisi, Yusuke Miyao, Kenji Sagae, Junichi Tsujii
    Abstract:

    We report the construction of a corpus for parser evaluation in the biomedical domain. A 50-abstract subset (492 sentences) of the GENIA corpus (Kim et al., 2003) is annotated with labeled head-dependent Relations using the Grammatical Relations (GR) evaluation scheme (Carroll et al., 1998), which has been used for parser evaluation in the newswire domain.

  • High-accuracy Annotation and Parsing of CHILDES Transcripts
    Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition - CACLA '07, 2007
    Co-Authors: Kenji Sagae, Alon Lavie, Eric Davis, Brian MacWhinney, Shuly Wintner
    Abstract:

    Corpora of child language are essential for psycholinguistic research. Linguistic annotation of the corpora provides researchers with better means for exploring the development of Grammatical constructions and their usage. We describe an ongoing project that aims to annotate the English section of the CHILDES database with Grammatical Relations in the form of labeled dependency structures. To date, we have produced a corpus of over 65,000 words with manually curated gold-standard Grammatical Relation annotations. Using this corpus, we have developed a highly accurate data-driven parser for English CHILDES data. The parser and the manually annotated data are freely available for research purposes.

Herry Sujaini - One of the best experts on this subject based on the ideXlab platform.

  • O-COCOSDA/CASLRE - Extended word similarity based clustering on unsupervised PoS induction to improve English-Indonesian statistical machine translation
    2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA CASLRE), 2013
    Co-Authors: Herry Sujaini, Ayu Purwarianti, Arry Akhmad Arman, Kuspriyanto
    Abstract:

    In this paper, we present an unsupervised Part-of-Speech (PoS) induction algorithm to improve translation quality in statistical machine translation. The proposed algorithm is an extension of the Word-Similarity-Based (WSB) clustering algorithm. In the clustering, the similarity between words is measured by their Grammatical Relation with other words. The Grammatical Relation is represented as the n-gram Relation. We extend WSB clustering by taking the previous words into account when measuring the Grammatical Relation. The clustering results are then used in English-Indonesian statistical machine translation. The experiments were conducted using MOSES as the machine translation decoder and were evaluated by BLEU score. Using 14,000 English-Indonesian sentence pairs, the clustering improved the BLEU score by 2.07%.

Laura Rimell - One of the best experts on this subject based on the ideXlab platform.