The Experts below are selected from a list of 2151 Experts worldwide ranked by ideXlab platform
Stephen Clark - One of the best experts on this subject based on the ideXlab platform.
-
Cambridge: Parser Evaluation Using Textual Entailment by Grammatical Relation Comparison
Meeting of the Association for Computational Linguistics, 2010
Co-Authors: Laura Rimell, Stephen Clark
Abstract: This paper describes the Cambridge submission to the SemEval-2010 Parser Evaluation using Textual Entailment (PETE) task. We used a simple definition of entailment, parsing both T and H with the C&C parser and checking whether the core grammatical relations (subject and object) produced for H were a subset of those for T. This simple system achieved the top score for the task out of those systems submitted. We analyze the errors made by the system and the potential role of the task in parser evaluation.
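The subset test described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' system: the (label, head, dependent) tuple format, the label names "subj" and "obj", and the hand-written relations are all assumptions made for the example, and no parser is actually run.

```python
from typing import FrozenSet, Iterable, Tuple

# A grammatical relation as (label, head, dependent). This tuple layout and the
# label names below are assumptions for illustration, not real parser output.
GR = Tuple[str, str, str]

CORE_LABELS = frozenset({"subj", "obj"})


def core_relations(relations: Iterable[GR]) -> FrozenSet[GR]:
    """Keep only subject and object relations, lower-casing the words."""
    return frozenset(
        (label, head.lower(), dep.lower())
        for label, head, dep in relations
        if label in CORE_LABELS
    )


def entails(text_grs: Iterable[GR], hypothesis_grs: Iterable[GR]) -> bool:
    """Return True if every core relation of H also appears among those of T.

    Note that a hypothesis with no core relations is trivially entailed here.
    """
    return core_relations(hypothesis_grs) <= core_relations(text_grs)


# Hand-written relations for T = "The cat that chased the mouse slept."
t_grs = [
    ("det", "cat", "the"),
    ("subj", "chased", "cat"),
    ("obj", "chased", "mouse"),
    ("subj", "slept", "cat"),
]
# H1 = "The cat slept."  H2 = "The mouse slept."
h1_grs = [("det", "cat", "The"), ("subj", "slept", "cat")]
h2_grs = [("det", "mouse", "The"), ("subj", "slept", "mouse")]

h1_entailed = entails(t_grs, h1_grs)
h2_entailed = entails(t_grs, h2_grs)
```

Under these assumed relations, H1 counts as entailed because its only core relation also occurs for T, while H2 does not, since T has no relation making "mouse" the subject of "slept".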
-
SemEval@ACL - Cambridge: Parser Evaluation Using Textual Entailment by Grammatical Relation Comparison
2010
Co-Authors: Laura Rimell, Stephen Clark
Abstract: This paper describes the Cambridge submission to the SemEval-2010 Parser Evaluation using Textual Entailment (PETE) task. We used a simple definition of entailment, parsing both T and H with the C&C parser and checking whether the core grammatical relations (subject and object) produced for H were a subset of those for T. This simple system achieved the top score for the task out of those systems submitted. We analyze the errors made by the system and the potential role of the task in parser evaluation.
Arry Akhmad Arman - One of the best experts on this subject based on the ideXlab platform.
-
O-COCOSDA/CASLRE - Extended word similarity based clustering on unsupervised PoS induction to improve English-Indonesian statistical machine translation
2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA CASLRE), 2013
Co-Authors: Herry Sujaini, Ayu Purwarianti, Arry Akhmad Arman, Kuspriyanto
Abstract: In this paper, we present an unsupervised part-of-speech (PoS) induction algorithm to improve translation quality in statistical machine translation. The proposed algorithm is an extension of Word-Similarity-Based (WSB) clustering. In the clustering, the similarity between words is measured by their grammatical relation with other words. The grammatical relation is represented as an n-gram relation. We extend WSB clustering by taking the previous words into account when measuring the grammatical relation. The clustering results are then used in English-Indonesian statistical machine translation. The experiments were conducted using Moses as the machine translation decoder and were evaluated by BLEU score. Using 14,000 English-Indonesian sentence pairs, the clustering improved the BLEU score by 2.07%.
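The abstract does not spell out the similarity measure, so the following Python sketch is only a rough illustration of the general idea of comparing words by their n-gram neighbours, with an option to include the previous word as well as the next one. The bigram context window, the cosine measure, the function names, and the toy sentences are all assumptions for the example and should not be read as the WSB algorithm itself.

```python
from collections import Counter
from math import sqrt
from typing import Dict, List


def context_vectors(sentences: List[List[str]], use_previous: bool = True) -> Dict[str, Counter]:
    """Count, for each word, the words seen immediately after it and,
    if use_previous is set, the words seen immediately before it."""
    vectors: Dict[str, Counter] = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            vec = vectors.setdefault(word, Counter())
            if i + 1 < len(tokens):
                vec[("next", tokens[i + 1])] += 1
            if use_previous and i > 0:
                vec[("prev", tokens[i - 1])] += 1
    return vectors


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (an assumed measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Toy corpus, invented for the example.
corpus = [
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
    ["a", "cat", "runs"],
]
vectors = context_vectors(corpus)
cat_dog = cosine(vectors["cat"], vectors["dog"])
cat_the = cosine(vectors["cat"], vectors["the"])
```

In this toy corpus, "cat" and "dog" share neighbours on both sides and so receive a positive similarity, while "cat" and "the" share none. Words with similar scores could then be grouped by any standard clustering procedure to obtain induced PoS-like classes.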
-
Extended word similarity based clustering on unsupervised PoS induction to improve English-Indonesian statistical machine translation
2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA CASLRE), 2013
Co-Authors: Herry Sujaini, Ayu Purwarianti, Arry Akhmad Arman
Abstract: In this paper, we present an unsupervised part-of-speech (PoS) induction algorithm to improve translation quality in statistical machine translation. The proposed algorithm is an extension of Word-Similarity-Based (WSB) clustering. In the clustering, the similarity between words is measured by their grammatical relation with other words. The grammatical relation is represented as an n-gram relation. We extend WSB clustering by taking the previous words into account when measuring the grammatical relation. The clustering results are then used in English-Indonesian statistical machine translation. The experiments were conducted using Moses as the machine translation decoder and were evaluated by BLEU score. Using 14,000 English-Indonesian sentence pairs, the clustering improved the BLEU score by 2.07%.
Kenji Sagae - One of the best experts on this subject based on the ideXlab platform.
-
LREC - GENIA-GR: a Grammatical Relation Corpus for Parser Evaluation in the Biomedical Domain.
2020
Co-Authors: Yuka Tateisi, Yusuke Miyao, Kenji Sagae, Junichi Tsujii
Abstract: We report the construction of a corpus for parser evaluation in the biomedical domain. A 50-abstract subset (492 sentences) of the GENIA corpus (Kim et al., 2003) is annotated with labeled head-dependent relations using the grammatical relations (GR) evaluation scheme (Carroll et al., 1998), which has been used for parser evaluation in the newswire domain.
-
IWPT - Combining Rule-based and Data-driven Techniques for Grammatical Relation Extraction in Spoken Language.
2020
Co-Authors: Kenji Sagae, Alon Lavie
Abstract: We investigate an aspect of the relationship between parsing and corpus-based methods in NLP that has received relatively little attention: coverage augmentation in rule-based parsers. In the specific task of determining grammatical relations (such as subjects and objects) in transcribed spoken language, we show that a combination of rule-based and corpus-based approaches, where a rule-based system is used as the teacher (or an automatic data annotator) to a corpus-based system, outperforms either system in isolation.
-
Morphosyntactic annotation of CHILDES transcripts
Journal of Child Language, 2010
Co-Authors: Kenji Sagae, Alon Lavie, Eric Davis, Brian MacWhinney, Shuly Wintner
Abstract: Corpora of child language are essential for research in child language acquisition and psycholinguistics. Linguistic annotation of the corpora provides researchers with better means for exploring the development of grammatical constructions and their usage. We describe a project whose goal is to annotate the English section of the CHILDES database with grammatical relations in the form of labeled dependency structures. We have produced a corpus of over 18,800 utterances (approximately 65,000 words) with manually curated gold-standard grammatical relation annotations. Using this corpus, we have developed a highly accurate data-driven parser for the English CHILDES data, which we used to automatically annotate the remainder of the English section of CHILDES. We have also extended the parser to Spanish, and are currently working on supporting more languages. The parser and the manually and automatically annotated data are freely available for research purposes.
-
GENIA-GR: A Grammatical Relation Corpus for Parser Evaluation in the Biomedical Domain
Language Resources and Evaluation, 2008
Co-Authors: Yuka Tateisi, Yusuke Miyao, Kenji Sagae, Junichi Tsujii
Abstract: We report the construction of a corpus for parser evaluation in the biomedical domain. A 50-abstract subset (492 sentences) of the GENIA corpus (Kim et al., 2003) is annotated with labeled head-dependent relations using the grammatical relations (GR) evaluation scheme (Carroll et al., 1998), which has been used for parser evaluation in the newswire domain.
-
High-accuracy Annotation and Parsing of CHILDES Transcripts
Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition - CACLA '07, 2007
Co-Authors: Kenji Sagae, Alon Lavie, Eric Davis, Brian MacWhinney, Shuly Wintner
Abstract: Corpora of child language are essential for psycholinguistic research. Linguistic annotation of the corpora provides researchers with better means for exploring the development of grammatical constructions and their usage. We describe an ongoing project that aims to annotate the English section of the CHILDES database with grammatical relations in the form of labeled dependency structures. To date, we have produced a corpus of over 65,000 words with manually curated gold-standard grammatical relation annotations. Using this corpus, we have developed a highly accurate data-driven parser for English CHILDES data. The parser and the manually annotated data are freely available for research purposes.
Herry Sujaini - One of the best experts on this subject based on the ideXlab platform.
-
O-COCOSDA/CASLRE - Extended word similarity based clustering on unsupervised PoS induction to improve English-Indonesian statistical machine translation
2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA CASLRE), 2013
Co-Authors: Herry Sujaini, Ayu Purwarianti, Arry Akhmad Arman, Kuspriyanto
Abstract: In this paper, we present an unsupervised part-of-speech (PoS) induction algorithm to improve translation quality in statistical machine translation. The proposed algorithm is an extension of Word-Similarity-Based (WSB) clustering. In the clustering, the similarity between words is measured by their grammatical relation with other words. The grammatical relation is represented as an n-gram relation. We extend WSB clustering by taking the previous words into account when measuring the grammatical relation. The clustering results are then used in English-Indonesian statistical machine translation. The experiments were conducted using Moses as the machine translation decoder and were evaluated by BLEU score. Using 14,000 English-Indonesian sentence pairs, the clustering improved the BLEU score by 2.07%.
-
Extended word similarity based clustering on unsupervised PoS induction to improve English-Indonesian statistical machine translation
2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA CASLRE), 2013
Co-Authors: Herry Sujaini, Ayu Purwarianti, Arry Akhmad Arman
Abstract: In this paper, we present an unsupervised part-of-speech (PoS) induction algorithm to improve translation quality in statistical machine translation. The proposed algorithm is an extension of Word-Similarity-Based (WSB) clustering. In the clustering, the similarity between words is measured by their grammatical relation with other words. The grammatical relation is represented as an n-gram relation. We extend WSB clustering by taking the previous words into account when measuring the grammatical relation. The clustering results are then used in English-Indonesian statistical machine translation. The experiments were conducted using Moses as the machine translation decoder and were evaluated by BLEU score. Using 14,000 English-Indonesian sentence pairs, the clustering improved the BLEU score by 2.07%.
Laura Rimell - One of the best experts on this subject based on the ideXlab platform.
-
Cambridge: Parser Evaluation Using Textual Entailment by Grammatical Relation Comparison
Meeting of the Association for Computational Linguistics, 2010
Co-Authors: Laura Rimell, Stephen Clark
Abstract: This paper describes the Cambridge submission to the SemEval-2010 Parser Evaluation using Textual Entailment (PETE) task. We used a simple definition of entailment, parsing both T and H with the C&C parser and checking whether the core grammatical relations (subject and object) produced for H were a subset of those for T. This simple system achieved the top score for the task out of those systems submitted. We analyze the errors made by the system and the potential role of the task in parser evaluation.
-
SemEval@ACL - Cambridge: Parser Evaluation Using Textual Entailment by Grammatical Relation Comparison
2010
Co-Authors: Laura Rimell, Stephen Clark
Abstract: This paper describes the Cambridge submission to the SemEval-2010 Parser Evaluation using Textual Entailment (PETE) task. We used a simple definition of entailment, parsing both T and H with the C&C parser and checking whether the core grammatical relations (subject and object) produced for H were a subset of those for T. This simple system achieved the top score for the task out of those systems submitted. We analyze the errors made by the system and the potential role of the task in parser evaluation.