Structured Language

The Experts below are selected from a list of 258 Experts worldwide, ranked by the ideXlab platform

Frederick Jelinek - One of the best experts on this subject based on the ideXlab platform.

  • NIPS - Using Random Forests in the Structured Language Model
    2004
    Co-Authors: Frederick Jelinek
    Abstract:

    In this paper, we explore the use of Random Forests (RFs) in the Structured Language model (SLM), which uses rich syntactic information in predicting the next word based on words already seen. The goal in this work is to construct RFs by randomly growing Decision Trees (DTs) using syntactic information and investigate the performance of the SLM modeled by the RFs in automatic speech recognition. RFs, which were originally developed as classifiers, are a combination of decision tree classifiers. Each tree is grown based on random training data sampled independently and with the same distribution for all trees in the forest, and a random selection of possible questions at each node of the decision tree. Our approach extends the original idea of RFs to deal with the data sparseness problem encountered in Language modeling. RFs have been studied in the context of n-gram Language modeling and have been shown to generalize well to unseen data. We show in this paper that RFs using syntactic information can also achieve better performance in both perplexity (PPL) and word error rate (WER) in a large vocabulary speech recognition system, compared to a baseline that uses Kneser-Ney smoothing.
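
    For illustration, the minimal sketch below (plain Python; all class and function names are invented for the example, not taken from the paper) shows the random-forest idea applied to next-word prediction: each decision tree partitions histories by asking a randomly chosen question about a randomly chosen history position, each leaf stores a smoothed word distribution, and the forest probability is the average of the per-tree probabilities. In the paper the questions are asked about syntactic heads and tags supplied by the SLM and the smoothing is Kneser-Ney; here the history is just words and the smoothing is add-one, purely as a stand-in.

      import random
      from collections import Counter

      class DecisionTreeLM:
          """One randomly grown decision tree over (history, next_word) events."""
          def __init__(self, max_depth=3, rng=None):
              self.max_depth = max_depth
              self.rng = rng or random.Random()

          def fit(self, data, vocab):
              self.vocab = list(vocab)
              self.root = self._grow(data, depth=0)
              return self

          def _grow(self, data, depth):
              node = {"dist": self._smooth(Counter(w for _, w in data))}
              if depth >= self.max_depth or len(data) < 10:
                  return node
              # random question: "does history position `pos` equal `pivot`?"
              pos = self.rng.randrange(len(data[0][0]))
              pivot = self.rng.choice([h[pos] for h, _ in data])
              yes = [(h, w) for h, w in data if h[pos] == pivot]
              no = [(h, w) for h, w in data if h[pos] != pivot]
              if yes and no:
                  node.update(pos=pos, pivot=pivot,
                              yes=self._grow(yes, depth + 1),
                              no=self._grow(no, depth + 1))
              return node

          def _smooth(self, counts):
              # add-one smoothing over the vocabulary (stand-in for Kneser-Ney)
              total = sum(counts.values()) + len(self.vocab)
              return {w: (counts[w] + 1) / total for w in self.vocab}

          def prob(self, history, word):
              node = self.root
              while "pos" in node:
                  node = node["yes"] if history[node["pos"]] == node["pivot"] else node["no"]
              return node["dist"].get(word, 1e-9)

      class RandomForestLM:
          """Average of independently, randomly grown decision-tree language models."""
          def __init__(self, n_trees=10, seed=0):
              self.n_trees, self.rng = n_trees, random.Random(seed)

          def fit(self, data, vocab):
              # each tree is grown on its own bootstrap sample of the training events
              self.trees = [DecisionTreeLM(rng=self.rng)
                            .fit([self.rng.choice(data) for _ in data], vocab)
                            for _ in range(self.n_trees)]
              return self

          def prob(self, history, word):
              return sum(t.prob(history, word) for t in self.trees) / len(self.trees)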

  • exact training of a neural syntactic Language model
    International Conference on Acoustics, Speech and Signal Processing, 2004
    Co-Authors: Ahmad Emami, Frederick Jelinek
    Abstract:

    The Structured Language model (SLM) aims at predicting the next word in a given word string by performing a syntactic analysis of the preceding words. However, it faces the data sparseness problem because of the large dimensionality and diversity of the information available in the syntactic parse. Previously, we proposed using neural network models for the SLM (Emami, A. et al., Proc. ICASSP, 2003; Emami, Proc. EUROSPEECH '03, 2003). The neural network model is better suited to tackle the data sparseness problem, and its use gave significant improvements in perplexity and word error rate over the baseline SLM. We present a new method of training the neural-net-based SLM. This procedure makes use of the partial parses hypothesized by the SLM itself, and is more expensive than the approximate training method used previously. Experiments with the new training method on the UPenn and WSJ corpora show significant reductions in perplexity and word error rate, achieving the lowest published results for the given corpora.
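
    For reference, the quantity the neural components model has the following general shape in the SLM literature (a sketch in the usual notation, not a formula quoted from this paper): the probability of the next word marginalizes over the partial parses T_k retained for the word prefix W_k, with each parse contributing a prediction conditioned on its two most recent exposed head words,

      P(w_{k+1} \mid W_k) = \sum_{T_k} P(w_{k+1} \mid h_{-1}(T_k), h_0(T_k)) \, P(T_k \mid W_k).

    The "exact" training discussed above optimizes the neural model under this full sum over hypothesized partial parses, rather than under a single fixed parse as in the earlier approximate procedure.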

  • Stochastic Analysis of Structured Language Modeling
    Mathematical Foundations of Speech and Language Processing, 2004
    Co-Authors: Frederick Jelinek
    Abstract:

    As previously introduced, the Structured Language Model (SLM) operated with the help of stacks from which less probable sub-parse entries were purged before further words were generated. In this article we generalize the CKY algorithm to obtain a chart that allows the direct computation of Language model probabilities, thus rendering the stacks unnecessary. An analysis of the behavior of the SLM leads to a generalization of the Inside-Outside algorithm and thus to rigorous EM-type re-estimation of the SLM parameters. The derived algorithms are computationally expensive, but their demands can be mitigated by use of appropriate thresholding.
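
    To make the chart construction concrete, recall the standard inside recursion that a CKY-style chart computes for a binary-branching grammar (shown here in generic PCFG notation as a sketch; the SLM chart entries are richer, carrying headword and tag information, but the recursion has the same shape):

      \beta(A, i, i) = P(A \to w_i)
      \beta(A, i, j) = \sum_{A \to B\,C} \sum_{k=i}^{j-1} P(A \to B\,C) \, \beta(B, i, k) \, \beta(C, k+1, j)

    so that the Language model probability of the whole string is read off the top chart entry, P(w_1 \cdots w_n) = \beta(S, 1, n). The outside pass, and hence the EM (Inside-Outside) re-estimation mentioned above, is built on the same chart.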

  • training connectionist models for the Structured Language model
    Empirical Methods in Natural Language Processing, 2003
    Co-Authors: Ahmad Emami, Frederick Jelinek
    Abstract:

    We investigate the performance of the Structured Language Model (SLM) in terms of perplexity (PPL) when its components are modeled by connectionist models. The connectionist models use a distributed representation of the items in the history and make much better use of contexts than currently used interpolated or back-off models, not only because of the inherent capability of the connectionist model in fighting the data sparseness problem, but also because of the sublinear growth in the model size when the context length is increased. The connectionist models can be further trained by an EM procedure, similar to the previously used procedure for training the SLM. Our experiments show that the connectionist models can significantly improve the PPL over the interpolated and back-off models on the UPenn Treebank corpus, after interpolating with a baseline trigram Language model. The EM training procedure can improve the connectionist models further, by using hidden events obtained by the SLM parser.
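
    As a concrete illustration of a "distributed representation of the items in the history", the sketch below (NumPy, forward pass only; all sizes and names are invented for the example, and training by backpropagation or the EM procedure is omitted) shows a connectionist next-item model of the kind described: each history item is mapped to an embedding vector, the vectors are concatenated, passed through a tanh hidden layer, and a softmax gives the distribution over the next item. The parameter count grows only gently with the context length (one extra block of input weights per position), in contrast to the combinatorial growth of count-based n-gram tables.

      import numpy as np

      rng = np.random.default_rng(0)
      V, d, H, n_ctx = 1000, 32, 64, 3           # vocabulary, embedding dim, hidden units, history length

      E  = rng.normal(0, 0.1, (V, d))            # shared embedding table (distributed representations)
      W1 = rng.normal(0, 0.1, (n_ctx * d, H))    # concatenated history -> hidden
      b1 = np.zeros(H)
      W2 = rng.normal(0, 0.1, (H, V))            # hidden -> output scores
      b2 = np.zeros(V)

      def next_item_probs(history_ids):
          """history_ids: n_ctx indices of history items (words, or SLM heads/tags)."""
          x = np.concatenate([E[i] for i in history_ids])   # concatenated distributed history
          h = np.tanh(x @ W1 + b1)
          logits = h @ W2 + b2
          logits -= logits.max()                            # numerical stability
          p = np.exp(logits)
          return p / p.sum()

      p = next_item_probs([5, 42, 7])
      print(p.shape, round(float(p.sum()), 6))              # (1000,) 1.0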

  • EMNLP - Training connectionist models for the Structured Language model
    Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, 2003
    Co-Authors: Ahmad Emami, Frederick Jelinek
    Abstract:

    We investigate the performance of the Structured Language Model (SLM) in terms of perplexity (PPL) when its components are modeled by connectionist models. The connectionist models use a distributed representation of the items in the history and make much better use of contexts than currently used interpolated or back-off models, not only because of the inherent capability of the connectionist model in fighting the data sparseness problem, but also because of the sublinear growth in the model size when the context length is increased. The connectionist models can be further trained by an EM procedure, similar to the previously used procedure for training the SLM. Our experiments show that the connectionist models can significantly improve the PPL over the interpolated and back-off models on the UPenn Treebank corpus, after interpolating with a baseline trigram Language model. The EM training procedure can improve the connectionist models further, by using hidden events obtained by the SLM parser.

Ciprian Chelba - One of the best experts on this subject based on the ideXlab platform.

  • a study on richer syntactic dependencies for Structured Language modeling
    Meeting of the Association for Computational Linguistics, 2002
    Co-Authors: Ciprian Chelba, Frederick Jelinek
    Abstract:

    We study the impact of richer syntactic dependencies on the performance of the Structured Language model (SLM) along three dimensions: parsing accuracy (LP/LR), perplexity (PPL) and word-error-rate (WER, N-best re-scoring). We show that our models achieve an improvement in LP/LR, PPL and/or WER over the reported baseline results using the SLM on the UPenn Treebank and Wall Street Journal (WSJ) corpora, respectively. Analysis of parsing performance shows correlation between the quality of the parser (as measured by precision/recall) and the Language model performance (PPL and WER). A remarkable fact is that the enriched SLM outperforms the baseline 3-gram model in terms of WER by 10% when used in isolation as a second pass (N-best re-scoring) Language model.
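
    The WER figure above comes from second-pass N-best re-scoring, which reduces to re-ranking the first-pass hypotheses with the new Language model score. A minimal sketch of that step (illustrative names and generic weights, not values from the paper):

      def rescore_nbest(nbest, lm_logprob, lm_weight=12.0, word_penalty=0.0):
          """nbest: list of (word_list, acoustic_logprob) pairs from the first pass;
          lm_logprob: function mapping a word list to a language-model log-probability."""
          def total(hyp):
              words, acoustic = hyp
              return acoustic + lm_weight * lm_logprob(words) + word_penalty * len(words)
          return max(nbest, key=total)[0]   # hypothesis with the best combined score

      # toy usage with a dummy stand-in LM that happens to prefer hypotheses containing "cat"
      best = rescore_nbest(
          [(["the", "cat", "sat"], -120.0), (["the", "cats", "at"], -118.0)],
          lm_logprob=lambda ws: -1.0 if "cat" in ws else -5.0)
      print(best)   # ['the', 'cat', 'sat']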

  • Information Extraction Using the Structured Language Model
    arXiv: Computation and Language, 2001
    Co-Authors: Ciprian Chelba, Milind Mahajan
    Abstract:

    The paper presents a data-driven approach to information extraction (viewed as template filling) using the Structured Language model (SLM) as a statistical parser. The task of template filling is cast as constrained parsing using the SLM. The model is automatically trained from a set of sentences annotated with frame/slot labels and spans. Training proceeds in stages: first a constrained syntactic parser is trained such that the parses on training data meet the specified semantic spans, then the non-terminal labels are enriched to contain semantic information and finally a constrained syntactic+semantic parser is trained on the parse trees resulting from the previous stage. Despite the small amount of training data used, the model is shown to outperform the slot level accuracy of a simple semantic grammar authored manually for the MiPad --- personal information management --- task.
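
    The key mechanism, constrained parsing, amounts to allowing only parser moves whose constituents are consistent with the annotated frame/slot spans. A small sketch of that consistency test (function and variable names are illustrative, not from the paper):

      def consistent_with_spans(start, end, semantic_spans):
          """A candidate constituent over words [start, end] (inclusive) is rejected
          only if it partially overlaps, i.e. crosses, an annotated semantic span."""
          for s, e in semantic_spans:
              if (start < s <= end < e) or (s < start <= e < end):
                  return False
          return True

      spans = [(2, 4)]                                  # e.g. a slot covering words 2..4
      print(consistent_with_spans(2, 4, spans))         # True: matches the slot exactly
      print(consistent_with_spans(0, 1, spans))         # True: disjoint from the slot
      print(consistent_with_spans(1, 3, spans))         # False: crosses the slot boundary
      print(consistent_with_spans(1, 5, spans))         # True: properly contains the slot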

  • EMNLP - Information Extraction Using the Structured Language Model.
    2001
    Co-Authors: Ciprian Chelba, Milind Mahajan
    Abstract:

    The paper presents a data-driven approach to information extraction (viewed as template filling) using the Structured Language model (SLM) as a statistical parser. The task of template filling is cast as constrained parsing using the SLM. The model is automatically trained from a set of sentences annotated with frame/slot labels and spans. Training proceeds in stages: first a constrained syntactic parser is trained such that the parses on training data meet the specified semantic spans, then the non-terminal labels are enriched to contain semantic information and finally a constrained syntactic+semantic parser is trained on the parse trees resulting from the previous stage. Despite the small amount of training data used, the model is shown to outperform the slot level accuracy of a simple semantic grammar authored manually for the MiPad --- personal information management --- task.

  • ACL - A Study on Richer Syntactic Dependencies for Structured Language Modeling
    Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL '02), 2002
    Co-Authors: Ciprian Chelba, Frederick Jelinek
    Abstract:

    We study the impact of richer syntactic dependencies on the performance of the Structured Language model (SLM) along three dimensions: parsing accuracy (LP/LR), perplexity (PPL) and word-error-rate (WER, N-best re-scoring). We show that our models achieve an improvement in LP/LR, PPL and/or WER over the reported baseline results using the SLM on the UPenn Treebank and Wall Street Journal (WSJ) corpora, respectively. Analysis of parsing performance shows correlation between the quality of the parser (as measured by precision/recall) and the Language model performance (PPL and WER). A remarkable fact is that the enriched SLM outperforms the baseline 3-gram model in terms of WER by 10% when used in isolation as a second pass (N-best re-scoring) Language model.

  • Structured Language Modeling for Speech Recognition
    arXiv: Computation and Language, 2000
    Co-Authors: Ciprian Chelba, Frederick Jelinek
    Abstract:

    A new Language model for speech recognition is presented. The model develops hidden hierarchical syntactic-like structure incrementally and uses it to extract meaningful information from the word history, thus complementing the locality of currently used trigram models. The Structured Language model (SLM) and its performance in a two-pass speech recognizer --- lattice decoding --- are presented. Experiments on the WSJ corpus show an improvement in both perplexity (PPL) and word error rate (WER) over conventional trigram models.

Ahmad Emami - One of the best experts on this subject based on the ideXlab platform.

  • exact training of a neural syntactic Language model
    International Conference on Acoustics, Speech and Signal Processing, 2004
    Co-Authors: Ahmad Emami, Frederick Jelinek
    Abstract:

    The Structured Language model (SLM) aims at predicting the next word in a given word string by performing a syntactic analysis of the preceding words. However, it faces the data sparseness problem because of the large dimensionality and diversity of the information available in the syntactic parse. Previously, we proposed using neural network models for the SLM (Emami, A. et al., Proc. ICASSP, 2003; Emami, Proc. EUROSPEECH '03, 2003). The neural network model is better suited to tackle the data sparseness problem, and its use gave significant improvements in perplexity and word error rate over the baseline SLM. We present a new method of training the neural-net-based SLM. This procedure makes use of the partial parses hypothesized by the SLM itself, and is more expensive than the approximate training method used previously. Experiments with the new training method on the UPenn and WSJ corpora show significant reductions in perplexity and word error rate, achieving the lowest published results for the given corpora.

  • training connectionist models for the Structured Language model
    Empirical Methods in Natural Language Processing, 2003
    Co-Authors: Ahmad Emami, Frederick Jelinek
    Abstract:

    We investigate the performance of the Structured Language Model (SLM) in terms of perplexity (PPL) when its components are modeled by connectionist models. The connectionist models use a distributed representation of the items in the history and make much better use of contexts than currently used interpolated or back-off models, not only because of the inherent capability of the connectionist model in fighting the data sparseness problem, but also because of the sublinear growth in the model size when the context length is increased. The connectionist models can be further trained by an EM procedure, similar to the previously used procedure for training the SLM. Our experiments show that the connectionist models can significantly improve the PPL over the interpolated and back-off models on the UPenn Treebank corpus, after interpolating with a baseline trigram Language model. The EM training procedure can improve the connectionist models further, by using hidden events obtained by the SLM parser.

  • EMNLP - Training connectionist models for the Structured Language model
    Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, 2003
    Co-Authors: Ahmad Emami, Frederick Jelinek
    Abstract:

    We investigate the performance of the Structured Language Model (SLM) in terms of perplexity (PPL) when its components are modeled by connectionist models. The connectionist models use a distributed representation of the items in the history and make much better use of contexts than currently used interpolated or back-off models, not only because of the inherent capability of the connectionist model in fighting the data sparseness problem, but also because of the sublinear growth in the model size when the context length is increased. The connectionist models can be further trained by an EM procedure, similar to the previously used procedure for training the SLM. Our experiments show that the connectionist models can significantly improve the PPL over the interpolated and back-off models on the UPenn Treebank corpus, after interpolating with a baseline trigram Language model. The EM training procedure can improve the connectionist models further, by using hidden events obtained by the SLM parser.

Ufuk Topcu - One of the best experts on this subject based on the ideXlab platform.

  • Counterexamples for Robotic Planning Explained in Structured Language
    arXiv: Robotics, 2018
    Co-Authors: Lu Feng, Mahsa Ghasemi, Kai-wei Chang, Ufuk Topcu
    Abstract:

    Automated techniques such as model checking have been used to verify models of robotic mission plans based on Markov decision processes (MDPs) and to generate counterexamples that may help diagnose requirement violations. However, such artifacts may be too complex for humans to understand, because existing representations of counterexamples typically include a large number of paths or a complex automaton. To help improve the interpretability of counterexamples, we define a notion of explainable counterexample, which includes a set of Structured natural Language sentences describing the robotic behavior that leads to a requirement violation in an MDP model of a robotic mission plan. We propose an approach based on mixed-integer linear programming for generating explainable counterexamples that are minimal, sound and complete. We demonstrate the usefulness of the proposed approach via a case study of warehouse robot planning.
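
    At its core, the optimization described above has to pick out a small set of offending runs whose combined probability already witnesses the violated threshold; the paper encodes this selection (together with its grouping into structured natural language sentences) as a mixed-integer linear program. The brute-force sketch below illustrates only that selection problem, with invented example data, and is merely a stand-in for the MILP solver:

      from itertools import combinations

      def minimal_counterexample(paths, threshold):
          """paths: list of (description, probability); return a smallest subset whose
          probabilities sum to more than `threshold`, or None if no subset suffices."""
          for k in range(1, len(paths) + 1):
              for subset in combinations(paths, k):
                  if sum(p for _, p in subset) > threshold:
                      return [description for description, _ in subset]
          return None

      # toy usage: the requirement "P(two robots collide) <= 0.2" is violated
      paths = [("robot A and robot B enter aisle 3 at the same time", 0.15),
               ("robot A skips charging and stalls in aisle 1", 0.08),
               ("robot B misses its pick-up and re-enters aisle 3", 0.05)]
      print(minimal_counterexample(paths, 0.2))   # the first two descriptions suffice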

  • ICRA - Counterexamples for Robotic Planning Explained in Structured Language
    2018 IEEE International Conference on Robotics and Automation (ICRA), 2018
    Co-Authors: Lu Feng, Mahsa Ghasemi, Kai-wei Chang, Ufuk Topcu
    Abstract:

    Automated techniques such as model checking have been used to verify models of robotic mission plans based on Markov decision processes (MDPs) and to generate counterexamples that may help diagnose requirement violations. However, such artifacts may be too complex for humans to understand, because existing representations of counterexamples typically include a large number of paths or a complex automaton. To help improve the interpretability of counterexamples, we define a notion of explainable counterexample, which includes a set of Structured natural Language sentences describing the robotic behavior that leads to a requirement violation in an MDP model of a robotic mission plan. We propose an approach based on mixed-integer linear programming for generating explainable counterexamples that are minimal, sound and complete. We demonstrate the usefulness of the proposed approach via a case study of warehouse robot planning.

Richard L. Sparks - One of the best experts on this subject based on the ideXlab platform.

  • Teaching a foreign Language using multisensory Structured Language techniques to at-risk learners: a review.
    Dyslexia (Chichester, England), 2000
    Co-Authors: Richard L. Sparks, Karen Miller
    Abstract:

    An overview of multisensory Structured Language (MSL) techniques used to teach a foreign Language to at-risk students is outlined. Research supporting the use of MSL techniques is reviewed. Specific activities using the MSL approach to teach the phonology/orthography, grammar and vocabulary of the foreign Language as well as reading and communicative activities in the foreign Language are presented.

  • Benefits of multisensory Structured Language instruction for at-risk foreign Language learners: A comparison study of high school Spanish students
    Annals of Dyslexia, 1998
    Co-Authors: Richard L. Sparks, Karen Miller, Marjorie Artzer, Jon M. Patton, Leonore Ganschow, Dorothy J. Hordubay, Geri Walsh
    Abstract:

    In this study, the benefits of multisensory Structured Language (MSL) instruction in Spanish were examined. Participants were students in high-school-level Spanish attending girls’ preparatory schools. Of the 55 participants, 39 qualified as at-risk for foreign Language learning difficulties and 16 were deemed not-at-risk. The at-risk students were assigned to one of three conditions: (1) MSL—multisensory Spanish instruction in self-contained classrooms (n=14); (2) SC—traditional Spanish instruction provided in self-contained classrooms (n=11); and (3) NSC—traditional Spanish instruction in regular (not self-contained) Spanish classes (n=14). Not-at-risk students (n=16) received traditional Spanish instruction in regular classes similar to the instruction provided to the NSC group.

  • The Effects of Multisensory Structured Language Instruction on Native Language and Foreign Language Aptitude Skills of At-Risk High School Foreign Language Learners: A Replication and Follow-up Study
    Annals of dyslexia, 1993
    Co-Authors: Richard L. Sparks, Leonore Ganschow
    Abstract:

    According to research findings, most students who experience foreign Language learning problems are thought to have overt or subtle native Language learning difficulties, primarily with phonological processing. A recent study by the authors showed that when a multisensory Structured Language approach to teaching Spanish was used with a group of at-risk high school students, the group’s pre- and posttest scores on native Language phonological processing, verbal memory and vocabulary, and foreign Language aptitude measures significantly improved. In this replication and follow-up study, the authors compared pre- and posttest scores of a second group of students (Cohort 2) who received MSL instruction in Spanish on native Language and foreign Language aptitude measures. They also followed students from the first study (Cohort 1) over a second year of foreign Language instruction. Findings showed that the second cohort made significant gains on three native Language phonological measures and a test of foreign Language aptitude. Follow-up testing on the first cohort showed that the group maintained its initial gains on all native Language and foreign Language aptitude measures. Implications for the authors’ Linguistic Coding Deficit Hypothesis are discussed and linked with current reading research, in particular the concepts of the assumption of specificity and modularity.

  • The effects of multisensory Structured Language instruction on native Language and foreign Language aptitude skills of at-risk high school foreign Language learners.
    Annals of dyslexia, 1992
    Co-Authors: Richard L. Sparks, Leonore Ganschow, Jane Pohlman, Sue Skinner, Marjorie Artzer
    Abstract:

    Research findings suggest that most students who have foreign Language learning problems have Language-based difficulties and, in particular, phonological processing problems. Authors of the present study examined pre- and posttest scores on native Language and foreign Language aptitude tests of three groups of at-risk high school students enrolled in special, self-contained sections of first-year Spanish. Two groups were instructed using a multisensory Structured Language (MSL) approach. One of the groups was taught in both English and Spanish (MSL/ES), the other only in Spanish (MSL/S). The third group (NO-MSL) was instructed using more traditional second Language teaching methodologies. Significant gains were made by the MSL/ES group on measures of native Language phonology, vocabulary, and verbal memory and on a test of foreign Language aptitude; the MSL/S group made significant gains on the test of foreign Language aptitude. No significant gains on the native Language or foreign Language aptitude measures were made by the NO-MSL group. Implications for foreign Language classroom instruction of at-risk students are discussed.