Language Modeling

The experts below are selected from a list of 135,492 experts worldwide, ranked by the ideXlab platform.

Tomas Mikolov - One of the best experts on this subject based on the ideXlab platform.

  • One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling
    arXiv: Computation and Language, 2013
    Co-Authors: Ciprian Chelba, Tomas Mikolov, Thorsten Brants, Mike Schuster, Phillipp Koehn, Tony Robinson
    Abstract:

    We propose a new benchmark corpus to be used for measuring progress in statistical language modeling. With almost one billion words of training data, we hope this benchmark will be useful for quickly evaluating novel language modeling techniques, and for comparing their contribution when combined with other advanced techniques. We show the performance of several well-known types of language models, with the best results achieved with a recurrent neural network based language model. The baseline unpruned Kneser-Ney 5-gram model achieves a perplexity of 67.6; a combination of techniques leads to a 35% reduction in perplexity, or a 10% reduction in cross-entropy (bits), over that baseline. The benchmark is available as a code.google.com project; besides the scripts needed to rebuild the training/held-out data, it also makes available log-probability values for each word in each of ten held-out data sets, for each of the baseline n-gram models.
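
    The 35% perplexity reduction and the 10% cross-entropy reduction quoted above are two views of the same gain, since perplexity is two raised to the cross-entropy in bits. A minimal Python sketch of the arithmetic (the 67.6 baseline is taken from the abstract; the combined-model perplexity is derived here from the stated 35% reduction rather than reported directly):

      import math

      baseline_ppl = 67.6                        # unpruned Kneser-Ney 5-gram (from the abstract)
      combined_ppl = baseline_ppl * (1 - 0.35)   # 35% perplexity reduction -> about 43.9

      # Cross-entropy in bits per word is log2 of perplexity.
      h_baseline = math.log2(baseline_ppl)       # about 6.08 bits
      h_combined = math.log2(combined_ppl)       # about 5.46 bits

      relative_bits_reduction = (h_baseline - h_combined) / h_baseline
      print(f"{relative_bits_reduction:.1%}")    # about 10%, matching the abstract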

  • Empirical Evaluation and Combination of Advanced Language Modeling Techniques
    Conference of the International Speech Communication Association, 2011
    Co-Authors: Tomas Mikolov, Anoop Deoras, Stefan Kombrink, Lukas Burget, Jan Černocký
    Abstract:

    We present results obtained with several advanced language modeling techniques, including a class-based model, a cache model, a maximum entropy model, a structured language model, a random forest language model, and several types of neural network based language models. We show results obtained after combining all these models by linear interpolation. We conclude that for both small and moderately sized tasks, we obtain new state-of-the-art results with a combination of models that is significantly better than the performance of any individual model. The perplexity reductions obtained against a Good-Turing trigram baseline are over 50%, and against a modified Kneser-Ney smoothed 5-gram over 40%. Index Terms: language modeling, neural networks, model combination, speech recognition
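
    The combination method named in the abstract is linear interpolation: every component model assigns a probability to the next word given the history, and the combined probability is a weighted average of those probabilities, with non-negative weights summing to one. A minimal Python sketch, where the model objects and their prob() interface are hypothetical stand-ins rather than anything prescribed by the paper:

      import math

      def interpolated_prob(models, weights, word, history):
          # Linear interpolation: P(word | history) = sum_i w_i * P_i(word | history).
          # The `models` exposing prob(word, history) and the `weights` are hypothetical.
          assert abs(sum(weights) - 1.0) < 1e-9
          return sum(w * m.prob(word, history) for w, m in zip(weights, models))

      def perplexity(models, weights, sentences):
          # Evaluate the interpolated model on held-out sentences (lists of words).
          log_prob, n_words = 0.0, 0
          for sentence in sentences:
              history = []
              for word in sentence:
                  log_prob += math.log2(interpolated_prob(models, weights, word, history))
                  history.append(word)
                  n_words += 1
          return 2.0 ** (-log_prob / n_words)

    The interpolation weights are typically tuned on held-out data, for example with the EM algorithm, which is how such model combinations are usually fit in practice.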

  • RNNLM --- Recurrent Neural Network Language Modeling Toolkit
    Proceedings of ASRU 2011, 2011
    Co-Authors: Tomas Mikolov, Anoop Deoras, Stefan Kombrink, Lukas Burget, Jan Honza Černocký
    Abstract:

    We present a freely available open-source toolkit for training recurrent neural network based language models. It can easily be used to improve existing speech recognition and machine translation systems. It can also be used as a baseline for future research on advanced language modeling techniques. In the paper, we discuss optimal parameter selection and the different modes of functionality. The toolkit, example scripts and basic setups are freely available at http://rnnlm.sourceforge.net/.
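
    The toolkit trains simple recurrent (Elman-style) language models. Below is a minimal NumPy sketch of one forward step of such a model, not the toolkit's own code; the vocabulary and hidden sizes and the weight initialization are hypothetical.

      import numpy as np

      V, H = 10000, 100                       # hypothetical vocabulary and hidden-layer sizes
      rng = np.random.default_rng(0)
      U = rng.normal(0.0, 0.1, (H, V))        # input word -> hidden weights
      W = rng.normal(0.0, 0.1, (H, H))        # hidden -> hidden (recurrent) weights
      O = rng.normal(0.0, 0.1, (V, H))        # hidden -> output weights

      def step(word_id, h_prev):
          # One time step: consume the current word, return P(next word) and the new state.
          h = 1.0 / (1.0 + np.exp(-(U[:, word_id] + W @ h_prev)))   # sigmoid hidden layer
          logits = O @ h
          logits -= logits.max()                                    # for numerical stability
          probs = np.exp(logits) / np.exp(logits).sum()             # softmax over the vocabulary
          return probs, h

      h = np.zeros(H)
      probs, h = step(42, h)    # distribution over the word following (hypothetical) word id 42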

  • Recurrent Neural Network Based Language Modeling in Meeting Recognition
    Conference of the International Speech Communication Association, 2011
    Co-Authors: Stefan Kombrink, Martin Karafiat, Tomas Mikolov, Lukas Burget
    Abstract:

    We use recurrent neural network (RNN) based language models to improve the BUT English meeting recognizer. On the baseline setup using the original language models, we decrease the word error rate (WER) by more than 1% absolute through n-best list rescoring and language model adaptation. When n-gram language models are trained on the same moderately sized data set as the RNN models, the improvements are higher, yielding a system which performs comparably to the baseline. A noticeable improvement was observed with unsupervised adaptation of the RNN models. Furthermore, we examine the influence of word history on WER and show how to speed up rescoring by caching common prefix strings. Index Terms: automatic speech recognition, language modeling, recurrent neural networks, rescoring, adaptation
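
    A short Python sketch of the rescoring loop described above: each n-best hypothesis gets a new language model score from the RNN LM, combined with its original scores, and RNN states for shared hypothesis prefixes are cached so that common prefixes are evaluated only once. The rnnlm object, its score_word() interface, and the score combination are hypothetical illustrations, not the paper's implementation.

      def rescore_nbest(hypotheses, rnnlm, lm_weight=0.5):
          # `hypotheses` is a list of (words, acoustic_score, ngram_lm_score) tuples with
          # log-domain scores; `rnnlm.score_word(word, state)` is assumed to return
          # (log_prob, new_state). Prefix caching avoids rescoring shared prefixes twice.
          cache = {(): (0.0, rnnlm.initial_state())}   # prefix -> (cumulative log-prob, state)

          def prefix_score(words):
              prefix = tuple(words)
              if prefix not in cache:
                  logp, state = prefix_score(words[:-1])
                  w_logp, state = rnnlm.score_word(words[-1], state)
                  cache[prefix] = (logp + w_logp, state)
              return cache[prefix]

          rescored = []
          for words, acoustic, ngram_lm in hypotheses:
              rnn_lm, _ = prefix_score(list(words))
              # Simple log-domain combination of the two LM scores (a sketch, not the exact scheme).
              lm_score = lm_weight * rnn_lm + (1.0 - lm_weight) * ngram_lm
              rescored.append((acoustic + lm_score, words))
          return [words for _, words in sorted(rescored, key=lambda x: x[0], reverse=True)]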

Lukas Burget - One of the best experts on this subject based on the ideXlab platform.

  • Empirical Evaluation and Combination of Advanced Language Modeling Techniques
    Conference of the International Speech Communication Association, 2011
    Co-Authors: Tomas Mikolov, Anoop Deoras, Stefan Kombrink, Lukas Burget, Jan Černocký
    Abstract:

    We present results obtained with several advanced language modeling techniques, including a class-based model, a cache model, a maximum entropy model, a structured language model, a random forest language model, and several types of neural network based language models. We show results obtained after combining all these models by linear interpolation. We conclude that for both small and moderately sized tasks, we obtain new state-of-the-art results with a combination of models that is significantly better than the performance of any individual model. The perplexity reductions obtained against a Good-Turing trigram baseline are over 50%, and against a modified Kneser-Ney smoothed 5-gram over 40%. Index Terms: language modeling, neural networks, model combination, speech recognition

  • RNNLM --- Recurrent Neural Network Language Modeling Toolkit
    Proceedings of ASRU 2011, 2011
    Co-Authors: Tomas Mikolov, Anoop Deoras, Stefan Kombrink, Lukas Burget, Jan Honza Černocký
    Abstract:

    We present a freely available open-source toolkit for training recurrent neural network based language models. It can easily be used to improve existing speech recognition and machine translation systems. It can also be used as a baseline for future research on advanced language modeling techniques. In the paper, we discuss optimal parameter selection and the different modes of functionality. The toolkit, example scripts and basic setups are freely available at http://rnnlm.sourceforge.net/.

  • Recurrent Neural Network Based Language Modeling in Meeting Recognition
    Conference of the International Speech Communication Association, 2011
    Co-Authors: Stefan Kombrink, Martin Karafiat, Tomas Mikolov, Lukas Burget
    Abstract:

    We use recurrent neural network (RNN) based language models to improve the BUT English meeting recognizer. On the baseline setup using the original language models, we decrease the word error rate (WER) by more than 1% absolute through n-best list rescoring and language model adaptation. When n-gram language models are trained on the same moderately sized data set as the RNN models, the improvements are higher, yielding a system which performs comparably to the baseline. A noticeable improvement was observed with unsupervised adaptation of the RNN models. Furthermore, we examine the influence of word history on WER and show how to speed up rescoring by caching common prefix strings. Index Terms: automatic speech recognition, language modeling, recurrent neural networks, rescoring, adaptation

Stefan Kombrink - One of the best experts on this subject based on the ideXlab platform.

  • Empirical Evaluation and Combination of Advanced Language Modeling Techniques
    Conference of the International Speech Communication Association, 2011
    Co-Authors: Tomas Mikolov, Anoop Deoras, Stefan Kombrink, Lukas Burget, Jan Černocký
    Abstract:

    We present results obtained with several advanced language modeling techniques, including a class-based model, a cache model, a maximum entropy model, a structured language model, a random forest language model, and several types of neural network based language models. We show results obtained after combining all these models by linear interpolation. We conclude that for both small and moderately sized tasks, we obtain new state-of-the-art results with a combination of models that is significantly better than the performance of any individual model. The perplexity reductions obtained against a Good-Turing trigram baseline are over 50%, and against a modified Kneser-Ney smoothed 5-gram over 40%. Index Terms: language modeling, neural networks, model combination, speech recognition

  • RNNLM --- Recurrent Neural Network Language Modeling Toolkit
    Proceedings of ASRU 2011, 2011
    Co-Authors: Tomas Mikolov, Anoop Deoras, Stefan Kombrink, Lukas Burget, Jan Honza Černocký
    Abstract:

    We present a freely available open-source toolkit for training recurrent neural network based language models. It can easily be used to improve existing speech recognition and machine translation systems. It can also be used as a baseline for future research on advanced language modeling techniques. In the paper, we discuss optimal parameter selection and the different modes of functionality. The toolkit, example scripts and basic setups are freely available at http://rnnlm.sourceforge.net/.

  • Recurrent Neural Network Based Language Modeling in Meeting Recognition
    Conference of the International Speech Communication Association, 2011
    Co-Authors: Stefan Kombrink, Martin Karafiat, Tomas Mikolov, Lukas Burget
    Abstract:

    We use recurrent neural network (RNN) based language models to improve the BUT English meeting recognizer. On the baseline setup using the original language models, we decrease the word error rate (WER) by more than 1% absolute through n-best list rescoring and language model adaptation. When n-gram language models are trained on the same moderately sized data set as the RNN models, the improvements are higher, yielding a system which performs comparably to the baseline. A noticeable improvement was observed with unsupervised adaptation of the RNN models. Furthermore, we examine the influence of word history on WER and show how to speed up rescoring by caching common prefix strings. Index Terms: automatic speech recognition, language modeling, recurrent neural networks, rescoring, adaptation

Hermann Ney - One of the best experts on this subject based on the ideXlab platform.

  • Language Modeling with Deep Transformers
    Conference of the International Speech Communication Association, 2019
    Co-Authors: Kazuki Irie, Ralf Schlüter, Albert Zeyer, Hermann Ney
    Abstract:

    We explore deep autoregressive Transformer models for language modeling in speech recognition. We focus on two aspects. First, we revisit Transformer model configurations specifically for language modeling. We show that well-configured Transformer models outperform our baseline models based on a shallow stack of LSTM recurrent neural network layers. We carry out experiments on the open-source LibriSpeech 960hr task, for both 200K vocabulary word-level and 10K byte-pair encoding subword-level language modeling. We apply our word-level models to conventional hybrid speech recognition by lattice rescoring, and the subword-level models to attention-based encoder-decoder models by shallow fusion. Second, we show that deep Transformer language models do not require positional encoding. Positional encoding is an essential augmentation for the self-attention mechanism, which is invariant to sequence ordering. However, in the autoregressive setup, as is the case for language modeling, the amount of information increases along the position dimension, which is a positional signal in its own right. The analysis of attention weights shows that deep autoregressive self-attention models can automatically make use of such positional information. We find that removing the positional encoding even slightly improves the performance of these models.
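
    The second finding, that autoregressive (causally masked) self-attention works without explicit positional encoding, can be illustrated with a minimal single-head attention sketch in NumPy. The shapes and random weights are hypothetical, not the paper's configuration; the point is only that the inputs carry no position information, while the causal mask still makes the visible context grow with position.

      import numpy as np

      def causal_self_attention(X, Wq, Wk, Wv):
          # X: (T, d) token embeddings with no positional encoding added.
          # The causal mask lets position t attend only to positions <= t, so the amount
          # of visible context itself grows along the sequence and acts as a positional signal.
          T, d = X.shape
          Q, K, V = X @ Wq, X @ Wk, X @ Wv
          scores = Q @ K.T / np.sqrt(d)
          scores[np.triu_indices(T, k=1)] = -np.inf        # mask out future positions
          weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
          weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
          return weights @ V

      rng = np.random.default_rng(0)
      T, d = 8, 16                                         # hypothetical sequence length and width
      X = rng.normal(size=(T, d))                          # token embeddings only, no positions
      Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
      out = causal_self_attention(X, Wq, Wk, Wv)           # shape (8, 16)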

  • LSTM Neural Networks for Language Modeling
    Conference of the International Speech Communication Association, 2012
    Co-Authors: Martin Sundermeyer, Ralf Schlüter, Hermann Ney
    Abstract:

    Neural networks have become increasingly popular for the task of language modeling. Whereas feed-forward networks only exploit a fixed context length to predict the next word of a sequence, conceptually, standard recurrent neural networks can take into account all of the predecessor words. On the other hand, it is well known that recurrent networks are difficult to train and are therefore unlikely to show the full potential of recurrent models. These problems are addressed by the Long Short-Term Memory (LSTM) neural network architecture. In this work, we analyze this type of network on an English and a large French language modeling task. Experiments show improvements of about 8% relative in perplexity over standard recurrent neural network LMs. In addition, we gain considerable improvements in WER on top of a state-of-the-art speech recognition system.
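
    A compact PyTorch sketch of the model class analyzed in the paper: previous words are embedded, passed through stacked LSTM layers, and a softmax layer predicts the next word. The layer sizes and the toy evaluation below are hypothetical, not the paper's configuration.

      import torch
      import torch.nn as nn

      class LSTMLanguageModel(nn.Module):
          # Unlike a feed-forward LM with a fixed context window, the LSTM state can, in
          # principle, carry information about all predecessor words.
          def __init__(self, vocab_size, emb_dim=128, hidden_dim=256, num_layers=2):
              super().__init__()
              self.embed = nn.Embedding(vocab_size, emb_dim)
              self.lstm = nn.LSTM(emb_dim, hidden_dim, num_layers, batch_first=True)
              self.out = nn.Linear(hidden_dim, vocab_size)

          def forward(self, tokens, state=None):
              # tokens: (batch, time) word ids; returns next-word logits at every position.
              hidden, state = self.lstm(self.embed(tokens), state)
              return self.out(hidden), state

      model = LSTMLanguageModel(vocab_size=10000)
      tokens = torch.randint(0, 10000, (4, 20))            # hypothetical batch of word ids
      logits, _ = model(tokens)                            # (4, 20, 10000)
      loss = nn.functional.cross_entropy(logits[:, :-1].reshape(-1, 10000),
                                         tokens[:, 1:].reshape(-1))
      print(torch.exp(loss))                               # perplexity of the untrained model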

  • Improved Backing-Off for M-gram Language Modeling
    International Conference on Acoustics Speech and Signal Processing, 1995
    Co-Authors: Reinhard Kneser, Hermann Ney
    Abstract:

    In stochastic language modeling, backing-off is a widely used method to cope with the sparse data problem. In the case of unseen events, this method backs off to a less specific distribution. In this paper we propose to use distributions which are especially optimized for the task of backing-off. Two different theoretical derivations lead to distributions which are quite different from the probability distributions usually used for backing-off. Experiments show an improvement of about 10% in terms of perplexity and 5% in terms of word error rate.
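
    The proposal is that the distribution used for backing-off should be estimated specifically for that role rather than taken from raw lower-order counts. The Python sketch below uses the continuation-count unigram distribution commonly associated with this line of work, inside an interpolated bigram model with absolute discounting; it is a simplified illustration, not the exact formulation or experimental setup of the paper.

      from collections import Counter, defaultdict

      def kneser_ney_bigram(corpus, discount=0.75):
          # corpus: a list of words. Returns prob(word, context) for an interpolated
          # bigram model whose backoff unigram uses continuation counts (in how many
          # distinct contexts a word was seen), not raw word frequencies.
          bigrams = Counter(zip(corpus, corpus[1:]))
          context_counts = Counter(corpus[:-1])
          followers = defaultdict(set)      # context -> distinct following words
          contexts_of = defaultdict(set)    # word -> distinct preceding contexts
          for (u, w) in bigrams:
              followers[u].add(w)
              contexts_of[w].add(u)
          total_bigram_types = len(bigrams)

          def p_continuation(w):
              return len(contexts_of[w]) / total_bigram_types

          def prob(word, context):
              c_uw, c_u = bigrams[(context, word)], context_counts[context]
              if c_u == 0:
                  return p_continuation(word)                    # unseen context: pure backoff
              backoff_weight = discount * len(followers[context]) / c_u
              return max(c_uw - discount, 0.0) / c_u + backoff_weight * p_continuation(word)

          return prob

      prob = kneser_ney_bigram("the cat sat on the mat".split())
      print(prob("cat", "the"))    # discounted bigram estimate plus weighted continuation backoff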

Jan Honza Černocký - One of the best experts on this subject based on the ideXlab platform.

  • RNNLM --- Recurrent Neural Network Language Modeling Toolkit
    Proceedings of ASRU 2011, 2011
    Co-Authors: Tomas Mikolov, Anoop Deoras, Stefan Kombrink, Lukas Burget, Jan Honza Černocký
    Abstract:

    We present a freely available open-source toolkit for training recurrent neural network based language models. It can easily be used to improve existing speech recognition and machine translation systems. It can also be used as a baseline for future research on advanced language modeling techniques. In the paper, we discuss optimal parameter selection and the different modes of functionality. The toolkit, example scripts and basic setups are freely available at http://rnnlm.sourceforge.net/.