The experts below are selected from a list of 9726 experts worldwide, ranked by the ideXlab platform.
Harvey F. Silverman - One of the best experts on this subject based on the ideXlab platform.
-
Performance of an HMM Speech Recognizer using a real-time tracking microphone array as input
IEEE Transactions on Speech and Audio Processing, 1999
Co-Authors: T. B. Hughes, Hongseok Kim, J. H. DiBiase, Harvey F. Silverman
Abstract: This correspondence reports results for a tracking, real-time microphone array used as input to a hidden Markov model based (HMM-based) connected alpha-digits speech recognizer. For a talker in the near field of the array (within 0.5 m), performance approaches that of a close-talking microphone input device.
-
Using a Real-Time Tracking Microphone Array as Input to an HMM Speech Recognizer
International Conference on Acoustics, Speech and Signal Processing, 1998
Co-Authors: T. B. Hughes, Hongseok Kim, J. H. DiBiase, Harvey F. Silverman
Abstract: A major problem for speech recognition systems is relieving the talker of the need to use a close-talking, head-mounted, or desk-stand microphone. A likely solution is an array of microphones that can steer itself to the talker and apply a beamforming algorithm to overcome the reduced signal-to-noise ratio caused by room acoustics. This paper reports results for a tracking, real-time microphone array used as input to an HMM-based connected alpha-digits speech recognizer. For a talker in the very near field of the array (within a meter), performance approaches that of a close-talking microphone input device. The effects of both the noise-reducing steered array and a maximum a posteriori (MAP) training step are shown to be significant. Here, the array system and the recognizer are described, experiments are presented, and the implications of combining the two systems are discussed.
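The steered-array idea described above is commonly realized with delay-and-sum beamforming: each microphone's signal is delayed so that wavefronts from the tracked talker align, then the channels are averaged. The sketch below is illustrative only (the papers do not specify their exact algorithm); all names and the integer-sample delay approximation are assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s at room temperature

def delay_and_sum(signals, mic_positions, source_pos, fs):
    """Steer a microphone array toward source_pos by delay-and-sum.

    signals: (n_mics, n_samples) array of simultaneously recorded channels
    mic_positions: (n_mics, 3) microphone coordinates in meters
    source_pos: (3,) estimated talker position in meters
    fs: sampling rate in Hz
    """
    dists = np.linalg.norm(mic_positions - source_pos, axis=1)
    # Delay each channel so all wavefronts line up with the farthest mic.
    delays = (dists.max() - dists) / SPEED_OF_SOUND
    shifts = np.round(delays * fs).astype(int)  # integer-sample approximation
    n_mics, n_samples = signals.shape
    out = np.zeros(n_samples)
    for m in range(n_mics):
        s = shifts[m]
        out[s:] += signals[m, :n_samples - s]
    return out / n_mics  # coherent speech adds up; diffuse noise averages down
```

Signals arriving from the steered direction add coherently while uncorrelated room noise is attenuated, which is the SNR gain the abstract refers to.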
Wonyong Sung - One of the best experts on this subject based on the ideXlab platform.
-
A Real-Time FPGA-Based 20 000-Word Speech Recognizer with Optimized DRAM Access
IEEE Transactions on Circuits and Systems, 2010
Co-Authors: Youngkyu Choi, Jungwook Choi, Wonyong Sung
Abstract: A real-time hardware-based large-vocabulary speech recognizer requires high memory bandwidth. We have developed a field-programmable-gate-array (FPGA)-based 20 000-word speech recognizer built around efficient dynamic random access memory (DRAM) access. The system contains all the functional blocks for hidden-Markov-model-based speaker-independent continuous speech recognition: feature extraction, emission probability computation, and intraword and interword Viterbi beam search. Feature extraction runs in software on a soft-core CPU, while the other functional units are implemented as parallel, pipelined hardware blocks. To reduce the number of memory access operations, we used several techniques, including bit-width reduction of the Gaussian parameters, multiframe computation of the emission probability, and two-stage language model pruning. We also employ a customized DRAM controller that supports access patterns optimized for each functional unit of the recognizer. The speech recognition hardware was synthesized for the Virtex-4 FPGA and operates at 100 MHz. Experiments on the November '92 20k test set show that the system runs 1.52 and 1.39 times faster than real time using the bigram and trigram language models, respectively.
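The multiframe trick mentioned in the abstract amortizes each fetch of Gaussian parameters over several feature frames, so parameters cross the DRAM bus once instead of once per frame. A minimal sketch of the idea for diagonal-covariance Gaussians (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def log_emission_multiframe(frames, means, inv_vars, log_consts):
    """Log-likelihood of several frames against all Gaussians at once.

    frames: (F, D) feature vectors for F consecutive frames
    means, inv_vars: (G, D) diagonal-Gaussian parameters
    log_consts: (G,) precomputed -0.5 * (D*log(2*pi) + sum(log var))

    Each Gaussian's parameters are loaded once and reused for all F
    frames, mirroring the multiframe computation that cuts DRAM traffic.
    Returns an (F, G) matrix of log-likelihoods.
    """
    diff = frames[:, None, :] - means[None, :, :]          # (F, G, D)
    quad = np.sum(diff * diff * inv_vars[None, :, :], axis=2)
    return log_consts[None, :] - 0.5 * quad
```

On real hardware the same batching shows up as reading a Gaussian's mean/variance block once and streaming F frames past it; here NumPy broadcasting plays that role.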
-
ICASSP - OpenMP-based parallel implementation of a continuous Speech Recognizer on a multi-core system
2009 IEEE International Conference on Acoustics, Speech and Signal Processing
Co-Authors: Kisun You, Young-joon Lee, Wonyong Sung
Abstract: We have implemented a 20,000-word continuous speech recognizer on a multi-core system. A fine-grained parallel processing approach is employed for good scalability, and the OpenMP library is used for portability. In the emission probability computation, a dynamic workload distribution method provides good load balancing, while the search network used in the Viterbi beam search is statically partitioned into independent subtrees to reduce memory synchronization overhead. To further improve performance, a workload-predictive thread assignment strategy and a false-cache-line-sharing prevention method are employed. Tests were conducted on the WSJ1 20k test and development sets. We achieved a speed-up of 3.90 with four-thread parallelization on a four-core system, compared to four copies of the baseline single-thread recognizer running simultaneously. The final system runs at about twice the speed required for real time.
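The two scheduling policies named in the abstract, dynamic distribution for the uneven emission-probability workload and static subtree partitioning for the search network, can be sketched as follows. This is a policy illustration only: the paper uses OpenMP in C, whereas this sketch uses Python's thread pool, and `score_state` and the subtree representation are invented stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

def score_state(s):
    """Stand-in for the per-state emission probability computation."""
    return s * s

def emission_pass_dynamic(states, n_workers=4):
    """Dynamic distribution: idle workers grab the next small chunk of
    states, balancing per-state workloads that vary in cost."""
    with ThreadPoolExecutor(n_workers) as ex:
        return list(ex.map(score_state, states, chunksize=8))

def search_pass_static(subtrees, n_workers=4):
    """Static partition: each worker owns whole subtrees of the search
    network, so no synchronization on shared nodes is needed."""
    def walk(tree):
        return sum(score_state(s) for s in tree)
    with ThreadPoolExecutor(n_workers) as ex:
        return list(ex.map(walk, subtrees))
```

The trade-off mirrored here is the one the paper describes: dynamic chunking pays some scheduling overhead for balance, while static ownership of subtrees avoids synchronization at the cost of possible imbalance.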
-
FPGA-Based Implementation of a Real-Time 5000-Word Continuous Speech Recognizer
European Signal Processing Conference, 2008
Co-Authors: Youngkyu Choi, Wonyong Sung
Abstract: We have developed a hidden-Markov-model-based 5000-word speaker-independent continuous speech recognizer on a Field-Programmable Gate Array (FPGA). Feature extraction runs in software on a soft-core CPU, while the emission probability computation and the Viterbi beam search are implemented as parallel, pipelined hardware blocks. To reduce the bandwidth requirement on external DRAM, we employed bit-width reduction of the Gaussian parameters, multi-block computation of the emission probability, and two-stage language model pruning. These optimizations reduce the memory bandwidth required for emission probability computation and inter-word transitions by 81% and 44%, respectively. The speech recognition hardware was synthesized for the Virtex-4 FPGA and operates at 100 MHz. Experiments on the Wall Street Journal 5k-vocabulary task show that the system runs 1.52 times faster than real time.
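Bit-width reduction of the Gaussian parameters, used in both FPGA recognizers above, trades a little precision for a large cut in DRAM bandwidth (e.g. 8-bit integers instead of 32-bit floats). A minimal linear-quantization sketch, with all names and the uniform-scale scheme assumed rather than taken from the papers:

```python
import numpy as np

def quantize_params(params, bits=8):
    """Linearly quantize an array of Gaussian parameters to `bits`-bit
    unsigned integers.  Returns (codes, offset, scale) so the values can
    be reconstructed as codes * scale + offset."""
    lo, hi = params.min(), params.max()
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    dtype = np.uint8 if bits <= 8 else np.uint16
    codes = np.round((params - lo) / scale).astype(dtype)
    return codes, lo, scale

def dequantize_params(codes, offset, scale):
    """Reconstruct approximate parameter values from integer codes."""
    return codes.astype(np.float64) * scale + offset
```

The rounding error per value is at most half a quantization step, which is why modest bit widths can leave recognition accuracy nearly unchanged while shrinking each parameter fetch by 4x or more.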
Milind Mahajan - One of the best experts on this subject based on the ideXlab platform.
-
Microsoft Windows Highly Intelligent Speech Recognizer: Whisper
International Conference on Acoustics, Speech and Signal Processing, 1995
Co-Authors: Xuedong Huang, Mei-Yuh Hwang, Alejandro Acero, F. Alleva, Li Jiang, Milind Mahajan
Abstract: Since January 1993, the authors have been working to refine and extend Sphinx-II technologies in order to develop practical speech recognition at Microsoft. The result of that work is Whisper (Windows Highly Intelligent Speech Recognizer). Whisper offers significantly improved recognition efficiency, usability, and accuracy compared with the Sphinx-II system. In addition, Whisper provides speech input capabilities for Microsoft Windows and can be scaled to different PC platform configurations. It supports continuous speech recognition, speaker independence, on-line adaptation, noise robustness, and dynamic vocabularies and grammars. For typical Windows command-and-control applications (fewer than 1000 words), Whisper provides a software-only solution on PCs equipped with a 486DX, 4 MB of memory, a standard sound card, and a desktop microphone.
T.b. Hughes - One of the best experts on this subject based on the ideXlab platform.
-
Performance of an HMM Speech Recognizer using a real-time tracking microphone array as input
IEEE Transactions on Speech and Audio Processing, 1999
Co-Authors: T. B. Hughes, Hongseok Kim, J. H. DiBiase, Harvey F. Silverman
Abstract: This correspondence reports results for a tracking, real-time microphone array used as input to a hidden Markov model based (HMM-based) connected alpha-digits speech recognizer. For a talker in the near field of the array (within 0.5 m), performance approaches that of a close-talking microphone input device.
-
Using a Real-Time Tracking Microphone Array as Input to an HMM Speech Recognizer
International Conference on Acoustics, Speech and Signal Processing, 1998
Co-Authors: T. B. Hughes, Hongseok Kim, J. H. DiBiase, Harvey F. Silverman
Abstract: A major problem for speech recognition systems is relieving the talker of the need to use a close-talking, head-mounted, or desk-stand microphone. A likely solution is an array of microphones that can steer itself to the talker and apply a beamforming algorithm to overcome the reduced signal-to-noise ratio caused by room acoustics. This paper reports results for a tracking, real-time microphone array used as input to an HMM-based connected alpha-digits speech recognizer. For a talker in the very near field of the array (within a meter), performance approaches that of a close-talking microphone input device. The effects of both the noise-reducing steered array and a maximum a posteriori (MAP) training step are shown to be significant. Here, the array system and the recognizer are described, experiments are presented, and the implications of combining the two systems are discussed.
Kuldip K. Paliwal - One of the best experts on this subject based on the ideXlab platform.
-
ICASSP (2) - Use of temporal correlation between successive frames in a hidden Markov model based Speech Recognizer
IEEE International Conference on Acoustics, Speech and Signal Processing, 1993
Co-Authors: Kuldip K. Paliwal
Abstract: The temporal correlation between successive frames is incorporated into an HMM (hidden Markov model) based speech recognizer. This is done by making the probability of the current observation vector depend on the previous observation vectors. Preliminary results show that this approach provides a significant improvement in recognition performance, even when only the temporal correlation between two successive frames is used.
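One common way to make the current observation's probability depend on the previous frame, as the abstract describes, is a conditional Gaussian whose mean is shifted by a linear function of the preceding observation. The abstract does not give Paliwal's exact parameterization, so the linear-predictive form and all names below are assumptions:

```python
import numpy as np

def cond_log_prob(o_t, o_prev, mean, A, inv_var, log_const):
    """Log p(o_t | o_{t-1}) for a conditional diagonal Gaussian whose
    mean is shifted by a linear function of the previous frame:

        mu_t = mean + A @ o_prev

    Setting A = 0 recovers the standard frame-independent HMM output
    distribution; a nonzero A captures inter-frame temporal correlation.
    log_const is the precomputed Gaussian normalization term.
    """
    mu_t = mean + A @ o_prev
    d = o_t - mu_t
    return log_const - 0.5 * np.sum(d * d * inv_var)
```

In a recognizer this score simply replaces the usual emission probability inside the Viterbi recursion; the search itself is unchanged because the dependence only reaches one frame back.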
-
ICASSP - Lexicon-building methods for an acoustic sub-word based Speech Recognizer
International Conference on Acoustics, Speech and Signal Processing
Co-Authors: Kuldip K. Paliwal
Abstract: The use of an acoustic subword unit (ASWU) based speech recognition system for the recognition of isolated words is discussed. Methods are proposed for generating both deterministic and statistical word lexicons. It is shown that applying a modified k-means algorithm to the likelihoods derived through the Viterbi algorithm yields the best deterministic word lexicon. However, the ASWU-based speech recognizer performs better with the statistical word lexicon than with the deterministic one. Improving the design of the word lexicon considerably narrows the gap between the recognition performance of whole-word-unit (WWU) based and ASWU-based speech recognizers, and further gains are expected from better lexicon design.
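The "modified k-means on Viterbi likelihoods" step can be pictured as clustering training tokens of a word by their log-likelihood profiles under candidate sub-word model sequences, with each resulting cluster contributing one lexicon entry. This is a loose sketch of the clustering idea only; the paper's actual modification to k-means is not described in the abstract, and all names here are invented:

```python
import numpy as np

def kmeans_likelihood(ll, k, iters=20, seed=0):
    """Cluster tokens by their Viterbi log-likelihood profiles.

    ll: (n_tokens, n_models) matrix where ll[i, j] is the log-likelihood
    of training token i under candidate sub-word model sequence j.
    Tokens with similar profiles are grouped; each cluster would yield
    one pronunciation entry in the word lexicon.
    """
    rng = np.random.default_rng(seed)
    centers = ll[rng.choice(len(ll), k, replace=False)]
    for _ in range(iters):
        # Assign each token to the nearest cluster center in profile space.
        d = np.linalg.norm(ll[:, None, :] - centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        # Recompute each non-empty cluster's center.
        for j in range(k):
            if np.any(assign == j):
                centers[j] = ll[assign == j].mean(axis=0)
    return assign, centers
```

A deterministic lexicon would keep one representative sequence per cluster, while a statistical lexicon would keep the cluster proportions as pronunciation probabilities, matching the deterministic/statistical distinction in the abstract.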