Recognition Rate

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The experts below are selected from a list of 117,516 experts worldwide, ranked by the ideXlab platform

Atsushi Nakamura - One of the best experts on this subject based on the ideXlab platform.

  • ICASSP - Discriminative Recognition Rate estimation for N-best list and its application to N-best rescoring
    2013 IEEE International Conference on Acoustics Speech and Signal Processing, 2013
    Co-Authors: Atsunori Ogawa, Takaaki Hori, Atsushi Nakamura
    Abstract:

    Techniques for estimating recognition rates without using reference transcriptions are essential for judging whether speech recognition technology is applicable to a new task. We previously proposed a discriminative recognition rate estimation (DRRE) method for 1-best recognition hypotheses and experimentally showed its good estimation performance. In this paper, we extend DRRE to N-best lists of recognition hypotheses by modifying its feature extraction procedures and by efficiently selecting N-best hypotheses for its discriminative model training. In addition, we apply the extended DRRE to N-best rescoring. In the experiments, the extended DRRE also showed good estimation performance on N-best lists, and N-best rescoring using the estimated recognition rates significantly improved the 1-best word accuracy over the baseline.
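
The rescoring step described above can be sketched as follows. This is a minimal illustration, not the paper's DRRE model: the `estimated_accuracy` values stand in for the discriminative estimator's output, and the interpolation weight and toy hypotheses are invented.

```python
# Sketch of N-best rescoring driven by estimated recognition rates.
# `estimated_accuracy` stands in for a DRRE-style estimator's output;
# the interpolation weight and hypothesis scores are illustrative.

def rescore_nbest(nbest, weight=0.5):
    """Pick the hypothesis maximizing an interpolation of the ASR
    score and the (reference-free) estimated word accuracy."""
    def combined(hyp):
        return (1 - weight) * hyp["asr_score"] + weight * hyp["estimated_accuracy"]
    return max(nbest, key=combined)

nbest = [
    {"text": "recognize speech",   "asr_score": 0.62, "estimated_accuracy": 0.95},
    {"text": "wreck a nice beach", "asr_score": 0.65, "estimated_accuracy": 0.40},
]
best = rescore_nbest(nbest)
print(best["text"])  # the accuracy estimate overrides the slightly higher ASR score
```

Here the second hypothesis has the better raw ASR score, but the estimated accuracy flips the ranking, which is the effect the rescoring exploits.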

  • SLT - Recognition Rate estimation based on word alignment network and discriminative error type classification
    2012 IEEE Spoken Language Technology Workshop (SLT), 2012
    Co-Authors: Atsunori Ogawa, Takaaki Hori, Atsushi Nakamura
    Abstract:

    Techniques for estimating recognition rates without using reference transcriptions are essential for judging whether speech recognition technology is applicable to a new task. This paper proposes two recognition rate estimation methods for continuous speech recognition. The first is an easy-to-use method based on a word alignment network (WAN) obtained from a word confusion network through simple conversion procedures. A WAN contains the correct (C), substitution error (S), insertion error (I) and deletion error (D) probabilities word by word for a recognition result. By summing these CSID probabilities individually, the percent correct and word accuracy (WACC) can be estimated without using a reference transcription. The second, more advanced method refines the CSID probabilities provided by a WAN based on discriminative error type classification (ETC) and estimates the recognition rates more accurately. In experiments on the MIT lecture speech corpus, we obtained a correlation coefficient of 0.97 between the true WACCs calculated by a scoring tool using reference transcriptions and the WACCs estimated from the discriminative ETC results.
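
The CSID-summation idea can be sketched in a few lines. The probability values below are made up, and approximating the reference length as C + S + D (insertions contribute no reference word) is our reading of the standard accuracy definitions, not a formula quoted from the paper.

```python
# Estimating percent correct and word accuracy from per-word CSID
# probabilities, as read off a word alignment network (WAN).

def estimate_rates(csid):
    """csid: list of dicts with C/S/I/D probabilities per word slot.
    Expected counts are the sums of the per-slot probabilities; the
    reference length is approximated by C + S + D."""
    C = sum(w["C"] for w in csid)
    S = sum(w["S"] for w in csid)
    I = sum(w["I"] for w in csid)
    D = sum(w["D"] for w in csid)
    n_ref = C + S + D
    percent_correct = 100.0 * C / n_ref
    word_accuracy = 100.0 * (C - I) / n_ref
    return percent_correct, word_accuracy

csid = [
    {"C": 0.9, "S": 0.1, "I": 0.0, "D": 0.0},
    {"C": 0.6, "S": 0.2, "I": 0.1, "D": 0.1},
    {"C": 0.8, "S": 0.0, "I": 0.2, "D": 0.0},
]
pc, wacc = estimate_rates(csid)
```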

Atsunori Ogawa - One of the best experts on this subject based on the ideXlab platform.

  • ICASSP - ASR error detection and Recognition Rate estimation using deep bidirectional recurrent neural networks
    2015 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2015
    Co-Authors: Atsunori Ogawa, Takaaki Hori
    Abstract:

    Recurrent neural networks (RNNs) have recently been applied as classifiers for sequential labeling problems. In this paper, deep bidirectional RNNs (DBRNNs) are applied for the first time to error detection in automatic speech recognition (ASR), which is a sequential labeling problem. We investigate three ASR error detection tasks: confidence estimation, out-of-vocabulary word detection and error type classification. We also estimate recognition rates from the error type classification results. Experimental results show that the DBRNNs greatly outperform conditional random fields (CRFs), especially in detecting infrequent error labels. The DBRNNs also slightly outperform the CRFs in recognition rate estimation. In addition, experiments with reduced training data suggest that the DBRNNs generalize better than the CRFs owing to their word vector representation in a low-dimensional continuous space. As a result, DBRNNs trained on only 20% of the training data show higher error detection performance than CRFs trained on the full training data.
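
The final step, turning a classifier's per-word error-type labels into recognition rates, can be sketched directly. The label sequence is invented, and treating the reference length as the count of C + S + D labels is our assumption based on the standard word accuracy definition.

```python
# Converting a per-word error-type label sequence (a classifier's
# C/S/I/D output) into estimated recognition rates.

from collections import Counter

def rates_from_labels(labels):
    n = Counter(labels)
    n_ref = n["C"] + n["S"] + n["D"]   # insertions add no reference word
    percent_correct = 100.0 * n["C"] / n_ref
    word_accuracy = 100.0 * (n["C"] - n["I"]) / n_ref
    return percent_correct, word_accuracy

labels = ["C", "C", "S", "C", "I", "C", "D"]
pc, wacc = rates_from_labels(labels)
```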

  • ICASSP - Discriminative Recognition Rate estimation for N-best list and its application to N-best rescoring
    2013 IEEE International Conference on Acoustics Speech and Signal Processing, 2013
    Co-Authors: Atsunori Ogawa, Takaaki Hori, Atsushi Nakamura
    Abstract:

    Techniques for estimating recognition rates without using reference transcriptions are essential for judging whether speech recognition technology is applicable to a new task. We previously proposed a discriminative recognition rate estimation (DRRE) method for 1-best recognition hypotheses and experimentally showed its good estimation performance. In this paper, we extend DRRE to N-best lists of recognition hypotheses by modifying its feature extraction procedures and by efficiently selecting N-best hypotheses for its discriminative model training. In addition, we apply the extended DRRE to N-best rescoring. In the experiments, the extended DRRE also showed good estimation performance on N-best lists, and N-best rescoring using the estimated recognition rates significantly improved the 1-best word accuracy over the baseline.

  • SLT - Recognition Rate estimation based on word alignment network and discriminative error type classification
    2012 IEEE Spoken Language Technology Workshop (SLT), 2012
    Co-Authors: Atsunori Ogawa, Takaaki Hori, Atsushi Nakamura
    Abstract:

    Techniques for estimating recognition rates without using reference transcriptions are essential for judging whether speech recognition technology is applicable to a new task. This paper proposes two recognition rate estimation methods for continuous speech recognition. The first is an easy-to-use method based on a word alignment network (WAN) obtained from a word confusion network through simple conversion procedures. A WAN contains the correct (C), substitution error (S), insertion error (I) and deletion error (D) probabilities word by word for a recognition result. By summing these CSID probabilities individually, the percent correct and word accuracy (WACC) can be estimated without using a reference transcription. The second, more advanced method refines the CSID probabilities provided by a WAN based on discriminative error type classification (ETC) and estimates the recognition rates more accurately. In experiments on the MIT lecture speech corpus, we obtained a correlation coefficient of 0.97 between the true WACCs calculated by a scoring tool using reference transcriptions and the WACCs estimated from the discriminative ETC results.

Takaaki Hori - One of the best experts on this subject based on the ideXlab platform.

  • ICASSP - ASR error detection and Recognition Rate estimation using deep bidirectional recurrent neural networks
    2015 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2015
    Co-Authors: Atsunori Ogawa, Takaaki Hori
    Abstract:

    Recurrent neural networks (RNNs) have recently been applied as classifiers for sequential labeling problems. In this paper, deep bidirectional RNNs (DBRNNs) are applied for the first time to error detection in automatic speech recognition (ASR), which is a sequential labeling problem. We investigate three ASR error detection tasks: confidence estimation, out-of-vocabulary word detection and error type classification. We also estimate recognition rates from the error type classification results. Experimental results show that the DBRNNs greatly outperform conditional random fields (CRFs), especially in detecting infrequent error labels. The DBRNNs also slightly outperform the CRFs in recognition rate estimation. In addition, experiments with reduced training data suggest that the DBRNNs generalize better than the CRFs owing to their word vector representation in a low-dimensional continuous space. As a result, DBRNNs trained on only 20% of the training data show higher error detection performance than CRFs trained on the full training data.

  • ICASSP - Discriminative Recognition Rate estimation for N-best list and its application to N-best rescoring
    2013 IEEE International Conference on Acoustics Speech and Signal Processing, 2013
    Co-Authors: Atsunori Ogawa, Takaaki Hori, Atsushi Nakamura
    Abstract:

    Techniques for estimating recognition rates without using reference transcriptions are essential for judging whether speech recognition technology is applicable to a new task. We previously proposed a discriminative recognition rate estimation (DRRE) method for 1-best recognition hypotheses and experimentally showed its good estimation performance. In this paper, we extend DRRE to N-best lists of recognition hypotheses by modifying its feature extraction procedures and by efficiently selecting N-best hypotheses for its discriminative model training. In addition, we apply the extended DRRE to N-best rescoring. In the experiments, the extended DRRE also showed good estimation performance on N-best lists, and N-best rescoring using the estimated recognition rates significantly improved the 1-best word accuracy over the baseline.

  • SLT - Recognition Rate estimation based on word alignment network and discriminative error type classification
    2012 IEEE Spoken Language Technology Workshop (SLT), 2012
    Co-Authors: Atsunori Ogawa, Takaaki Hori, Atsushi Nakamura
    Abstract:

    Techniques for estimating recognition rates without using reference transcriptions are essential for judging whether speech recognition technology is applicable to a new task. This paper proposes two recognition rate estimation methods for continuous speech recognition. The first is an easy-to-use method based on a word alignment network (WAN) obtained from a word confusion network through simple conversion procedures. A WAN contains the correct (C), substitution error (S), insertion error (I) and deletion error (D) probabilities word by word for a recognition result. By summing these CSID probabilities individually, the percent correct and word accuracy (WACC) can be estimated without using a reference transcription. The second, more advanced method refines the CSID probabilities provided by a WAN based on discriminative error type classification (ETC) and estimates the recognition rates more accurately. In experiments on the MIT lecture speech corpus, we obtained a correlation coefficient of 0.97 between the true WACCs calculated by a scoring tool using reference transcriptions and the WACCs estimated from the discriminative ETC results.

Yang Jian - One of the best experts on this subject based on the ideXlab platform.

  • Effective Way of Improving the Recognition Rate of License Plate’s First Character
    Computer Science, 2013
    Co-Authors: Yang Jian
    Abstract:

    This paper offers a new feature extraction method for addressing the low Chinese character recognition rate caused by poor-quality Chinese character images in license plate recognition systems. First, the segmented binary Chinese character image is divided into blocks. Second, three stroke-pixel feature components are extracted for each block: the proportion of stroke pixels in the block, the divergence, and the centroid. Third, the new feature extraction method is combined with an SVM classifier, yielding a group of robust classifiers. The experimental results show that the Chinese character recognition rate can be improved greatly.
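
Two of the three block features above, the stroke-pixel proportion and the centroid, can be sketched on a toy binary image. The image, block layout and sizes are invented; the divergence component is omitted here since the paper does not define it in the abstract.

```python
# Block-wise stroke-pixel features for a binary character image:
# the proportion of stroke pixels and the stroke centroid per block.

def block_features(img, rows, cols):
    h, w = len(img), len(img[0])
    bh, bw = h // rows, w // cols
    feats = []
    for br in range(rows):
        for bc in range(cols):
            # collect stroke (value 1) pixel coordinates in this block
            pixels = [(r, c)
                      for r in range(br * bh, (br + 1) * bh)
                      for c in range(bc * bw, (bc + 1) * bw)
                      if img[r][c] == 1]
            proportion = len(pixels) / (bh * bw)
            if pixels:
                cy = sum(r for r, _ in pixels) / len(pixels)
                cx = sum(c for _, c in pixels) / len(pixels)
            else:
                cy = cx = 0.0
            feats.append((proportion, cy, cx))
    return feats

img = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
feats = block_features(img, 2, 2)  # four 2x2 blocks
```

The resulting per-block tuples would then be concatenated into the feature vector fed to the SVM classifier.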

  • Research and Realization of the Algorithm of Matching Fingerprint Based on Congruent Triangle
    Information and Electronic Engineering, 2008
    Co-Authors: Yang Jian
    Abstract:

    Automatic Fingerprint Identification Systems (AFIS) are widely used for identification because they are convenient, fast and accurate. To improve the fingerprint recognition rate, a new way of obtaining reference points within the traditional triangular matching framework is presented, and fingerprint features are matched in combination with a variable bounding box. The method achieves fingerprint feature matching with a high recognition rate.
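
The core congruence test behind triangle-based minutiae matching can be sketched as comparing sorted side lengths within a tolerance. The minutiae coordinates and tolerance below are illustrative; the paper's reference-point selection and variable bounding box are not reproduced.

```python
# Minimal congruence test for minutiae triangles: two triangles match
# when their sorted side lengths agree within a tolerance.

import math

def side_lengths(tri):
    a, b, c = tri
    return sorted([math.dist(a, b), math.dist(b, c), math.dist(c, a)])

def congruent(t1, t2, tol=2.0):
    return all(abs(x - y) <= tol
               for x, y in zip(side_lengths(t1), side_lengths(t2)))

t1 = [(0, 0), (4, 0), (0, 3)]
t2 = [(10, 10), (10, 13), (14, 10)]  # same 3-4-5 triangle, moved and reflected
print(congruent(t1, t2))  # True
```

Sorting the side lengths makes the test invariant to translation, rotation and reflection, which is what makes triangles of minutiae useful as alignment-free local descriptors.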

Prakasith Kayasith - One of the best experts on this subject based on the ideXlab platform.

  • Speech Clarity Index (ψ): A Distance-Based Speech Quality Indicator and Recognition Rate Prediction for Dysarthric Speakers with Cerebral Palsy
    IEICE Transactions on Information and Systems, 2009
    Co-Authors: Prakasith Kayasith, Thanaruk Theeramunkong
    Abstract:

    Manually evaluating a dysarthric speaker's speech with standard perception-based assessment methods is a tedious and subjective way to measure the severity of dysarthria. This paper presents an automated approach to assessing the speech quality of a dysarthric speaker with cerebral palsy. Considering two complementary factors, speech consistency and speech distinction, a speech quality indicator called the speech clarity index (ψ) is proposed as a measure of a speaker's ability to produce consistent speech signals for a given word and distinguishable speech signals for different words. As an application, it can be used to assess speech quality and forecast the speech recognition rate for an individual dysarthric speaker before an automatic speech recognition system is exhaustively implemented for that speaker. The effectiveness of ψ as a recognition rate predictor is evaluated by rank-order inconsistency, correlation coefficient, and root mean square of difference. The evaluations compare its predicted recognition rates with those predicted by the standard articulatory and intelligibility tests on two recognition systems (HMM and ANN). The results show that ψ is a promising indicator for predicting the recognition rate of dysarthric speech. All experiments were conducted on a speech corpus composed of speech data from eight normal speakers and eight dysarthric speakers.
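
A distance-based sketch in the spirit of the two factors above: intra-word distances capture consistency, inter-word distances capture distinction, and their ratio grows for clearer speakers. The feature vectors, Euclidean distance and ratio form are simplifying assumptions for illustration, not the paper's actual definition of ψ.

```python
# Toy clarity score: ratio of between-word distance (distinction)
# to within-word distance (consistency). Higher = clearer.

import math
from itertools import combinations

def mean_pairwise(vectors):
    pairs = list(combinations(vectors, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

def clarity(word_to_utterances):
    # consistency: average distance between repetitions of the same word
    intra = sum(mean_pairwise(v) for v in word_to_utterances.values())
    intra /= len(word_to_utterances)
    # distinction: average distance between different words' mean vectors
    def mean_vec(vs):
        return tuple(sum(x) / len(vs) for x in zip(*vs))
    means = [mean_vec(v) for v in word_to_utterances.values()]
    inter = mean_pairwise(means)
    return inter / (intra + 1e-9)

speaker = {
    "yes": [(1.0, 0.0), (1.1, 0.1)],  # hypothetical 2-D feature vectors
    "no":  [(5.0, 4.0), (5.2, 3.9)],
}
score = clarity(speaker)  # well separated, repeatable words -> large score
```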

  • FinTAL - Speech confusion index (Ø): a Recognition Rate indicator for dysarthric speakers
    Advances in Natural Language Processing, 2006
    Co-Authors: Prakasith Kayasith, Thanaruk Theeramunkong, Nuttakorn Thubthong
    Abstract:

    This paper presents an automated method for assessing the speech quality of a dysarthric speaker, instead of traditional manual methods that are laborious and subjective. The assessment result is also a good indicator for predicting the speech recognition accuracy the speaker can obtain from current speech technology. The so-called speech confusion index (Ø) is proposed to measure the severity of a speech disorder. Based on the dynamic time warping (DTW) technique with an adaptive slope constraint and an accumulated mismatch score, Ø is developed as a measure of the difference between two speech signals. Compared with the manual methods, i.e. the articulatory and intelligibility tests, the proposed indicator was shown to be more predictive of the recognition rates obtained from HMM and ANN systems. The evaluation was done in terms of three measures: root-mean-square difference, correlation coefficient and rank-order inconsistency. The experimental results on the control set showed that Ø achieved better prediction than both the articulatory and intelligibility tests, with average improvements of 9.56% and 7.86%, respectively.
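
The DTW distance underlying Ø can be sketched in its textbook form. This version omits the adaptive slope constraint the paper adds, and the integer sequences stand in for frame-level acoustic features.

```python
# Plain dynamic time warping distance between two feature sequences.

def dtw(a, b):
    INF = float("inf")
    n, m = len(a), len(b)
    # D[i][j]: minimal accumulated cost aligning a[:i] with b[:j]
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # deletion
                                 D[i][j - 1],      # insertion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

print(dtw([1, 2, 3], [1, 2, 2, 3]))  # 0.0: the warp absorbs the repeated frame
```

The accumulated cell values play the role of the accumulated mismatch score: the smaller the final cost, the more similar the two signals after optimal time alignment.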

  • Recognition Rate prediction for dysarthric speech disorder via speech consistency score
    Lecture Notes in Computer Science, 2006
    Co-Authors: Prakasith Kayasith, Thanaruk Theeramunkong, Nuttakorn Thubthong
    Abstract:

    Dysarthria is a collection of motor speech disorders. The severity of dysarthria is traditionally evaluated by human experts or a group of listeners. This paper proposes a new indicator called the speech consistency score (SCS). By considering speech similarity-dissimilarity relations, SCS can be applied to evaluate the severity of a dysarthric speaker's disorder. Aside from serving as a speech assessment tool, SCS can also predict the likely outcome of speech recognition. A number of experiments compare the recognition rates predicted by SCS with the recognition rates of two well-known recognition systems, HMM and ANN. The results show that the root mean square errors between the predicted and actual recognition rates are less than 7.0% (R² = 0.74) and 2.5% (R² = 0.96) for HMM and ANN, respectively. Moreover, a test of SCS on an unknown recognition set showed an error of 11% (R² = 0.48) for HMM.
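
The two agreement measures reported above, root mean square error and R², can be computed as follows. The rate values are made up for illustration, not the paper's data.

```python
# RMSE and coefficient of determination (R^2) between predicted and
# actual recognition rates, the two measures quoted in the abstract.

def rmse(pred, actual):
    return (sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred)) ** 0.5

def r_squared(pred, actual):
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(pred, actual))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

pred   = [82.0, 64.0, 91.0, 55.0]  # hypothetical SCS-predicted rates (%)
actual = [80.0, 66.0, 93.0, 52.0]  # hypothetical measured rates (%)
err = rmse(pred, actual)
fit = r_squared(pred, actual)
```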