Feature Sequence

The Experts below are selected from a list of 270 Experts worldwide ranked by ideXlab platform

Jeih-weih Hung - One of the best experts on this subject based on the ideXlab platform.

  • Suppression by Selecting Wavelets for Feature Compression in Distributed Speech Recognition
    IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018
    Co-Authors: Syu-siang Wang, Payton Lin, Yu Tsao, Jeih-weih Hung
    Abstract:

    Distributed speech recognition (DSR) splits the processing of data between a mobile device and a network server. In the front-end, features are extracted and compressed for transmission over a wireless channel to a back-end server, where the incoming stream is received and reconstructed for recognition tasks. In this paper, we propose a feature compression algorithm termed suppression by selecting wavelets (SSW) to achieve the two main goals of DSR: minimizing memory and device requirements while maintaining or even improving recognition performance. The SSW approach first applies the discrete wavelet transform (DWT) to filter the incoming speech feature sequence into two temporal subsequences at the client terminal. Feature compression is achieved by keeping the low-modulation-frequency subsequence while discarding its high-frequency counterpart. The low-frequency subsequence is then transmitted across the remote network for specific feature statistics normalization. Wavelets are favorable for resolving the temporal properties of the feature sequence, and the down-sampling process in the DWT achieves data compression by reducing the amount of data at the terminal prior to transmission across the network. Once the compressed features have arrived at the server, the feature sequence can be enhanced by statistics normalization, reconstructed with the inverse DWT, and compensated with a simple post filter to alleviate any over-smoothing effects from the compression stage. Results on a standard robustness task (Aurora-4) and on a Mandarin Chinese news corpus showed that SSW outperforms conventional noise-robustness techniques while also providing nearly a 50% compression rate during the transmission stage of DSR systems.
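
The client/server split described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Haar wavelet, the function names `ssw_client` and `ssw_server`, and the zero-filled high band are assumptions, and the statistics normalization and post filter mentioned in the abstract are left out.

```python
import math

def haar_dwt(x):
    """One-level Haar DWT of a list: returns (low, high) half-length subsequences."""
    if len(x) % 2:                      # odd length: repeat the last frame (assumption)
        x = x + [x[-1]]
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high

def haar_idwt(low, high):
    """Inverse of haar_dwt for equal-length low/high subsequences."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def ssw_client(feature_track):
    """Client side: keep only the low modulation-frequency subsequence."""
    low, _discarded_high = haar_dwt(feature_track)
    return low

def ssw_server(low):
    """Server side: rebuild a full-length track, treating the discarded band as zeros."""
    return haar_idwt(low, [0.0] * len(low))

track = [1.0, 3.0, 2.0, 2.0, 5.0, 7.0]   # one feature dimension over six frames (toy data)
sent = ssw_client(track)                 # three values are transmitted instead of six
rebuilt = ssw_server(sent)
```

With a single decomposition level, the transmitted subsequence holds half as many values per feature dimension as the original track, which is in line with the roughly 50% compression the abstract reports.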

  • ICCA - Leveraging threshold denoising on DCT-based modulation spectrum for noise robust speech recognition
    11th IEEE International Conference on Control & Automation (ICCA), 2014
    Co-Authors: Yen-chih Cheng, Jun-shan Lin, Jeih-weih Hung
    Abstract:

    This paper presents a novel noise robustness algorithm to enhance speech features in noisy speech recognition. In the presented algorithm, the temporal speech feature sequence is first converted to its spectrum via the discrete cosine transform (DCT), and then the DCT spectrum is compensated by a thresholding function in order to further shrink the smaller portion. Finally, the updated DCT spectrum is converted back to the temporal domain to obtain the new feature sequence. One advantage of the presented method is that the overall compensation process is unsupervised, in the sense that no information about the noise embedded in the speech signals is required. The evaluation on the Aurora-2 connected digit database and task revealed that the presented method can provide significant improvement in recognition accuracy for speech features pre-processed by any of the statistics normalization algorithms, including cepstral mean and variance normalization (CMVN), MVN plus ARMA filtering (MVA), and cepstral gain normalization (CGN). We further showed that, with the presented method, compensating only the low-frequency portion gives performance on a par with that achieved by compensation over the entire frequency band.
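
The three steps in this abstract (DCT, thresholding, inverse DCT) can be sketched as below. The abstract does not specify the thresholding function, so soft thresholding is used here as an assumption; the threshold value `thr` and all function names are hypothetical.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a feature track (its modulation spectrum)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n)
                               for t in range(n)))
    return out

def idct(c):
    """Inverse of dct (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for t in range(n):
        out.append(sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c[k]
                       * math.cos(math.pi * (t + 0.5) * k / n)
                       for k in range(n)))
    return out

def soft_threshold(coeffs, thr):
    """Shrink every coefficient toward zero by thr; small ones become zero."""
    return [math.copysign(max(abs(v) - thr, 0.0), v) for v in coeffs]

def denoise_track(track, thr):
    """DCT -> threshold -> inverse DCT, with no noise estimate required."""
    return idct(soft_threshold(dct(track), thr))

smoothed = denoise_track([3.0, 3.2, 2.9, 3.1, 3.0, 2.8], thr=0.2)
```

Restricting `soft_threshold` to the first few coefficients would correspond to the low-frequency-only variant mentioned at the end of the abstract.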

  • ICASSP - Filtering on the temporal probability sequence in histogram equalization for robust speech recognition
    2013 IEEE International Conference on Acoustics Speech and Signal Processing, 2013
    Co-Authors: Syu-siang Wang, Yu Tsao, Jeih-weih Hung
    Abstract:

    In this paper, we propose a filter-based histogram equalization (FHEQ) approach for robust speech recognition. The FHEQ approach first represents the original acoustic feature sequence with statistical probabilities. Then, a temporal average (TA) filter is applied to smooth the probability sequence. Finally, the filtered probability sequence is transformed to form a new acoustic feature stream. Filtering on the statistical probability of a feature sequence is a novel concept that can combine the advantages of conventional histogram equalization (HEQ) and temporal filtering techniques for better noise robustness. Our experimental results on the Aurora-2 and Aurora-4 tasks show that FHEQ outperforms conventional cepstral mean subtraction (CMS), cepstral mean and variance normalization (CMVN), and HEQ. Furthermore, we conducted a comparison test on TA-HEQ and HEQ-TA, which apply a TA filter to smooth the acoustic features before and after the HEQ processing, respectively. The test results show that FHEQ outperforms both TA-HEQ and HEQ-TA, suggesting that filtering in the probability domain is more effective than filtering in the acoustic feature domain.
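
A minimal sketch of the FHEQ pipeline described in this abstract is given below. It is an illustration under stated assumptions rather than the authors' method: the probabilities are taken from within-utterance ranks, the reference distribution is a standard normal, and the filter is a plain moving average whose window shrinks at the edges. All function names are hypothetical.

```python
from statistics import NormalDist

def rank_probabilities(track):
    """Map each frame to its empirical CDF value, (rank + 0.5) / T."""
    order = sorted(range(len(track)), key=lambda t: track[t])
    probs = [0.0] * len(track)
    for rank, t in enumerate(order):
        probs[t] = (rank + 0.5) / len(track)
    return probs

def temporal_average(seq, half_width):
    """Moving average over 2 * half_width + 1 frames, truncated at the edges."""
    out = []
    for t in range(len(seq)):
        window = seq[max(0, t - half_width): t + half_width + 1]
        out.append(sum(window) / len(window))
    return out

def fheq(track, half_width=1):
    """Features -> probabilities -> smoothed probabilities -> new features."""
    probs = temporal_average(rank_probabilities(track), half_width)
    reference = NormalDist()            # assumed reference distribution
    return [reference.inv_cdf(p) for p in probs]

new_track = fheq([0.2, 5.0, 1.0, 3.0], half_width=1)
```

Setting `half_width=0` skips the smoothing and reduces the sketch to plain HEQ, which makes the contrast with TA-HEQ and HEQ-TA easy to see: there the averaging would be applied to the feature values instead of to the probabilities.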

  • Subband Feature Statistics Normalization Techniques Based on a Discrete Wavelet Transform for Robust Speech Recognition
    IEEE Signal Processing Letters, 2009
    Co-Authors: Jeih-weih Hung, Hao-teng Fan
    Abstract:

    This letter proposes a novel scheme that applies feature statistics normalization techniques for robust speech recognition. In the proposed approach, the processed temporal-domain feature sequence is first decomposed into nonuniform subbands using the discrete wavelet transform (DWT), and then each subband stream is individually processed by well-known normalization methods, such as mean and variance normalization (MVN) and histogram equalization (HEQ). Finally, we reconstruct the feature stream with all of the modified subband streams using the inverse DWT. With this process, the components that correspond to more important modulation spectral bands in the feature sequence can be processed separately.
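
The decompose, normalize, and reconstruct procedure in this abstract can be sketched as follows. The Haar wavelet, the two-level decomposition, the requirement that the track length be divisible by `2 ** levels`, and the names `subband_process` and `mvn` are assumptions made for this illustration; the letter itself does not fix them here.

```python
import math
from statistics import fmean, pstdev

def haar_dwt(x):
    """One-level Haar DWT of an even-length list: returns (low, high)."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high

def haar_idwt(low, high):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def mvn(band):
    """Mean and variance normalization of one subband stream."""
    mu, sigma = fmean(band), pstdev(band)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in band]

def subband_process(track, normalize=mvn, levels=2):
    """Split into nonuniform subbands, normalize each one, then reconstruct."""
    if len(track) % (2 ** levels):
        raise ValueError("track length must be divisible by 2 ** levels")
    approx, details = list(track), []
    for _ in range(levels):             # repeatedly split the low band only
        approx, high = haar_dwt(approx)
        details.append(high)
    approx = normalize(approx)
    for high in reversed(details):      # rebuild from the coarsest level up
        approx = haar_idwt(approx, normalize(high))
    return approx

out = subband_process([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 9.0])
```

Because only the low band is split again at each level, the resulting subbands have unequal widths, which is what makes the decomposition nonuniform. Passing a different `normalize` function (for example an HEQ routine) swaps the per-subband processing without changing the rest.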

  • ICME - Sub-band feature statistics compensation techniques based on discrete wavelet transform for robust speech recognition
    2009 IEEE International Conference on Multimedia and Expo, 2009
    Co-Authors: Hao-teng Fan, Jeih-weih Hung
    Abstract:

    This paper proposes a novel scheme for performing feature statistics normalization techniques for robust speech recognition. In the proposed approach, the processed temporal-domain feature sequence is first decomposed into non-uniform sub-bands using the discrete wavelet transform (DWT), and then each sub-band stream is individually processed by well-known normalization methods, such as mean and variance normalization (MVN) and histogram equalization (HEQ). Finally, we reconstruct the feature stream with all the modified sub-band streams using the inverse DWT. With this process, the components that correspond to more important modulation spectral bands in the feature sequence can be processed separately. For the Aurora-2 clean-condition training task, the newly proposed sub-band MVN and HEQ provide relative error rate reductions of 20.18% and 19.65% over conventional MVN and HEQ, respectively.

Yu-ling Chang - One of the best experts on this subject based on the ideXlab platform.

  • An Automatic Assessment System for Alzheimer's Disease Based on Speech Using Feature Sequence Generator and Recurrent Neural Network.
    Scientific Reports, 2019
    Co-Authors: Yi-wei Chien, Sheng-yi Hong, Wen-ting Cheah, Li-hung Yao, Yu-ling Chang
    Abstract:

    Alzheimer's disease and other dementias have become the 7th leading cause of death worldwide. Because a cure is still lacking, early detection of the disease is crucial for providing the best intervention. To develop an assessment system for the general public, speech analysis is the optimal solution, since speech richly reflects the speaker's cognitive skills and data collection is relatively inexpensive compared with brain imaging, blood testing, etc. While most of the existing literature extracted statistics-based features and relied on a feature selection process, in this study we propose a novel feature sequence representation and use a data-driven approach, namely a recurrent neural network, to perform classification. The system is also shown to be fully automated, which implies that it can be deployed widely and easily. To validate our study, a series of experiments was conducted with 120 speech samples, and the score in terms of the area under the receiver operating characteristic curve is as high as 0.838.
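
For readers unfamiliar with the evaluation metric quoted above, the area under the receiver operating characteristic curve (AUC) equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it that way; the labels and scores are made-up toy values, not data from the study, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: classifier scores for four speech samples (1 = patient, 0 = control).
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.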

  • An Assessment System for Alzheimer's Disease Based on Speech Using a Novel Feature Sequence Design and Recurrent Neural Network
    Systems Man and Cybernetics, 2018
    Co-Authors: Yi-wei Chien, Sheng-yi Hong, Wen-ting Cheah, Yu-ling Chang
    Abstract:

    Alzheimer's disease and other dementias have become the 7th leading cause of death worldwide. Because a cure is still lacking, early detection of the disease is crucial for providing the best intervention. To develop an assessment system for the general public, speech analysis is the optimal solution, since speech richly reflects the speaker's cognitive skills and data collection is relatively inexpensive. While most of the related studies extracted statistics-based features and relied on a feature selection process, in this paper we propose a novel feature sequence representation and use a recurrent neural network to perform classification. To validate our work, an experiment was conducted with 150 speech samples, and the score in terms of the area under the receiver operating characteristic curve is as high as 0.954, potentially outperforming the current state-of-the-art method.

  • SMC - An Assessment System for Alzheimer's Disease Based on Speech Using a Novel Feature Sequence Design and Recurrent Neural Network
    2018 IEEE International Conference on Systems Man and Cybernetics (SMC), 2018
    Co-Authors: Yi-wei Chien, Sheng-yi Hong, Wen-ting Cheah, Yu-ling Chang
    Abstract:

    Alzheimer's disease and other dementias have become the 7th leading cause of death worldwide. Because a cure is still lacking, early detection of the disease is crucial for providing the best intervention. To develop an assessment system for the general public, speech analysis is the optimal solution, since speech richly reflects the speaker's cognitive skills and data collection is relatively inexpensive. While most of the related studies extracted statistics-based features and relied on a feature selection process, in this paper we propose a novel feature sequence representation and use a recurrent neural network to perform classification. To validate our work, an experiment was conducted with 150 speech samples, and the score in terms of the area under the receiver operating characteristic curve is as high as 0.954, potentially outperforming the current state-of-the-art method.

Yi-wei Chien - One of the best experts on this subject based on the ideXlab platform.

  • An Automatic Assessment System for Alzheimer's Disease Based on Speech Using Feature Sequence Generator and Recurrent Neural Network.
    Scientific Reports, 2019
    Co-Authors: Yi-wei Chien, Sheng-yi Hong, Wen-ting Cheah, Li-hung Yao, Yu-ling Chang
    Abstract:

    Alzheimer's disease and other dementias have become the 7th leading cause of death worldwide. Because a cure is still lacking, early detection of the disease is crucial for providing the best intervention. To develop an assessment system for the general public, speech analysis is the optimal solution, since speech richly reflects the speaker's cognitive skills and data collection is relatively inexpensive compared with brain imaging, blood testing, etc. While most of the existing literature extracted statistics-based features and relied on a feature selection process, in this study we propose a novel feature sequence representation and use a data-driven approach, namely a recurrent neural network, to perform classification. The system is also shown to be fully automated, which implies that it can be deployed widely and easily. To validate our study, a series of experiments was conducted with 120 speech samples, and the score in terms of the area under the receiver operating characteristic curve is as high as 0.838.

  • An Assessment System for Alzheimer's Disease Based on Speech Using a Novel Feature Sequence Design and Recurrent Neural Network
    Systems Man and Cybernetics, 2018
    Co-Authors: Yi-wei Chien, Sheng-yi Hong, Wen-ting Cheah, Yu-ling Chang
    Abstract:

    Alzheimer's disease and other dementias have become the 7th leading cause of death worldwide. Because a cure is still lacking, early detection of the disease is crucial for providing the best intervention. To develop an assessment system for the general public, speech analysis is the optimal solution, since speech richly reflects the speaker's cognitive skills and data collection is relatively inexpensive. While most of the related studies extracted statistics-based features and relied on a feature selection process, in this paper we propose a novel feature sequence representation and use a recurrent neural network to perform classification. To validate our work, an experiment was conducted with 150 speech samples, and the score in terms of the area under the receiver operating characteristic curve is as high as 0.954, potentially outperforming the current state-of-the-art method.

  • SMC - An Assessment System for Alzheimer's Disease Based on Speech Using a Novel Feature Sequence Design and Recurrent Neural Network
    2018 IEEE International Conference on Systems Man and Cybernetics (SMC), 2018
    Co-Authors: Yi-wei Chien, Sheng-yi Hong, Wen-ting Cheah, Yu-ling Chang
    Abstract:

    Alzheimer's disease and other dementias have become the 7th leading cause of death worldwide. Because a cure is still lacking, early detection of the disease is crucial for providing the best intervention. To develop an assessment system for the general public, speech analysis is the optimal solution, since speech richly reflects the speaker's cognitive skills and data collection is relatively inexpensive. While most of the related studies extracted statistics-based features and relied on a feature selection process, in this paper we propose a novel feature sequence representation and use a recurrent neural network to perform classification. To validate our work, an experiment was conducted with 150 speech samples, and the score in terms of the area under the receiver operating characteristic curve is as high as 0.954, potentially outperforming the current state-of-the-art method.

Werner Bailer - One of the best experts on this subject based on the ideXlab platform.

  • A feature sequence kernel for video concept classification
    Conference on Multimedia Modeling, 2011
    Co-Authors: Werner Bailer
    Abstract:

    Kernel methods such as Support Vector Machines are widely applied to classification problems, including concept detection in video. Nonetheless, issues such as modeling specific distance functions of feature descriptors or the temporal sequence of features in the kernel have received comparatively little attention in multimedia research. We review work on kernels for commonly used MPEG-7 visual features and propose a kernel for matching temporal sequences of these features. The sequence kernel is based on ideas from string matching, but does not require discretization of the input feature vectors and deals with partial matches and gaps. Evaluation on the TRECVID 2007 high-level feature extraction data set shows that the sequence kernel clearly outperforms the radial basis function (RBF) kernel and the MPEG-7 visual feature kernels using only single key frames.
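
To make the idea of string-style matching on continuous feature vectors concrete, the sketch below scores two sequences with a local-alignment recurrence in which an RBF similarity between frames replaces exact symbol equality. It is not the kernel proposed in the paper: the recurrence, the parameters `match_thr` and `gap_penalty`, and the function names are assumptions, and a score of this kind is not guaranteed to be positive semidefinite.

```python
import math

def rbf(u, v, gamma=1.0):
    """RBF similarity between two feature vectors, in (0, 1]."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def sequence_similarity(xs, ys, gamma=1.0, match_thr=0.5, gap_penalty=0.25):
    """Best local alignment score between two sequences of feature vectors.

    Frames are compared directly (no discretization); the alignment may cover
    only part of each sequence and may skip frames at a small cost.
    """
    rows, cols = len(xs) + 1, len(ys) + 1
    score = [[0.0] * cols for _ in range(rows)]
    best = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            # Positive only when the two frames are close to each other.
            sim = rbf(xs[i - 1], ys[j - 1], gamma) - match_thr
            score[i][j] = max(0.0,                              # start a new match
                              score[i - 1][j - 1] + sim,        # extend the match
                              score[i - 1][j] - gap_penalty,    # skip a frame of xs
                              score[i][j - 1] - gap_penalty)    # skip a frame of ys
            best = max(best, score[i][j])
    return best

shot_a = [(0.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
shot_b = [(0.0, 0.0), (5.0, 5.0), (99.0, 99.0), (10.0, 10.0)]   # one extra frame
similarity = sequence_similarity(shot_a, shot_b)
```

In the toy call, the extra frame in `shot_b` is bridged as a gap, so the score is lower than for two identical sequences but well above that of unrelated ones.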

  • MMM (1) - A feature sequence kernel for video concept classification
    Lecture Notes in Computer Science, 2011
    Co-Authors: Werner Bailer
    Abstract:

    Kernel methods such as Support Vector Machines are widely applied to classification problems, including concept detection in video. Nonetheless, issues such as modeling specific distance functions of feature descriptors or the temporal sequence of features in the kernel have received comparatively little attention in multimedia research. We review work on kernels for commonly used MPEG-7 visual features and propose a kernel for matching temporal sequences of these features. The sequence kernel is based on ideas from string matching, but does not require discretization of the input feature vectors and deals with partial matches and gaps. Evaluation on the TRECVID 2007 high-level feature extraction data set shows that the sequence kernel clearly outperforms the radial basis function (RBF) kernel and the MPEG-7 visual feature kernels using only single key frames.

Wen-ting Cheah - One of the best experts on this subject based on the ideXlab platform.

  • An Automatic Assessment System for Alzheimer's Disease Based on Speech Using Feature Sequence Generator and Recurrent Neural Network.
    Scientific Reports, 2019
    Co-Authors: Yi-wei Chien, Sheng-yi Hong, Wen-ting Cheah, Li-hung Yao, Yu-ling Chang
    Abstract:

    Alzheimer's disease and other dementias have become the 7th leading cause of death worldwide. Because a cure is still lacking, early detection of the disease is crucial for providing the best intervention. To develop an assessment system for the general public, speech analysis is the optimal solution, since speech richly reflects the speaker's cognitive skills and data collection is relatively inexpensive compared with brain imaging, blood testing, etc. While most of the existing literature extracted statistics-based features and relied on a feature selection process, in this study we propose a novel feature sequence representation and use a data-driven approach, namely a recurrent neural network, to perform classification. The system is also shown to be fully automated, which implies that it can be deployed widely and easily. To validate our study, a series of experiments was conducted with 120 speech samples, and the score in terms of the area under the receiver operating characteristic curve is as high as 0.838.

  • An Assessment System for Alzheimer's Disease Based on Speech Using a Novel Feature Sequence Design and Recurrent Neural Network
    Systems Man and Cybernetics, 2018
    Co-Authors: Yi-wei Chien, Sheng-yi Hong, Wen-ting Cheah, Yu-ling Chang
    Abstract:

    Alzheimer's disease and other dementias have become the 7th leading cause of death worldwide. Because a cure is still lacking, early detection of the disease is crucial for providing the best intervention. To develop an assessment system for the general public, speech analysis is the optimal solution, since speech richly reflects the speaker's cognitive skills and data collection is relatively inexpensive. While most of the related studies extracted statistics-based features and relied on a feature selection process, in this paper we propose a novel feature sequence representation and use a recurrent neural network to perform classification. To validate our work, an experiment was conducted with 150 speech samples, and the score in terms of the area under the receiver operating characteristic curve is as high as 0.954, potentially outperforming the current state-of-the-art method.

  • SMC - An Assessment System for Alzheimer's Disease Based on Speech Using a Novel Feature Sequence Design and Recurrent Neural Network
    2018 IEEE International Conference on Systems Man and Cybernetics (SMC), 2018
    Co-Authors: Yi-wei Chien, Sheng-yi Hong, Wen-ting Cheah, Yu-ling Chang
    Abstract:

    Alzheimer's disease and other dementias have become the 7th leading cause of death worldwide. Because a cure is still lacking, early detection of the disease is crucial for providing the best intervention. To develop an assessment system for the general public, speech analysis is the optimal solution, since speech richly reflects the speaker's cognitive skills and data collection is relatively inexpensive. While most of the related studies extracted statistics-based features and relied on a feature selection process, in this paper we propose a novel feature sequence representation and use a recurrent neural network to perform classification. To validate our work, an experiment was conducted with 150 speech samples, and the score in terms of the area under the receiver operating characteristic curve is as high as 0.954, potentially outperforming the current state-of-the-art method.