Speech Analysis

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 240702 Experts worldwide ranked by ideXlab platform

Keiichi Funaki - One of the best experts on this subject based on the ideXlab platform.

  • EUSIPCO - TV-CAR Speech Analysis based on Regularized LP
    2019 27th European Signal Processing Conference (EUSIPCO), 2019
    Co-Authors: Keiichi Funaki
    Abstract:

    Linear Prediction (LP) Analysis is a Speech Analysis method that estimates AR (Auto-Regressive) coefficients representing the all-pole spectrum; besides Speech coding, it has recently been applied to Speech synthesis. We have proposed $l_2$-norm optimization-based TV-CAR (Time-Varying Complex AR) Speech Analysis for an analytic signal, using the MMSE (Minimum Mean Square Error) or ELS (Extended Least Square) method, and have applied it to Speech processing tasks such as robust ASR and F0 estimation of Speech. Separately, B. Kleijn et al. have proposed the Regularized Linear Prediction (RLP) method to suppress pitch-related bias, which is an overestimation of the first formant. In RLP, an $l_2$-norm regularization term, the norm of spectral changes across frequency, is introduced to suppress rapid spectral changes. RLP estimates the parameters by minimizing the $l_2$-norm criterion plus the $l_2$-norm regularization penalty term. In this paper, RLP-based TV-CAR Speech Analysis is proposed and evaluated on F0 estimation of Speech using IRAPT (Instantaneous RAPT) with the Keele Pitch Database under noisy conditions.

  • F0 contour estimation using ELS-based robust time-varying complex Speech Analysis
    2011 Digital Signal Processing and Signal Processing Education Meeting (DSP SPE), 2011
    Co-Authors: Keiichi Funaki
    Abstract:

    Robust F0 (fundamental frequency) estimation plays an important role in Speech processing. This paper proposes a simple F0 contour estimation algorithm based on robust TV-CAR Speech Analysis, in which the F0 contour is estimated by peak-picking on the time-varying spectrum estimated by ELS-based robust complex Speech Analysis of an analytic Speech signal. The experimental results demonstrate that the proposed method yields more accurate continuous F0 estimation than conventional methods for high-pitched Speech.

  • EUSIPCO - Speech enhancement based on iterative Wiener filter using complex Speech Analysis
    Recent Advances in Signal Processing, 2009
    Co-Authors: Keiichi Funaki
    Abstract:

    Recently, applications of Speech coding and Speech recognition have been growing rapidly, for example in cellular phones and car navigation systems. Since these are commonly used in noisy environments, a noise reduction method, i.e., Speech enhancement, is required as a pre-processor for Speech coding and recognition. The iterative Wiener filter (IWF) method has been adopted for Speech enhancement; it estimates the Speech and noise power spectra iteratively using LPC Analysis. In this paper, we propose an improved Wiener filter algorithm that introduces complex LPC Speech Analysis in place of the conventional LPC Analysis. Complex Speech Analysis can estimate the spectrum more accurately at low frequencies, so it is expected to improve the IWF, especially for babble noise or car internal noise, which contain much energy at low frequencies. An objective evaluation by means of spectral distance has been performed for Speech signals corrupted by white Gaussian noise, pink noise, babble noise, or car internal noise. The results demonstrate that the proposed method performs better than the conventional real-valued method for babble and car internal noise.

  • EUSIPCO - F0 estimation based on robust ELS complex Speech Analysis
    2008
    Co-Authors: Keiichi Funaki
    Abstract:

    A robust fundamental frequency (F0) estimation algorithm based on robust ELS (Extended Least Square) complex-valued Speech Analysis for an analytic Speech signal is proposed in this paper. The Speech spectrum can be accurately estimated at low frequencies since the analytic signal provides a spectrum only over positive frequencies. This remarkable feature makes it possible to realize more accurate F0 estimation using the complex residual extracted by the Speech Analysis. We have already proposed F0 estimation using the complex LPC residual, in which the autocorrelation function weighted by AMDF was adopted as the criterion. That method adopted an MMSE-based complex LPC Analysis, and it has been reported that it estimates F0 more accurately for IRS-filtered Speech corrupted by white Gaussian noise, although it does not perform better for IRS-filtered Speech corrupted by pink noise. In this paper, robust complex Speech Analysis based on the ELS method is introduced to solve this problem and is evaluated on a larger amount of Speech data. The experimental results for additive white Gaussian or pink noise demonstrate that the proposed algorithm based on the robust complex residual can perform better than other methods.

  • Speech enhancement based on iterative wiener filter using complex Speech Analysis
    European Signal Processing Conference, 2008
    Co-Authors: Keiichi Funaki
    Abstract:

    Recently, applications of Speech coding and Speech recognition have been growing rapidly, for example in cellular phones and car navigation systems. Since these are commonly used in noisy environments, a noise reduction method, i.e., Speech enhancement, is required as a pre-processor for Speech coding and recognition. The iterative Wiener filter (IWF) method has been adopted for Speech enhancement; it estimates the Speech and noise power spectra iteratively using LPC Analysis. In this paper, we propose an improved Wiener filter algorithm that introduces complex LPC Speech Analysis in place of the conventional LPC Analysis. Complex Speech Analysis can estimate the spectrum more accurately at low frequencies, so it is expected to improve the IWF, especially for babble noise or car internal noise, which contain much energy at low frequencies. An objective evaluation by means of spectral distance has been performed for Speech signals corrupted by white Gaussian noise, pink noise, babble noise, or car internal noise. The results demonstrate that the proposed method performs better than the conventional real-valued method for babble and car internal noise.
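
The regularized LP idea described in the first abstract of this list can be illustrated with a small sketch. It is not the authors' TV-CAR method: it is real-valued and time-invariant, and it assumes that the "norm of spectral changes across frequency" is the $l_2$ norm of dA/dω for A(ω) = 1 + Σ a_k e^{-jωk}, which by Parseval's theorem equals Σ k² a_k². The function names, the penalty weight `lam`, and the synthetic AR(2) test signal are all hypothetical choices made for illustration.

```python
import numpy as np


def regularized_lp(x, order, lam):
    """Covariance-style LP with an l2 penalty on spectral change.

    Model: x[n] ~ -sum_k a[k] * x[n-k], k = 1..order.
    Penalty: lam * sum_k k**2 * a[k]**2 (assumed form, see the text above).
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(order, len(x))
    # Each row holds the past samples x[n-1], ..., x[n-order].
    past = np.stack([x[n - k] for k in range(1, order + 1)], axis=1)
    target = x[n]
    # Normal equations of ||target + past @ a||^2 + lam * a^T D a.
    D = np.diag(np.arange(1, order + 1, dtype=float) ** 2)
    R = past.T @ past
    r = past.T @ target
    return np.linalg.solve(R + lam * D, -r)


def lp_spectrum_db(a, n_freq=256):
    """Log-magnitude of the all-pole spectrum 1/|A(w)| on [0, pi)."""
    w = np.linspace(0.0, np.pi, n_freq, endpoint=False)
    k = np.arange(1, len(a) + 1)
    A = 1.0 + np.exp(-1j * np.outer(w, k)) @ a
    return -20.0 * np.log10(np.abs(A))


# Synthetic check: a stable AR(2) process driven by white noise.
rng = np.random.default_rng(0)
true_a = np.array([-1.2, 0.8])
e = rng.standard_normal(4000)
x = np.zeros(4000)
for i in range(2, len(x)):
    x[i] = -true_a[0] * x[i - 1] - true_a[1] * x[i - 2] + e[i]

a_plain = regularized_lp(x, order=2, lam=0.0)   # ordinary LP
a_reg = regularized_lp(x, order=10, lam=1e3)    # smoothed, higher order
```

With `lam=0.0` the function reduces to ordinary covariance-method LP. Increasing `lam` shrinks the higher-lag coefficients most strongly, which smooths the estimated envelope returned by `lp_spectrum_db`.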

Tatsuhiko Kinjo - One of the best experts on this subject based on the ideXlab platform.

  • Robust F0 Estimation Using ELS-Based Robust Complex Speech Analysis
    IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences, 2008
    Co-Authors: Keiichi Funaki, Tatsuhiko Kinjo
    Abstract:

    Complex Speech Analysis for an analytic Speech signal can accurately estimate the spectrum at low frequencies since the analytic signal provides a spectrum only over positive frequencies. This remarkable feature makes it possible to realize more accurate F0 estimation using the complex residual signal extracted by complex-valued Speech Analysis. We have already proposed F0 estimation using the complex LPC residual, in which the autocorrelation function weighted by AMDF was adopted as the criterion. That method adopted MMSE-based complex LPC Analysis, and it has been reported that it estimates F0 more accurately for IRS-filtered Speech corrupted by white Gaussian noise, although it does not work better for IRS-filtered Speech corrupted by pink noise. In this paper, robust complex Speech Analysis based on the ELS (Extended Least Square) method is introduced to overcome this drawback. The experimental results for additive white Gaussian or pink noise demonstrate that the proposed algorithm based on robust ELS-based complex AR Analysis can perform better than other methods.

  • On HMM Speech Recognition Based on Complex Speech Analysis
    IECON 2006 - 32nd Annual Conference on IEEE Industrial Electronics, 2006
    Co-Authors: Tatsuhiko Kinjo, Keiichi Funaki
    Abstract:

    In Speech recognition, the LPC cepstrum based on LPC and MFCC based on a Mel-frequency filter bank are widely used for feature extraction, which determines recognition performance. However, neither is regarded as the best feature extraction method. In this paper, we introduce complex Speech Analysis for an analytic Speech signal to HMM Speech recognition. Complex Speech Analysis can estimate the Speech spectrum more accurately at low frequencies; as a result, it is expected to perform well as a feature extractor in Speech recognition. The MMSE-based time-varying complex AR Speech Analysis is adopted, and the estimated complex parameters are converted to LPCCs and MFCCs as feature vectors for HTK (HMM Tool Kit) in order to realize HMM Speech recognition. Continuous Speech recognition experiments with the converted LPCCs and MFCCs showed that the complex Speech Analysis method did not perform better than the real-valued one.

  • Robust F0 estimation based on complex Speech Analysis
    Journal of the Acoustical Society of America, 2006
    Co-Authors: Tatsuhiko Kinjo, Keiichi Funaki
    Abstract:

    This paper proposes a novel robust fundamental frequency (F0) estimation algorithm based on complex‐valued Speech Analysis for an analytic Speech signal. Since the analytic signal provides spectra only over positive frequencies, spectra can be estimated accurately at low frequencies. Consequently, it is considered that F0 estimation using the residual signal extracted by complex‐valued Speech Analysis can perform better for F0 estimation than that for the residual signal extracted by conventional real‐valued LPC Analysis. In this paper, the autocorrelation function (AUTOC) weighted by a reciprocal of the AMDF is also adopted for the F0 estimation criterion. The proposed F0 estimation algorithm is evaluated using three criteria, AUTOC, AMDF, and the weighted AUTOC, with complex‐valued residual extracted by complex‐valued Speech Analysis. We also compared the proposed method with those for three signals: the Speech signal, analytic Speech signal, and LPC residual. Speech signals used in the experiments were...

  • F0 estimation of noisy Speech based on complex Speech Analysis
    2006 IEEE 12th Digital Signal Processing Workshop & 4th IEEE Signal Processing Education Workshop, 2006
    Co-Authors: Tatsuhiko Kinjo, Keiichi Funaki
    Abstract:

    This paper proposes a novel robust fundamental frequency (F0) estimation algorithm based on complex-valued Speech Analysis for an analytic Speech signal. Since the analytic signal provides spectra only over positive frequencies, spectra can be accurately estimated at low frequencies. Consequently, F0 estimation using the residual signal extracted by complex-valued Speech Analysis is expected to perform better than F0 estimation using the residual signal extracted by conventional real-valued LPC Analysis. In this paper, the autocorrelation function weighted by AMDF is adopted as the F0 estimation criterion, and four signals are evaluated for F0 estimation: the Speech signal, the analytic Speech signal, the LPC residual, and the complex LPC residual. Speech signals used in the experiments were corrupted by adding white Gaussian noise at levels of 10, 5, 0, and -5 dB. The experimental results demonstrate that the proposed algorithm based on complex Speech Analysis can perform better than other methods in an extremely noisy environment.
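
The F0 criterion named in the abstracts above, the autocorrelation function weighted by the reciprocal of the AMDF, can be sketched as follows. This is only an illustration of the criterion. It is applied here to a raw synthetic waveform, not to the complex LPC residual used in the papers, and the function name, the F0 search range, and the stabilizing constant `eps` are assumptions.

```python
import numpy as np


def f0_weighted_autocorr(frame, fs, f0_min=60.0, f0_max=400.0, eps=0.1):
    """Estimate F0 by maximizing ACF(lag) / (AMDF(lag) + eps) over lags."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    n_total = len(frame)
    lags = np.arange(int(fs / f0_max), int(fs / f0_min) + 1)
    score = np.empty(len(lags))
    for i, lag in enumerate(lags):
        head, tail = frame[:n_total - lag], frame[lag:]
        acf = np.dot(head, tail) / n_total     # biased ACF, tapers with lag
        amdf = np.mean(np.abs(head - tail))    # small at multiples of the period
        score[i] = acf / (amdf + eps)
    return fs / lags[np.argmax(score)]


# Synthetic voiced-like frame: 200 Hz fundamental plus one harmonic.
fs = 8000
t = np.arange(400) / fs
clean = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)
rng = np.random.default_rng(1)
noisy = clean + 0.05 * rng.standard_normal(len(t))

f0_clean = f0_weighted_autocorr(clean, fs)
f0_noisy = f0_weighted_autocorr(noisy, fs)
```

The AMDF dips where the autocorrelation peaks, so dividing one by the other sharpens the peak at the pitch period. The biased autocorrelation is used so that the score favors the first period over its multiples.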

Tetsuya Shimamura - One of the best experts on this subject based on the ideXlab platform.

  • Least squares method for accurate Speech Analysis
    10th International Conference on Telecommunications 2003. ICT 2003., 2003
    Co-Authors: Tetsuya Shimamura
    Abstract:

    This paper proposes a strategy to improve the performance of the linear prediction method used for Speech Analysis. The least squares (LS) method, one of the system identification methods, is utilized to estimate the coefficients of the all-pole filter and the gain parameter. For the input estimation, a simple and effective method based on thresholding is proposed. It is shown that the LS method performs well independently of the pitch period of the Speech.

  • Noise‐robust Speech Analysis using system identification methods
    Electronics and Communications in Japan (Part III: Fundamental Electronic Science), 2002
    Co-Authors: Yuki Arima, Tetsuya Shimamura
    Abstract:

    This paper proposes a modified linear prediction method for Speech Analysis that uses two system identification methods, the least-squares method and the instrumental variable method, to estimate the coefficients of an all-pole filter. Whereas the linear prediction method estimates the coefficients of the all-pole filter from the Speech signal alone, i.e., the observed output signal, the system identification methods estimate them from both the observed output signal and the input signal. This paper derives a novel technique that estimates the input signal from the observed Speech signal with high accuracy and robustness to additive noise by generating an improved prediction error signal. The paper also shows that, when voiced Speech is analyzed, if the input signal, which is an impulse train, can be accurately estimated, then the least-squares method yields highly accurate filter coefficients, and the dependency on pitch period is thereby removed. We also show that applying the instrumental variable method with an auxiliary model substantially improves the accuracy of the filter coefficient estimates in a noisy environment while maintaining the properties of the least-squares method. The effectiveness of these system identification methods for Speech Analysis is demonstrated through computer simulations. © 2002 Wiley Periodicals, Inc. Electron Comm Jpn Pt 3, 86(3): 20–32, 2003; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ecjc.1137
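
The difference between ordinary linear prediction and least-squares system identification described in the abstracts above can be sketched on synthetic data. The sketch assumes the impulse-train input is already known. The papers instead estimate it from the prediction error, a step that is not reproduced here. The function names, the AR(2) filter, the gain, and the pitch period are hypothetical.

```python
import numpy as np


def ls_all_pole(y, u, order):
    """Least-squares fit of y[n] = -sum_k a[k] y[n-k] + G u[n] from y and u."""
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    n = np.arange(order, len(y))
    cols = [-y[n - k] for k in range(1, order + 1)] + [u[n]]
    Phi = np.stack(cols, axis=1)
    theta, *_ = np.linalg.lstsq(Phi, y[n], rcond=None)
    return theta[:order], theta[order]   # coefficients a, gain G


def lp_all_pole(y, order):
    """Ordinary covariance-method LP: same model, but the input is ignored."""
    y = np.asarray(y, dtype=float)
    n = np.arange(order, len(y))
    Phi = np.stack([-y[n - k] for k in range(1, order + 1)], axis=1)
    a, *_ = np.linalg.lstsq(Phi, y[n], rcond=None)
    return a


# Voiced-like synthetic signal: impulse train through a stable all-pole filter.
true_a, true_gain, period = np.array([-1.2, 0.8]), 2.0, 50
u = np.zeros(1000)
u[::period] = 1.0
y = np.zeros_like(u)
for i in range(len(y)):
    past1 = y[i - 1] if i >= 1 else 0.0
    past2 = y[i - 2] if i >= 2 else 0.0
    y[i] = -true_a[0] * past1 - true_a[1] * past2 + true_gain * u[i]

a_ls, gain_ls = ls_all_pole(y, u, order=2)
a_lp = lp_all_pole(y, order=2)
err_ls = np.max(np.abs(a_ls - true_a))
err_lp = np.max(np.abs(a_lp - true_a))
```

Because the impulses are explained by the input column, the least-squares fit recovers the filter on this noise-free example, while plain LP must absorb each impulse into its prediction error and its estimate shifts with the pitch period.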

Nobuo Ohtsuki - One of the best experts on this subject based on the ideXlab platform.

  • A robust Speech Analysis in Speech recognition
    WCC 2000 - ICSP 2000. 2000 5th International Conference on Signal Processing Proceedings. 16th World Computer Congress 2000, 2000
    Co-Authors: Yoshikazu Miyanaga, S. Gozen, Nobuo Ohtsuki
    Abstract:

    This report presents an adaptive Speech Analysis method designed specifically for a Speech recognition system. The designed Speech recognition system consists of an adaptive Speech Analysis, a self-organized clustering/pseudo-labeling method, and DTW. All methods are redesigned as fully parallel and pipelined mechanisms. In the Speech Analysis method, adaptive ARMA lattice modelling is introduced to reduce distortion, noise, and disturbance. In addition, the Speech Analysis remains robust, even though adaptive methods are usually considered sensitive with respect to convergence and parameter estimation. Speech recognition results obtained with the recognition system, including the robust adaptive Speech analyzer, are shown.