Source Signal

The experts below are selected from a list of 18,807 experts worldwide, ranked by the ideXlab platform.

Jin Ho Choi - One of the best experts on this subject based on the ideXlab platform.

  • Underdetermined High-Resolution DOA Estimation: A 2ρth-Order Source-Signal/Noise Subspace Constrained Optimization
    IEEE Transactions on Signal Processing, 2015
    Co-Authors: Jin Ho Choi
    Abstract:

    For estimating the directions of arrival (DOAs) of non-stationary source signals such as speech and audio, a constrained optimization problem (COP) that exploits the spatial diversity provided by an array of sensors is formulated in terms of a noise-eliminated local 2ρth-order cumulant matrix. The COP solution provides a weight vector for the look direction such that it is constrained to the 2ρth-order source-signal subspace when the look direction is aligned with the true DOA; otherwise, it is constrained to the 2ρth-order noise subspace. This weight vector is incorporated into the spatial spectrum to determine the degree of orthogonality between itself and either the 2ρth-order source-signal subspace when the number of sources is unknown, or the 2ρth-order noise subspace when the number of sources is known. For a uniform linear array (ULA) of M sensors, the spatial spectrum for a known number of sources can theoretically be shown to identify up to 2ρ(M-1) sources. Because stationarity is difficult to identify in the received sensor signals, the estimate of the noise-eliminated local 2ρth-order cumulant matrix is marginalized over various possible stationary segmentations for more robust DOA estimation. In this paper, we focus on the use of local second- and fourth-order cumulants (ρ = 1, 2); the proposed algorithms with ρ = 1 outperformed the KR subspace-based algorithms and 4-MUSIC on globally non-stationary, non-Gaussian synthetic data and on speech/audio in various adverse environments. We verified that, with a ULA, the identifiability for ρ = 2 is improved twofold compared to that for ρ = 1.
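
The identifiability bound quoted in the abstract, 2ρ(M-1) sources for a ULA of M sensors, can be checked with a few lines of arithmetic. The sketch below only evaluates that formula; the function name and the choice of M = 4 are illustrative assumptions and do not come from the paper.

```python
def max_identifiable_sources(num_sensors: int, rho: int) -> int:
    """Evaluate the bound 2*rho*(M-1) stated in the abstract for a ULA of M sensors.

    The function name and argument names are hypothetical; only the formula
    itself is taken from the abstract.
    """
    if num_sensors < 2:
        raise ValueError("a ULA needs at least two sensors")
    if rho < 1:
        raise ValueError("rho must be a positive integer")
    return 2 * rho * (num_sensors - 1)


if __name__ == "__main__":
    sensors = 4  # illustrative array size, not a value from the paper
    for rho in (1, 2):
        bound = max_identifiable_sources(sensors, rho)
        print(f"rho={rho}: up to {bound} sources with M={sensors} sensors")
```

Because the bound is linear in ρ, moving from second-order (ρ = 1) to fourth-order (ρ = 2) cumulants doubles it for any fixed M, which matches the twofold improvement reported at the end of the abstract.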

Patrick A Naylor - One of the best experts on this subject based on the ideXlab platform.

  • Data-driven voice source waveform analysis and synthesis
    Speech Communication, 2012
    Co-Authors: Jon Gudnason, Mark R P Thomas, Daniel P W Ellis, Patrick A Naylor
    Abstract:

    A data-driven approach is introduced for studying, analyzing and processing the voice source signal. Existing approaches parameterize the voice source signal using models that are motivated, for example, by a physical model or function-fitting. Such parameterization is often difficult to achieve, and it produces a poor approximation to a large variety of real voice source waveforms of the human voice. This paper presents a novel data-driven approach to analyze different types of voice source waveforms using principal component analysis and Gaussian mixture modeling. This approach models certain voice source features that many other approaches fail to model. Prototype voice source waveforms are obtained from each mixture component and analyzed with respect to speaker, phone and pitch. An analysis/synthesis scheme was set up to demonstrate the effectiveness of the method. Compression of the proposed voice source by discarding 75% of the features yields a segmental signal-to-reconstruction error ratio of 13 dB and a Bark spectral distortion of 0.14.
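
The abstract reports a segmental signal-to-reconstruction error ratio but does not define it. The sketch below assumes one common definition: the ratio of signal energy to error energy, in dB, computed per fixed-length frame and then averaged. The function name, the frame handling, and the toy data are assumptions for illustration, not the paper's procedure.

```python
import math


def segmental_srer_db(original, reconstructed, frame_len):
    """Average per-frame 10*log10(signal energy / reconstruction error energy).

    Assumed definition; frames with zero signal energy or zero error are
    skipped because their ratio in dB is undefined.
    """
    if len(original) != len(reconstructed):
        raise ValueError("signals must have the same length")
    if frame_len < 1:
        raise ValueError("frame_len must be positive")
    ratios_db = []
    # Walk over non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(original) - frame_len + 1, frame_len):
        x = original[start:start + frame_len]
        y = reconstructed[start:start + frame_len]
        signal_energy = sum(v * v for v in x)
        error_energy = sum((a - b) ** 2 for a, b in zip(x, y))
        if signal_energy == 0.0 or error_energy == 0.0:
            continue
        ratios_db.append(10.0 * math.log10(signal_energy / error_energy))
    if not ratios_db:
        raise ValueError("no frame with a defined ratio")
    return sum(ratios_db) / len(ratios_db)


if __name__ == "__main__":
    # Toy data: a constant signal reconstructed with a 10% amplitude error.
    clean = [1.0] * 8
    approx = [0.9] * 8
    print(round(segmental_srer_db(clean, approx, frame_len=4), 3))
```

Under this definition, a 10% amplitude error on every sample gives an energy ratio of about 100, or roughly 20 dB, so the 13 dB figure in the abstract corresponds to a somewhat larger relative error after 75% of the features are discarded.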

Paavo Alku - One of the best experts on this subject based on the ideXlab platform.

  • Utilizing glottal source pulse library for generating improved excitation signal for HMM-based speech synthesis
    2011 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2011
    Co-Authors: Tuomo Raitio, Antti Suni, Hannu Pulakka, Martti Vainio, Paavo Alku
    Abstract:

    This paper describes a source modeling method for hidden Markov model (HMM) based speech synthesis for improved naturalness. A speech corpus is first decomposed into the glottal source signal and the model of the vocal tract filter using glottal inverse filtering, and parametrized into excitation and spectral features. Additionally, a library of glottal source pulses is extracted from the estimated voice source signal. In the synthesis stage, the excitation signal is generated by selecting appropriate pulses from the library according to the target cost of the excitation features and a concatenation cost between adjacent glottal source pulses. Finally, speech is synthesized by filtering the excitation signal with the vocal tract filter. Experiments show that the naturalness of the synthetic speech is equal or better, and speaker similarity is better, compared to a system using only a single glottal source pulse.
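
Selecting pulses by a target cost plus a concatenation cost between neighbours is a shortest-path problem that dynamic programming (a Viterbi-style search) can solve. The sketch below is a minimal illustration under strong assumptions: each library pulse is reduced to a single scalar feature, both costs are absolute differences, and `concat_weight` is a made-up tuning parameter. The abstract does not specify the paper's actual features, cost functions, or search procedure.

```python
def select_pulses(targets, library, concat_weight=1.0):
    """Return library indices minimizing total target + concatenation cost.

    targets: one desired feature value per output pulse position (assumed scalar).
    library: one feature value per stored pulse (assumed scalar).
    """
    if not targets or not library:
        raise ValueError("targets and library must be non-empty")

    def target_cost(target, pulse):
        return abs(target - pulse)

    def concat_cost(prev_pulse, pulse):
        return abs(prev_pulse - pulse)

    k = len(library)
    # cost[j] = cheapest total cost of any path ending in library pulse j.
    cost = [target_cost(targets[0], p) for p in library]
    backpointers = []
    for target in targets[1:]:
        new_cost, pointers = [], []
        for pulse in library:
            best_prev = min(
                range(k),
                key=lambda i: cost[i] + concat_weight * concat_cost(library[i], pulse),
            )
            step = concat_weight * concat_cost(library[best_prev], pulse)
            new_cost.append(cost[best_prev] + step + target_cost(target, pulse))
            pointers.append(best_prev)
        cost = new_cost
        backpointers.append(pointers)

    # Trace the cheapest path back from the best final pulse.
    j = min(range(k), key=lambda i: cost[i])
    path = [j]
    for pointers in reversed(backpointers):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path


if __name__ == "__main__":
    pulse_features = [100.0, 120.0, 200.0]  # hypothetical library features
    wanted = [101.0, 119.0, 199.0]          # hypothetical target features
    print(select_pulses(wanted, pulse_features, concat_weight=0.0))
    print(select_pulses(wanted, pulse_features, concat_weight=100.0))
```

With `concat_weight=0.0` the search simply picks the closest pulse at each position; with a large weight it prefers to repeat one pulse so that adjacent pulses stay similar, which is the trade-off the two costs are meant to balance.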

Jon Gudnason - One of the best experts on this subject based on the ideXlab platform.

  • Data-driven voice source waveform analysis and synthesis
    Speech Communication, 2012
    Co-Authors: Jon Gudnason, Mark R P Thomas, Daniel P W Ellis, Patrick A Naylor
    Abstract:

    A data-driven approach is introduced for studying, analyzing and processing the voice source signal. Existing approaches parameterize the voice source signal using models that are motivated, for example, by a physical model or function-fitting. Such parameterization is often difficult to achieve, and it produces a poor approximation to a large variety of real voice source waveforms of the human voice. This paper presents a novel data-driven approach to analyze different types of voice source waveforms using principal component analysis and Gaussian mixture modeling. This approach models certain voice source features that many other approaches fail to model. Prototype voice source waveforms are obtained from each mixture component and analyzed with respect to speaker, phone and pitch. An analysis/synthesis scheme was set up to demonstrate the effectiveness of the method. Compression of the proposed voice source by discarding 75% of the features yields a segmental signal-to-reconstruction error ratio of 13 dB and a Bark spectral distortion of 0.14.

Tuomo Raitio - One of the best experts on this subject based on the ideXlab platform.

  • Utilizing glottal source pulse library for generating improved excitation signal for HMM-based speech synthesis
    2011 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2011
    Co-Authors: Tuomo Raitio, Antti Suni, Hannu Pulakka, Martti Vainio, Paavo Alku
    Abstract:

    This paper describes a source modeling method for hidden Markov model (HMM) based speech synthesis for improved naturalness. A speech corpus is first decomposed into the glottal source signal and the model of the vocal tract filter using glottal inverse filtering, and parametrized into excitation and spectral features. Additionally, a library of glottal source pulses is extracted from the estimated voice source signal. In the synthesis stage, the excitation signal is generated by selecting appropriate pulses from the library according to the target cost of the excitation features and a concatenation cost between adjacent glottal source pulses. Finally, speech is synthesized by filtering the excitation signal with the vocal tract filter. Experiments show that the naturalness of the synthetic speech is equal or better, and speaker similarity is better, compared to a system using only a single glottal source pulse.
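
The final synthesis step, filtering an excitation signal with a vocal tract filter, can be illustrated with a direct-form all-pole recursion. This sketch assumes the vocal tract is modeled as an all-pole filter, which is a common choice but is not stated in the abstract; the coefficient value and the impulse excitation are made-up toy inputs.

```python
def all_pole_filter(excitation, a):
    """Apply y[n] = x[n] - sum_{k=1..p} a[k-1] * y[n-k] to the excitation.

    excitation: input samples x[n].
    a: denominator coefficients a_1..a_p of an assumed all-pole filter
       1 / (1 + a_1 z^-1 + ... + a_p z^-p).
    """
    output = []
    for n, x in enumerate(excitation):
        acc = x
        # Subtract the feedback from previously computed output samples.
        for k, coef in enumerate(a, start=1):
            if n - k >= 0:
                acc -= coef * output[n - k]
        output.append(acc)
    return output


if __name__ == "__main__":
    impulse = [1.0, 0.0, 0.0, 0.0]    # toy excitation: a single pulse
    coefficients = [-0.5]             # hypothetical one-pole filter
    print(all_pole_filter(impulse, coefficients))
```

In a full system the excitation would be the concatenated glottal pulses and the coefficients would come from the spectral features, updated frame by frame; the fixed one-pole filter here only shows the recursion itself.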