Voice Parameter

The Experts below are selected from a list of 6315 Experts worldwide ranked by ideXlab platform

Andrey Talanov - One of the best experts on this subject based on the ideXlab platform.

  • improving speech synthesis quality for Voices created from an audiobook database
    International Conference on Speech and Computer, 2014
    Co-Authors: Pavel Chistikov, Dmitriy Zakharov, Andrey Talanov
    Abstract:

    This paper describes an approach to improving synthesized speech quality for Voices created by using an audiobook database. The data consist of a large amount of read speech by one speaker, which we matched with the corresponding book texts. The main problems with such a database are the following. First, the recordings were made at different times under different acoustic conditions, and the speaker reads the text with a variety of intonations and accents, which leads to very high Voice Parameter variability. Second, automatic techniques for sound file labeling make more errors due to the large variability of the database, especially as there can be mismatches between the text and the corresponding sound files. These problems dramatically affect speech synthesis quality, so a robust method for solving them is vital for Voices created using audiobooks. The approach described in the paper is based on statistical models of Voice Parameters and special algorithms of speech element concatenation and modification. Listening tests show that it strongly improves synthesized speech quality.

Pavel Chistikov - One of the best experts on this subject based on the ideXlab platform.

  • improving speech synthesis quality for Voices created from an audiobook database
    International Conference on Speech and Computer, 2014
    Co-Authors: Pavel Chistikov, Dmitriy Zakharov, Andrey Talanov
    Abstract:

    This paper describes an approach to improving synthesized speech quality for Voices created by using an audiobook database. The data consist of a large amount of read speech by one speaker, which we matched with the corresponding book texts. The main problems with such a database are the following. First, the recordings were made at different times under different acoustic conditions, and the speaker reads the text with a variety of intonations and accents, which leads to very high Voice Parameter variability. Second, automatic techniques for sound file labeling make more errors due to the large variability of the database, especially as there can be mismatches between the text and the corresponding sound files. These problems dramatically affect speech synthesis quality, so a robust method for solving them is vital for Voices created using audiobooks. The approach described in the paper is based on statistical models of Voice Parameters and special algorithms of speech element concatenation and modification. Listening tests show that it strongly improves synthesized speech quality.

Bouvet Anne - One of the best experts on this subject based on the ideXlab platform.

  • Experimental and theoretical contribution to the analysis and the modelling of the vocal folds vibration.
    2019
    Co-Authors: Bouvet Anne
    Abstract:

    The human Voice is produced by vocal fold auto-oscillation, which results from the interaction between the airflow coming from the lungs and the elastic structure of the vocal folds. The purpose of this thesis is to carry out an experimental and theoretical study to improve the understanding and modelling of this phenomenon and some of its perturbations. Firstly, the MSePGG algorithm is proposed for the calibration of a non-invasive device for in vivo glottal area measurements. The algorithm is validated on mechanical replicas and illustrated with measurements on human speakers. Secondly, the vocal folds are covered by a thin layer of liquid, which is essential for phonation. An experimental approach is proposed to systematically study the influence of this liquid on vocal fold replicas. Water spraying is shown to affect basic Voice Parameters as well as their perturbations. A simplified theoretical flow model accounting for the presence of both air and water is proposed and validated. Thirdly, the effect of vertical angular asymmetry of the vocal folds, as occurs in unilateral vocal fold paralysis, on the fluid-structure interaction is assessed experimentally. Loss of full vocal fold contact is found to cause substantial changes in phonation features and in their variability. A simple theoretical model is shown to fit the increase in the auto-oscillation onset threshold pressure. For future clinical applications, the results suggest further development of the MSePGG device and illustrate the multiple potential causes of Voice perturbation.

  • Contribution expérimentale et théorique à l'analyse et la modélisation de la vibration des cordes vocales
    HAL CCSD, 2019
    Co-Authors: Bouvet Anne
    Abstract:

    The human Voice is produced by vocal fold auto-oscillation, which results from the interaction between the airflow coming from the lungs and the elastic structure of the vocal folds. The purpose of this thesis is to carry out an experimental and theoretical study to improve the understanding and modelling of this phenomenon and some of its perturbations. Firstly, the MSePGG algorithm is proposed for the calibration of a non-invasive device for in vivo glottal area measurements. The algorithm is validated on mechanical replicas and illustrated with measurements on human speakers. Secondly, the vocal folds are covered by a thin layer of liquid, which is essential for phonation. An experimental approach is proposed to systematically study the influence of this liquid on vocal fold replicas. Water spraying is shown to affect basic Voice Parameters as well as their perturbations. A simplified theoretical flow model accounting for the presence of both air and water is proposed and validated. Thirdly, the effect of vertical angular asymmetry of the vocal folds, as occurs in unilateral vocal fold paralysis, on the fluid-structure interaction is assessed experimentally. Loss of full vocal fold contact is found to cause substantial changes in phonation features and in their variability. A simple theoretical model is shown to fit the increase in the auto-oscillation onset threshold pressure. For future clinical applications, the results suggest further development of the MSePGG device and illustrate the multiple potential causes of Voice perturbation.

Dmitriy Zakharov - One of the best experts on this subject based on the ideXlab platform.

  • improving speech synthesis quality for Voices created from an audiobook database
    International Conference on Speech and Computer, 2014
    Co-Authors: Pavel Chistikov, Dmitriy Zakharov, Andrey Talanov
    Abstract:

    This paper describes an approach to improving synthesized speech quality for Voices created by using an audiobook database. The data consist of a large amount of read speech by one speaker, which we matched with the corresponding book texts. The main problems with such a database are the following. First, the recordings were made at different times under different acoustic conditions, and the speaker reads the text with a variety of intonations and accents, which leads to very high Voice Parameter variability. Second, automatic techniques for sound file labeling make more errors due to the large variability of the database, especially as there can be mismatches between the text and the corresponding sound files. These problems dramatically affect speech synthesis quality, so a robust method for solving them is vital for Voices created using audiobooks. The approach described in the paper is based on statistical models of Voice Parameters and special algorithms of speech element concatenation and modification. Listening tests show that it strongly improves synthesized speech quality.

Yuvraj Sharma - One of the best experts on this subject based on the ideXlab platform.

  • Discrimination of People with Parkinson (PWP) Disease on the basis of Voice Parameter Analysis
    International Journal of Computer Applications, 2014
    Co-Authors: Dixit Dixit, Vikas Mittal, Yuvraj Sharma
    Abstract:

    Voice is the essential medium of human communication in social as well as professional interactions. The human Voice also reflects the state of health in many medical conditions, which lead to Voice alterations in patients. This paper presents a Voice analysis approach for discriminating People With Parkinson (PWP) on the basis of extracted Voice Parameters. Voice analysis essentially decomposes the Voice signal into Voice Parameters so that the resulting features can be processed for the desired application. The features extracted in this paper are: frequency, pitch, Voice intensity, formant, speech rate, and pulse functions such as Jitter (local), Jitter (local, absolute), Jitter (rap), Jitter (ppq5), Jitter (ddp), Shimmer (local), Shimmer (local, dB), Shimmer (apq3), Shimmer (apq5), Shimmer (apq11), Shimmer (dda), and Harmonic coefficients. Keywords: analysis technique, PWP, prosody features, Voice dysphonia, Voice Parameters, Hypokinetic dysarthria

  • Voice Parameter Analysis for the disease detection
    IOSR Journal of Electronics and Communication Engineering, 2014
    Co-Authors: Dixit Dixit, Vikas Mittal, Yuvraj Sharma
    Abstract:

    The analysis of the human Voice has emerged as an important area of study because of its applications in medical and engineering sciences. Voice analysis essentially involves extracting Parameters from the Voice signal, using suitable techniques, so that the Voice can be processed for the desired application. This paper describes certain common medical conditions that affect patients' Voice patterns, with reference to leading research studies that have verified Voice alterations as a diagnostic symptom of the respective conditions. A comparative study of Voice analysis techniques is also presented, with special emphasis on prominent, commercially available biomedical tools that are fundamentally based on Voice analysis technology. Recent advances in the field of Voice analysis systems are also covered.
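
    Several of the pulse features named in the Parkinson abstract above (Jitter and Shimmer variants) have compact definitions. The sketch below is not the authors' implementation. It assumes the definitions commonly documented for Praat-style voice reports, and it assumes the glottal period durations and per-period peak amplitudes have already been extracted from the signal. All function names are hypothetical.

```python
import math
from statistics import mean


def jitter_local_absolute(periods):
    """Mean absolute difference between consecutive periods (in seconds)."""
    return mean(abs(b - a) for a, b in zip(periods, periods[1:]))


def jitter_local(periods):
    """Jitter (local): absolute jitter divided by the mean period."""
    return jitter_local_absolute(periods) / mean(periods)


def jitter_rap(periods):
    """Jitter (rap): mean deviation of each period from the average of
    itself and its two neighbours, divided by the mean period."""
    deviations = [
        abs(periods[i] - (periods[i - 1] + periods[i] + periods[i + 1]) / 3)
        for i in range(1, len(periods) - 1)
    ]
    return mean(deviations) / mean(periods)


def jitter_ddp(periods):
    """Jitter (ddp): algebraically three times Jitter (rap)."""
    return 3 * jitter_rap(periods)


def shimmer_local(amplitudes):
    """Shimmer (local): mean absolute difference between consecutive
    peak amplitudes, divided by the mean amplitude."""
    diffs = [abs(b - a) for a, b in zip(amplitudes, amplitudes[1:])]
    return mean(diffs) / mean(amplitudes)


def shimmer_local_db(amplitudes):
    """Shimmer (local, dB): mean absolute base-10 log ratio of
    consecutive amplitudes, multiplied by 20."""
    return mean(
        abs(20 * math.log10(b / a)) for a, b in zip(amplitudes, amplitudes[1:])
    )


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    example_periods = [0.0100, 0.0110, 0.0100, 0.0110]
    example_amplitudes = [1.0, 1.1, 1.0, 1.1]
    print("Jitter (local):", jitter_local(example_periods))
    print("Jitter (rap):", jitter_rap(example_periods))
    print("Shimmer (local):", shimmer_local(example_amplitudes))
```

    Each measure needs at least two periods (three for rap and ddp), and a perfectly regular voice gives zero for all of them. The five- and eleven-point variants (ppq5, apq5, apq11) follow the same pattern with wider averaging windows.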