Telephone Conversation

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The experts below are selected from a list of 321 experts worldwide, ranked by the ideXlab platform.

Renato De Mori - One of the best experts on this subject based on the ideXlab platform.

  • Denoised Bottleneck Features from Deep Autoencoders for Telephone Conversation Analysis
    IEEE Transactions on Audio Speech and Language Processing, 2017
    Co-Authors: Killian Janod, Mohamed Morchid, Richard Dufour, Georges Linarès, Renato De Mori
    Abstract:

    Automatic transcription of spoken documents suffers from recognition errors that are especially frequent when speech is acquired in severely noisy conditions. Automatic speech recognition errors induce errors in the linguistic features used for a variety of natural language processing tasks. Recently, denoising autoencoders (DAE) and stacked autoencoders (SAE) have been proposed, with interesting results, for acoustic feature denoising tasks. This paper deals with the recovery of corrupted linguistic features in spoken documents. Solutions based on DAEs and SAEs are considered and evaluated in a spoken conversation analysis task. To improve conversation theme classification accuracy, the possibility of combining abstractions obtained from manual and automatic transcription features is considered. As a result, two original representations of highly imperfect spoken documents are introduced. They are based on bottleneck features of a supervised autoencoder that takes advantage of both noisy and clean transcriptions to improve the robustness of error-prone representations. Experimental results on a spoken conversation theme identification task show substantial accuracy improvements obtained with the proposed recovery of corrupted features.

  • Improving dialogue classification using a topic space representation and a Gaussian classifier based on the decision rule
    2014 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2014
    Co-Authors: Mohamed Morchid, Richard Dufour, Georges Linarès, Mohamed Bouallegue, Pierre-Michel Bousquet, Renato De Mori
    Abstract:

    In this paper, we study the impact of dialogue representations and classification methods on the task of theme identification in telephone conversation services with highly imperfect automatic transcriptions. Two dialogue representations are first compared: the classical Term Frequency-Inverse Document Frequency with Gini purity criteria (TF-IDF-Gini) method and the Latent Dirichlet Allocation (LDA) approach. We then study an original classification method that takes advantage of the LDA topic space representation, which proved to be the better dialogue representation. Two assumptions about the topic representation led us to choose a Gaussian process (GP) based method. This approach is compared with a Support Vector Machine (SVM) classification method. Results show that the GP approach copes better with the multiple-theme complexity of a dialogue, regardless of the condition studied (manual or automatic transcriptions). Finally, we discuss the impact of topic space reduction on classification accuracy.

  • I-vector based Representation of Highly Imperfect Automatic Transcriptions
    2014
    Co-Authors: Mohamed Morchid, Richard Dufour, Georges Linarès, Mohamed Bouallegue, Driss Matrouf, Renato De Mori
    Abstract:

    The performance of Automatic Speech Recognition (ASR) systems drops dramatically when they are used in noisy environments. Speech analytics suffer from the resulting poor quality of automatic transcriptions. In this paper, we seek to identify themes from dialogues of telephone conversation services using multiple topic spaces estimated with a Latent Dirichlet Allocation (LDA) approach. This technique consists of estimating several topic models that offer different views of the document. Unfortunately, such a multi-model approach also introduces additional variabilities due to the model diversity. We propose to extract the useful information from the full model set by using an i-vector based approach, previously developed in the context of speaker recognition. Experiments are conducted on the DECODA corpus, which contains records from the call center of the Paris Transportation Company. Results show the effectiveness of the proposed representation paradigm: our identification system reaches an accuracy of 84.7%, a gain of 3.3 points over the baseline.
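
The title of the ICASSP 2014 entry above mentions "a Gaussian classifier based on the decision rule" applied to topic-space vectors. One plausible, generic reading of that idea is to fit one Gaussian per theme and pick the theme with the highest posterior score (the Bayes decision rule). The sketch below illustrates only that generic idea; it is not the authors' implementation. The diagonal covariance, the variance floor, the theme names and the toy topic vectors are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def fit_gaussians(vectors, labels, var_floor=1e-3):
    """Fit a diagonal Gaussian (mean, variance) and a prior for each theme."""
    by_theme = defaultdict(list)
    for vec, theme in zip(vectors, labels):
        by_theme[theme].append(vec)
    models = {}
    for theme, rows in by_theme.items():
        n = len(rows)
        dims = range(len(rows[0]))
        mean = [sum(r[d] for r in rows) / n for d in dims]
        # Floor the variance so that a constant dimension does not give log(0).
        var = [max(sum((r[d] - mean[d]) ** 2 for r in rows) / n, var_floor)
               for d in dims]
        models[theme] = (mean, var, n / len(vectors))
    return models


def log_score(vec, model):
    """Log prior plus Gaussian log-likelihood of `vec` under one theme model."""
    mean, var, prior = model
    log_lik = sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
                  for x, m, v in zip(vec, mean, var))
    return math.log(prior) + log_lik


def classify(vec, models):
    """Bayes decision rule: return the theme with the highest score."""
    return max(models, key=lambda theme: log_score(vec, models[theme]))


# Hypothetical 3-topic vectors (for example, LDA topic proportions per dialogue).
train = [[0.80, 0.10, 0.10], [0.70, 0.20, 0.10], [0.75, 0.15, 0.10],
         [0.10, 0.10, 0.80], [0.20, 0.10, 0.70], [0.15, 0.05, 0.80]]
themes = ["itinerary"] * 3 + ["lost_item"] * 3

models = fit_gaussians(train, themes)
print(classify([0.72, 0.18, 0.10], models))
print(classify([0.12, 0.08, 0.80], models))
```

In practice the input vectors would come from a trained topic model rather than being typed in by hand, and a full covariance matrix may be preferable when enough dialogues per theme are available.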

Shaoping Ma - One of the best experts on this subject based on the ideXlab platform.

  • Multi-grained role labeling based on multi-modality information for real customer service Telephone Conversation
    IJCAI International Joint Conference on Artificial Intelligence, 2016
    Co-Authors: Weizhi Ma, Yiqun Liu, Min Zhang, Shaoping Ma
    Abstract:

    Large-scale customer service call records contain a great deal of valuable information for business intelligence. However, these records have so far gone largely unanalyzed, even in the big data era. Two fundamental problems must be solved before mining and analysis: 1) the telephone conversation mixes the words of agents and users, which have to be separated before analysis; 2) the speakers in a conversation do not belong to a pre-defined set. These are new challenges that have not been well studied in previous work. In this paper, we propose a four-phase framework for role labeling in real customer service telephone conversations that integrates multi-modality features, i.e., both low-level acoustic features and semantic-level textual features. First, we conduct ΔBayesian Information Criterion (ΔBIC) based speaker diarization to obtain two clusters of segments from an audio stream. Second, the segments are transcribed into text in an Automatic Speech Recognition (ASR) phase with a DNN-HMM deep learning model. Third, by integrating acoustic and textual features, dialog-level role labeling maps the two clusters to the agent and the user. Finally, sentence-level role correction refines the labels at a finer granularity, which reduces the errors made in the previous phases. The proposed framework is tested on two real datasets: mobile and bank customer service calls. The precision of dialog-level labeling is over 99.0%. At the sentence level, labeling accuracy reaches 90.4%, greatly outperforming a traditional method based on acoustic features, which achieves only 78.5% accuracy.
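
The first phase of the framework above relies on ΔBIC to decide whether two audio segments come from the same speaker. As a rough illustration, the sketch below computes the standard ΔBIC between two segments under a simplifying assumption of diagonal-covariance Gaussians; a positive value favors the two-speaker hypothesis. The function names, the penalty weight and the synthetic two-dimensional "frames" are assumptions for illustration and are not taken from the paper.

```python
import math
import random
from statistics import pvariance


def log_det_diag(frames):
    """Log-determinant of a diagonal covariance fitted to a list of frames."""
    dims = len(frames[0])
    return sum(math.log(pvariance([f[d] for f in frames])) for d in range(dims))


def delta_bic(seg_a, seg_b, penalty_weight=1.0):
    """Delta-BIC between two segments; positive values suggest different speakers."""
    n_a, n_b = len(seg_a), len(seg_b)
    n = n_a + n_b
    dims = len(seg_a[0])
    # Likelihood gain from modelling the segments with two Gaussians instead of one.
    gain = 0.5 * (n * log_det_diag(seg_a + seg_b)
                  - n_a * log_det_diag(seg_a)
                  - n_b * log_det_diag(seg_b))
    # The two-model hypothesis has `dims` extra means and `dims` extra variances.
    penalty = 0.5 * penalty_weight * (2 * dims) * math.log(n)
    return gain - penalty


# Synthetic stand-ins for acoustic feature frames (assumption: 2 dimensions).
rng = random.Random(0)


def make_frames(mean, count):
    return [[rng.gauss(mean, 1.0) for _ in range(2)] for _ in range(count)]


same_speaker = delta_bic(make_frames(0.0, 200), make_frames(0.0, 200))
two_speakers = delta_bic(make_frames(0.0, 200), make_frames(5.0, 200))
print("same speaker:", same_speaker)
print("two speakers:", two_speakers)
```

A diarization system would typically apply this test to adjacent windows of real acoustic features and merge or split segments according to the sign of the result.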

Mohamed Morchid - One of the best experts on this subject based on the ideXlab platform.

  • Denoised Bottleneck Features from Deep Autoencoders for Telephone Conversation Analysis
    IEEE Transactions on Audio Speech and Language Processing, 2017
    Co-Authors: Killian Janod, Mohamed Morchid, Richard Dufour, Georges Linarès, Renato De Mori

  • Improving dialogue classification using a topic space representation and a Gaussian classifier based on the decision rule
    2014 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2014
    Co-Authors: Mohamed Morchid, Richard Dufour, Georges Linarès, Mohamed Bouallegue, Pierre-Michel Bousquet, Renato De Mori

  • A LDA-Based Topic Classification Approach from Highly Imperfect Automatic Transcriptions
    2014
    Co-Authors: Mohamed Morchid, Richard Dufour, Georges Linarès
    Abstract:

    Although current transcription systems can achieve high recognition performance, they still have great difficulty transcribing speech in very noisy environments. Transcription quality has a direct impact on classification tasks that use text features. In this paper, we propose to identify the themes of telephone conversation services with the classical Term Frequency-Inverse Document Frequency using Gini purity criteria (TF-IDF-Gini) method and with a Latent Dirichlet Allocation (LDA) approach. Both approaches are coupled with a Support Vector Machine (SVM) classifier to solve the theme identification problem. Results show the effectiveness of the proposed LDA-based method compared to the classical TF-IDF-Gini approach in the context of highly imperfect automatic transcriptions. Finally, we discuss the impact of discriminative and non-discriminative words extracted by both methods in terms of transcription accuracy.

  • I-vector based Representation of Highly Imperfect Automatic Transcriptions
    2014
    Co-Authors: Mohamed Morchid, Richard Dufour, Georges Linarès, Mohamed Bouallegue, Driss Matrouf, Renato De Mori
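
The TF-IDF-Gini method named in the abstracts above weights each word by how concentrated it is in a single theme. As a minimal sketch, the code below computes the Gini purity of a word as the sum over themes of P(theme | word) squared, then multiplies it with term frequency and inverse document frequency. Combining the three factors as a plain product is an assumption here, and the toy dialogues and theme names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def gini_purity(docs, themes):
    """Gini purity per word: sum over themes of P(theme | word) squared."""
    counts = defaultdict(Counter)  # word -> theme -> number of occurrences
    for words, theme in zip(docs, themes):
        for word in words:
            counts[word][theme] += 1
    purity = {}
    for word, per_theme in counts.items():
        total = sum(per_theme.values())
        purity[word] = sum((c / total) ** 2 for c in per_theme.values())
    return purity


def tfidf_gini(docs, themes):
    """Weight every word of every document by TF x IDF x Gini purity."""
    n_docs = len(docs)
    doc_freq = Counter(word for words in docs for word in set(words))
    purity = gini_purity(docs, themes)
    weighted = []
    for words in docs:
        tf = Counter(words)
        weighted.append({
            word: (tf[word] / len(words))
            * math.log(n_docs / doc_freq[word])
            * purity[word]
            for word in tf
        })
    return weighted


# Hypothetical tokenized dialogues and their theme labels.
docs = [["bus", "late", "card"],
        ["bus", "route", "late"],
        ["card", "lost", "wallet"],
        ["lost", "card", "bag"]]
themes = ["traffic", "traffic", "lost_item", "lost_item"]

purity = gini_purity(docs, themes)
weights = tfidf_gini(docs, themes)
print(purity["bus"], purity["card"])
print(weights[0])
```

A word that occurs in only one theme gets a purity of 1, while a word spread evenly over k themes gets 1/k, so theme-specific vocabulary is emphasized before the vectors are passed to a classifier such as an SVM.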

Georges Linarès - One of the best experts on this subject based on the ideXlab platform.

  • Denoised Bottleneck Features from Deep Autoencoders for Telephone Conversation Analysis
    IEEE Transactions on Audio Speech and Language Processing, 2017
    Co-Authors: Killian Janod, Mohamed Morchid, Richard Dufour, Georges Linarès, Renato De Mori

  • Improving dialogue classification using a topic space representation and a Gaussian classifier based on the decision rule
    2014 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2014
    Co-Authors: Mohamed Morchid, Richard Dufour, Georges Linarès, Mohamed Bouallegue, Pierre-Michel Bousquet, Renato De Mori

  • A LDA-Based Topic Classification Approach from Highly Imperfect Automatic Transcriptions
    2014
    Co-Authors: Mohamed Morchid, Richard Dufour, Georges Linarès

  • I-vector based Representation of Highly Imperfect Automatic Transcriptions
    2014
    Co-Authors: Mohamed Morchid, Richard Dufour, Georges Linarès, Mohamed Bouallegue, Driss Matrouf, Renato De Mori

Richard Dufour - One of the best experts on this subject based on the ideXlab platform.

  • Denoised Bottleneck Features from Deep Autoencoders for Telephone Conversation Analysis
    IEEE Transactions on Audio Speech and Language Processing, 2017
    Co-Authors: Killian Janod, Mohamed Morchid, Richard Dufour, Georges Linarès, Renato De Mori

  • Improving dialogue classification using a topic space representation and a Gaussian classifier based on the decision rule
    2014 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2014
    Co-Authors: Mohamed Morchid, Richard Dufour, Georges Linarès, Mohamed Bouallegue, Pierre-Michel Bousquet, Renato De Mori

  • A LDA-Based Topic Classification Approach from Highly Imperfect Automatic Transcriptions
    2014
    Co-Authors: Mohamed Morchid, Richard Dufour, Georges Linarès

  • I-vector based Representation of Highly Imperfect Automatic Transcriptions
    2014
    Co-Authors: Mohamed Morchid, Richard Dufour, Georges Linarès, Mohamed Bouallegue, Driss Matrouf, Renato De Mori