Spoken Dialogue System


The experts below are selected from a list of 7,197 experts worldwide, ranked by the ideXlab platform.

Diane J. Litman - One of the best experts on this subject based on the ideXlab platform.

  • The Relative Impact of Student Affect on Performance Models in a Spoken Dialogue Tutoring System
    User Modeling and User-Adapted Interaction, 2008
    Co-Authors: Kate Forbes-Riley, Mihai Rotaru, Diane J. Litman
    Abstract:

    We hypothesize that student affect is a useful predictor of spoken dialogue system performance, relative to other parameters. We test this hypothesis in the context of our spoken dialogue tutoring system, where student learning is the primary performance metric. We first present our system and corpora, which have been annotated with several student affective states, student correctness and discourse structure. We then discuss unigram and bigram parameters derived from these annotations. The unigram parameters represent each annotation type individually, as well as system-generic features. The bigram parameters represent annotation combinations, including student state sequences and student states in the discourse structure context. We then use these parameters to build learning models. First, we build simple models based on correlations between each of our parameters and learning. Our results suggest that our affect parameters are among our most useful predictors of learning, particularly in specific discourse structure contexts. Next, we use the PARADISE framework (multiple linear regression) to build complex learning models containing only the most useful subset of parameters. Our approach is a value-added one: we perform a number of model-building experiments, both with and without our affect parameters, and then compare the performance of the models on the training and test sets. Our results show that when included as inputs, our affect parameters are selected as predictors in most models, and many of these models show high generalizability in testing. Our results also show that, overall, the affect-included models significantly outperform the affect-excluded models.
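
    The value-added, PARADISE-style comparison can be pictured with a small sketch: fit one multiple linear regression with the affect parameters and one without, then compare test-set fit. Everything below (feature names, data, sizes) is an invented stand-in for illustration, not the paper's corpus or code.

      # Hedged sketch of a value-added PARADISE-style comparison: multiple
      # linear regression of learning on dialogue parameters, with and
      # without affect features. All data here are synthetic placeholders.
      import numpy as np
      from sklearn.linear_model import LinearRegression
      from sklearn.metrics import r2_score

      rng = np.random.default_rng(0)
      n = 40  # hypothetical number of student dialogues

      correct = rng.integers(5, 30, n)        # unigram: correct answers
      uncertain = rng.integers(0, 10, n)      # unigram: uncertain turns
      unc_then_corr = rng.integers(0, 8, n)   # bigram: uncertain -> correct

      # Synthetic learning metric, loosely tied to the parameters above.
      learning = 0.5 * correct - 0.8 * uncertain + rng.normal(0, 2, n)

      models = {
          "affect-excluded": np.column_stack([correct]),
          "affect-included": np.column_stack([correct, uncertain, unc_then_corr]),
      }
      train, test = slice(0, 30), slice(30, None)
      for name, X in models.items():
          reg = LinearRegression().fit(X[train], learning[train])
          r2 = r2_score(learning[test], reg.predict(X[test]))
          print(name, "test R^2 =", round(r2, 3))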

  • Comparing Synthesized versus Pre-Recorded Tutor Speech in an Intelligent Tutoring Spoken Dialogue System
    The Florida AI Research Society, 2006
    Co-Authors: Katherine Forbes-Riley, Diane J. Litman, Scott Silliman, Joel Tetreault
    Abstract:

    We evaluate the impact of tutor voice quality in the context of our intelligent tutoring spoken dialogue system. We first describe two versions of our system, which yielded two corpora of human-computer tutoring dialogues: one using a tutor voice pre-recorded by a human, and the other using a low-cost text-to-speech tutor voice. We then discuss the results of two-tailed t-tests comparing student learning gains, system usability, and dialogue efficiency across the two corpora and across corpora subsets. Overall, our results suggest that tutor voice quality may have only a minor impact on these metrics in the context of our tutoring system. We find that tutor voice quality does not impact learning gains, but it may impact usability and efficiency for some corpora subsets.
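
    For concreteness, each comparison reduces to an independent two-sample, two-tailed t-test per metric. The sketch below uses invented learning-gain numbers, not the paper's data.

      # Hedged sketch of the paper's statistical comparison: a two-tailed
      # independent-samples t-test on learning gains. Values are invented.
      from scipy import stats

      prerecorded_gains = [0.42, 0.35, 0.51, 0.28, 0.44, 0.39]  # hypothetical
      synthesized_gains = [0.40, 0.33, 0.47, 0.30, 0.41, 0.38]  # hypothetical

      t, p = stats.ttest_ind(prerecorded_gains, synthesized_gains)  # two-tailed
      print(f"t = {t:.3f}, p = {p:.3f}")  # p > 0.05: no significant difference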

  • Recognizing Student Emotions and Attitudes on the Basis of Utterances in Spoken Tutoring Dialogues with Both Human and Computer Tutors
    Speech Communication, 2006
    Co-Authors: Diane J. Litman, Katherine Forbes-Riley
    Abstract:

    While human tutors respond both to what a student says and to how the student says it, most tutorial dialogue systems cannot detect the student emotions and attitudes underlying an utterance. We present an empirical study investigating the feasibility of recognizing student state in two corpora of spoken tutoring dialogues, one with a human tutor and one with a computer tutor. We first annotate student turns for negative, neutral and positive student states in both corpora. We then automatically extract acoustic-prosodic features from the student speech, and lexical items from the transcribed or recognized speech. We compare the results of machine learning experiments using these features alone, in combination, and with student- and task-dependent features, to predict student states. We also compare our results across human-human and human-computer spoken tutoring dialogues. Our results show significant improvements in prediction accuracy over relevant baselines, and provide a first step towards enhancing our intelligent tutoring spoken dialogue system to automatically recognize and adapt to student states.
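
    A minimal sketch of this kind of experiment, assuming invented acoustic-prosodic and lexical features (not the paper's feature set): train a classifier on the combined features and compare it against a majority-class baseline.

      # Hedged sketch: predict negative/neutral/positive student state from
      # combined acoustic-prosodic and lexical features, versus a majority
      # baseline. Features, labels and data are synthetic placeholders.
      import numpy as np
      from sklearn.dummy import DummyClassifier
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(1)
      n = 200
      X = np.column_stack([
          rng.normal(200, 30, n),   # f0 mean (Hz)        - acoustic-prosodic
          rng.normal(60, 10, n),    # RMS energy (dB)     - acoustic-prosodic
          rng.normal(1.5, 0.5, n),  # turn duration (s)   - acoustic-prosodic
          rng.integers(0, 2, n),    # hedge word present? - lexical
      ])
      y = rng.choice(["negative", "neutral", "positive"], n, p=[0.2, 0.6, 0.2])

      base = cross_val_score(DummyClassifier(strategy="most_frequent"), X, y).mean()
      comb = cross_val_score(RandomForestClassifier(random_state=0), X, y).mean()
      print(f"majority baseline: {base:.2f}, combined features: {comb:.2f}")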

  • ITSPOKE: An Intelligent Tutoring Spoken Dialogue System
    Proc. of the Human Language Technology Conference: 4th Meeting of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL), 2004
    Co-Authors: Diane J. Litman, Scott Silliman
    Abstract:

    ITSPOKE is a spoken dialogue system that uses the Why2-Atlas text-based tutoring system as its "back-end". A student first types a natural language answer to a qualitative physics problem. ITSPOKE then engages the student in a spoken dialogue to provide feedback and correct misconceptions, and to elicit more complete explanations. We are using ITSPOKE to generate an empirically based understanding of the ramifications of adding spoken language capabilities to text-based dialogue tutors.
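
    The architecture is a speech layer wrapped around a text-based tutor. The following is a schematic sketch of that wiring only; the recognize, text_tutor_reply and synthesize functions are invented stand-ins, not ITSPOKE's components.

      # Schematic sketch of a spoken front end around a text-based tutoring
      # back end (as ITSPOKE wraps Why2-Atlas). All functions are stand-ins.
      def recognize(audio: bytes) -> str:
          """Stand-in ASR: transcribe the student's spoken turn."""
          return "gravity acts on the keys after release"

      def text_tutor_reply(student_text: str) -> str:
          """Stand-in text back end: analyse the answer and return feedback
          or a follow-up question aimed at a misconception."""
          return "Right. Are any other forces acting on the keys?"

      def synthesize(text: str) -> bytes:
          """Stand-in TTS: render the tutor's reply as audio."""
          return text.encode()

      def tutoring_turn(student_audio: bytes) -> bytes:
          # One spoken turn = ASR -> text-based tutor -> TTS.
          return synthesize(text_tutor_reply(recognize(student_audio)))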

  • Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System
    Journal of Artificial Intelligence Research, 2002
    Co-Authors: Satinder Singh, Diane J. Litman, Michael Kearns, Marilyn A Walker
    Abstract:

    Designing the dialogue policy of a spoken dialogue system involves many nontrivial choices. This paper presents a reinforcement learning approach for automatically optimizing a dialogue policy, which addresses the technical challenges in applying reinforcement learning to a working dialogue system with human users. We report on the design, construction and empirical evaluation of NJFun, an experimental spoken dialogue system that provides users with access to information about fun things to do in New Jersey. Our results show that by optimizing its performance via reinforcement learning, NJFun measurably improves system performance.
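
    To give the flavour of the approach, here is a toy tabular Q-learning sketch over a made-up slot-filling dialogue: the learner chooses between system and user initiative at each stage, trading recognition risk against turn cost. The states, actions, rewards and simulator are all invented; NJFun's actual state space, action set and learning algorithm differ.

      # Toy Q-learning over a made-up slot-filling dialogue; not NJFun's MDP.
      import random
      from collections import defaultdict

      STATES = ["greet", "ask_activity", "ask_location", "done"]
      ACTIONS = ["system_initiative", "user_initiative"]
      Q = defaultdict(float)
      alpha, gamma, eps = 0.1, 0.95, 0.2

      def step(state, action):
          """User initiative is faster to say but riskier to recognize."""
          ok = random.random() < (0.9 if action == "system_initiative" else 0.7)
          if not ok:
              return state, -0.2            # misrecognition: re-ask this slot
          nxt = STATES[STATES.index(state) + 1]
          return nxt, (1.0 if nxt == "done" else -0.1)  # small per-turn cost

      for _ in range(5000):                 # learn from simulated dialogues
          s = "greet"
          while s != "done":
              a = (random.choice(ACTIONS) if random.random() < eps
                   else max(ACTIONS, key=lambda a: Q[s, a]))
              s2, r = step(s, a)
              Q[s, a] += alpha * (r + gamma * max(Q[s2, b] for b in ACTIONS)
                                  - Q[s, a])
              s = s2

      print({s: max(ACTIONS, key=lambda a: Q[s, a]) for s in STATES[:-1]})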

Tatsuya Kawahara - One of the best experts on this subject based on the ideXlab platform.

  • Spoken Dialogue System for a Human-Like Conversational Robot ERICA
    IWSDS, 2019
    Co-Authors: Tatsuya Kawahara
    Abstract:

    This article gives an overview of our symbiotic human-robot interaction project, which aims at an autonomous android that behaves and interacts just like a human. The conversational android ERICA is designed to take on several social roles centered on spoken dialogue, such as attentive listening (similar to counseling) and job interviewing. We describe the design principles behind these spoken dialogue systems, with a particular focus on the attentive listening system. We also address the generation of backchannels, fillers and laughter to produce human-like conversational behavior.
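
    As one concrete illustration of the backchannel generation mentioned above, a simple prosody-based trigger might look like the sketch below. This is an invented heuristic with made-up thresholds, not ERICA's actual model.

      # Invented heuristic for backchannel timing; thresholds are made up.
      def maybe_backchannel(pause_ms: float, pitch_slope: float) -> str | None:
          """pause_ms: silence since the user stopped speaking;
          pitch_slope: f0 slope over the last 300 ms (negative = falling)."""
          if pause_ms > 400 and pitch_slope < 0:
              return "uh-huh"               # short acknowledgement token
          return None

      print(maybe_backchannel(pause_ms=520, pitch_slope=-0.8))  # -> "uh-huh"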

  • Bayes Risk-Based Dialogue Management for Document Retrieval System with Speech Interface
    Speech Communication, 2010
    Co-Authors: Teruhisa Misu, Tatsuya Kawahara
    Abstract:

    We propose an efficient dialogue management technique for an information navigation system based on a document knowledge base. The system can use ASR N-best hypotheses and contextual information to perform robustly on fragmentary speech input and erroneous output of automatic speech recognition (ASR). It also has several choices in generating responses or confirmations. We formulate the optimization of these choices based on a Bayes risk criterion, which is defined in terms of a reward for correct information presentation and a penalty for redundant turns. The parameters of the proposed dialogue management can be tuned adaptively by online learning. We evaluated this strategy with our spoken dialogue system, the "Dialogue Navigator for Kyoto City", which generates responses based on document retrieval and also has question-answering capability. The effectiveness of the proposed framework was demonstrated by an increased success rate of dialogue and a reduced number of turns for information access in an experiment with a large number of utterances by real users.
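
    The decision rule at the heart of this formulation can be sketched as follows: for each candidate action, compute the expected loss over the ASR N-best list and pick the minimum. The hypotheses, probabilities and loss values below are invented toy numbers, and the real formulation covers more actions and contextual information.

      # Toy Bayes-risk action selection over an ASR N-best list; all
      # hypotheses, probabilities and loss values are invented.
      nbest = {"kinkakuji history": 0.55,
               "kinkakuji access": 0.30,
               "ginkakuji history": 0.15}

      REWARD_CORRECT = -1.0  # correct presentation (a reward = negative loss)
      PENALTY_WRONG = 3.0    # presenting the wrong document
      PENALTY_TURN = 0.5     # one redundant confirmation turn

      def risk_present(hyp: str) -> float:
          """Expected loss of presenting the document for `hyp` directly."""
          p = nbest[hyp]
          return p * REWARD_CORRECT + (1 - p) * PENALTY_WRONG

      def risk_confirm(hyp: str) -> float:
          """Confirming first costs a turn, then presentation is correct."""
          return PENALTY_TURN + REWARD_CORRECT

      top = max(nbest, key=nbest.get)
      action = "present" if risk_present(top) < risk_confirm(top) else "confirm"
      print(action, top)   # here: confirm, since P(top) = 0.55 is too low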

  • Bayes Risk-Based Dialogue Management for Document Retrieval System with Speech Interface
    International Conference on Computational Linguistics, 2008
    Co-Authors: Teruhisa Misu, Tatsuya Kawahara
    Abstract:

    We propose an efficient dialogue management scheme for an information navigation system based on a document knowledge base with a spoken dialogue interface. In order to perform robustly on fragmentary speech input and erroneous output of automatic speech recognition (ASR), the system should selectively use N-best hypotheses of ASR and contextual information. The system also has several choices in generating responses or confirmations. In this work, we formulate the optimization of these choices based on a unified criterion, Bayes risk, which is defined in terms of a reward for correct information presentation and a penalty for redundant turns. We have evaluated this strategy with a spoken dialogue system which also has question-answering capability. The effectiveness of the proposed framework was confirmed in the success rate of retrieval and the average number of turns.

Teruhisa Misu - One of the best experts on this subject based on the ideXlab platform.

  • Bayes Risk-Based Dialogue Management for Document Retrieval System with Speech Interface
    Speech Communication, 2010
    Co-Authors: Teruhisa Misu, Tatsuya Kawahara
    Abstract:

    We propose an efficient dialogue management technique for an information navigation system based on a document knowledge base. The system can use ASR N-best hypotheses and contextual information to perform robustly on fragmentary speech input and erroneous output of automatic speech recognition (ASR). It also has several choices in generating responses or confirmations. We formulate the optimization of these choices based on a Bayes risk criterion, which is defined in terms of a reward for correct information presentation and a penalty for redundant turns. The parameters of the proposed dialogue management can be tuned adaptively by online learning. We evaluated this strategy with our spoken dialogue system, the "Dialogue Navigator for Kyoto City", which generates responses based on document retrieval and also has question-answering capability. The effectiveness of the proposed framework was demonstrated by an increased success rate of dialogue and a reduced number of turns for information access in an experiment with a large number of utterances by real users.

  • Bayes Risk-Based Dialogue Management for Document Retrieval System with Speech Interface
    International Conference on Computational Linguistics, 2008
    Co-Authors: Teruhisa Misu, Tatsuya Kawahara
    Abstract:

    We propose an efficient dialogue management scheme for an information navigation system based on a document knowledge base with a spoken dialogue interface. In order to perform robustly on fragmentary speech input and erroneous output of automatic speech recognition (ASR), the system should selectively use N-best hypotheses of ASR and contextual information. The system also has several choices in generating responses or confirmations. In this work, we formulate the optimization of these choices based on a unified criterion, Bayes risk, which is defined in terms of a reward for correct information presentation and a penalty for redundant turns. We have evaluated this strategy with a spoken dialogue system which also has question-answering capability. The effectiveness of the proposed framework was confirmed in the success rate of retrieval and the average number of turns.

Katherine Forbes-Riley - One of the best experts on this subject based on the ideXlab platform.

  • Comparing Synthesized versus Pre-Recorded Tutor Speech in an Intelligent Tutoring Spoken Dialogue System
    The Florida AI Research Society, 2006
    Co-Authors: Katherine Forbes-Riley, Diane J. Litman, Scott Silliman, Joel Tetreault
    Abstract:

    We evaluate the impact of tutor voice quality in the context of our intelligent tutoring spoken dialogue system. We first describe two versions of our system, which yielded two corpora of human-computer tutoring dialogues: one using a tutor voice pre-recorded by a human, and the other using a low-cost text-to-speech tutor voice. We then discuss the results of two-tailed t-tests comparing student learning gains, system usability, and dialogue efficiency across the two corpora and across corpora subsets. Overall, our results suggest that tutor voice quality may have only a minor impact on these metrics in the context of our tutoring system. We find that tutor voice quality does not impact learning gains, but it may impact usability and efficiency for some corpora subsets.

  • Recognizing Student Emotions and Attitudes on the Basis of Utterances in Spoken Tutoring Dialogues with Both Human and Computer Tutors
    Speech Communication, 2006
    Co-Authors: Diane J. Litman, Katherine Forbes-Riley
    Abstract:

    While human tutors respond both to what a student says and to how the student says it, most tutorial dialogue systems cannot detect the student emotions and attitudes underlying an utterance. We present an empirical study investigating the feasibility of recognizing student state in two corpora of spoken tutoring dialogues, one with a human tutor and one with a computer tutor. We first annotate student turns for negative, neutral and positive student states in both corpora. We then automatically extract acoustic-prosodic features from the student speech, and lexical items from the transcribed or recognized speech. We compare the results of machine learning experiments using these features alone, in combination, and with student- and task-dependent features, to predict student states. We also compare our results across human-human and human-computer spoken tutoring dialogues. Our results show significant improvements in prediction accuracy over relevant baselines, and provide a first step towards enhancing our intelligent tutoring spoken dialogue system to automatically recognize and adapt to student states.

Steve Young - One of the best experts on this subject based on the ideXlab platform.

  • Exploiting Sentence and Context Representations in Deep Neural Models for Spoken Language Understanding
    arXiv, 2016
    Co-Authors: Lina M. Rojas Barahona, Nikola Mrkšić, Tsung Hsien Wen, Pei-hao Su, Stefan Ultes, Milica Gašić, Steve Young
    Abstract:

    This paper presents a deep learning architecture for the semantic decoder component of a statistical spoken dialogue system. In a slot-filling dialogue, the semantic decoder predicts the dialogue act and a set of slot-value pairs from a set of n-best hypotheses returned by automatic speech recognition. Most current models for spoken language understanding assume (i) word-aligned semantic annotations, as in sequence taggers, and (ii) delexicalisation, a mapping of input words to domain-specific concepts using heuristics that try to capture morphological variation but that scale neither to other domains nor to language variation (e.g., morphology, synonyms, paraphrasing). In this work the semantic decoder is trained using unaligned semantic annotations, and it uses distributed semantic representation learning to overcome the limitations of explicit delexicalisation. The proposed architecture uses a convolutional neural network for the sentence representation and a long short-term memory network for the context representation. Results are presented for the publicly available DSTC2 corpus and an in-car corpus which is similar to DSTC2 but has a significantly higher word error rate (WER).
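
    A schematic sketch of the sentence/context split, assuming PyTorch and invented dimensions: a convolutional encoder for the current utterance, an LSTM over the context, and two output heads for the dialogue act and the slot-value scores. This is not the paper's code.

      # Schematic PyTorch sketch of the described architecture; layer sizes
      # and label inventories are invented placeholders.
      import torch
      import torch.nn as nn

      class SemanticDecoder(nn.Module):
          def __init__(self, vocab=1000, emb=50, hid=64, n_acts=10, n_slots=20):
              super().__init__()
              self.embed = nn.Embedding(vocab, emb)
              self.conv = nn.Conv1d(emb, hid, kernel_size=3, padding=1)
              self.context = nn.LSTM(emb, hid, batch_first=True)
              self.act_out = nn.Linear(2 * hid, n_acts)    # dialogue act
              self.slot_out = nn.Linear(2 * hid, n_slots)  # slot-value scores

          def forward(self, utterance, context):
              # utterance, context: LongTensor token ids, [batch, tokens]
              s = self.conv(self.embed(utterance).transpose(1, 2))
              s = s.max(dim=2).values           # max-pooled sentence vector
              _, (h, _) = self.context(self.embed(context))
              feats = torch.cat([s, h[-1]], dim=1)
              return self.act_out(feats), torch.sigmoid(self.slot_out(feats))

      decoder = SemanticDecoder()
      acts, slots = decoder(torch.randint(0, 1000, (2, 12)),
                            torch.randint(0, 1000, (2, 30)))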

  • Stochastic Language Generation in Dialogue Using Recurrent Neural Networks with Convolutional Sentence Reranking
    arXiv: Computation and Language, 2015
    Co-Authors: Tsung Hsien Wen, Milica Gasic, Dongho Kim, Nikola Mrksic, David Vandyke, Steve Young
    Abstract:

    The natural language generation (NLG) component of a spoken dialogue system (SDS) usually needs a substantial amount of handcrafting or a well-labeled dataset to be trained on. These limitations add significantly to development costs and make cross-domain, multilingual dialogue systems intractable. Moreover, human languages are context-aware: the most natural response should be learned directly from data rather than depend on predefined syntax or rules. This paper presents a statistical language generator based on a joint recurrent and convolutional neural network structure which can be trained on dialogue act-utterance pairs without any semantic alignments or predefined grammar trees. Objective metrics suggest that this new model outperforms previous methods under the same experimental conditions. Results of an evaluation by human judges indicate that it produces not only high-quality but also linguistically varied utterances, which are preferred over n-gram and rule-based systems.
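
    The generate-then-rerank idea can be illustrated with a toy: candidate realisations that would come from the RNN generator are scored against the input dialogue act, penalising candidates that miss slot values. The act, candidates and scorer below are invented; the paper's reranker is a convolutional network, not this string check.

      # Toy generate-then-rerank: penalise candidates missing slot values.
      # Candidates stand in for RNN samples; the real reranker is a CNN.
      dialogue_act = {"act": "inform", "name": "Seven Days", "food": "Chinese"}

      candidates = [
          "seven days serves chinese food",
          "seven days is a nice place",
          "it serves chinese food",
      ]

      def score(candidate: str, da: dict) -> float:
          slot_values = [v.lower() for k, v in da.items() if k != "act"]
          missing = sum(1 for v in slot_values if v not in candidate)
          return -missing          # fewer missing slot values is better

      print(max(candidates, key=lambda c: score(c, dialogue_act)))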

  • Learning from Real Users: Rating Dialogue Success with Neural Networks for Reinforcement Learning in Spoken Dialogue Systems
    Conference of the International Speech Communication Association, 2015
    Co-Authors: David Vandyke, Tsung Hsien Wen, Milica Gasic, Dongho Kim, Nikola Mrksic, Steve Young
    Abstract:

    To train a statistical spoken dialogue system (SDS), it is essential that an accurate method for measuring task success is available. To date, training has relied on presenting a task to either simulated or paid users and inferring the dialogue's success by observing whether the presented task was achieved or not. Our aim, however, is to learn from real users acting under their own volition, in which case it is non-trivial to rate success, since prior knowledge of the task is simply unavailable. User feedback may be utilised but has been found to be inconsistent. Hence, we present two neural network models that evaluate a sequence of turn-level features to rate the success of a dialogue. Importantly, these models make no use of any prior knowledge of the user's task. The models are trained on dialogues generated by a simulated user, and the best model is then used to train a policy on-line, which is shown to perform at least as well as a baseline system using prior knowledge of the user's task. The models should also be of interest for evaluating SDS and for monitoring dialogues in rule-based SDS.
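
    A schematic sketch of such a success rater, assuming PyTorch and an invented feature dimension: a recurrent network reads the per-turn feature vectors and outputs a success probability, which could then be mapped to a reinforcement learning reward.

      # Schematic turn-level success rater; dimensions and reward scaling
      # are invented, and this is not the paper's exact model.
      import torch
      import torch.nn as nn

      class SuccessRater(nn.Module):
          def __init__(self, feat_dim=20, hid=32):
              super().__init__()
              self.rnn = nn.LSTM(feat_dim, hid, batch_first=True)
              self.out = nn.Linear(hid, 1)

          def forward(self, turns):        # turns: [batch, n_turns, feat_dim]
              _, (h, _) = self.rnn(turns)
              return torch.sigmoid(self.out(h[-1])).squeeze(-1)  # P(success)

      p = SuccessRater()(torch.randn(4, 10, 20))  # 4 dialogues, 10 turns each
      reward = (p > 0.5).float() * 20 - 10        # invented reward mapping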

  • Uncertainty management for on-line optimisation of a POMDP-based large-scale Spoken Dialogue System
    Proceedings of the Annual Conference of the International Speech Communication Association INTERSPEECH, 2011
    Co-Authors: Lucie Daubigney, Senthilkumar Chandramohan, Matthieu Geist, Milica Gašić, Olivier Pietquin, Steve Young
    Abstract:

    The optimization of dialogue policies using reinforcement learning (RL) is now an accepted part of the state of the art in spoken dialogue systems (SDS). Yet it is still the case that the commonly used training algorithms for SDS require a large number of dialogues, and hence most systems still rely on artificial data generated by a user simulator. Optimization is therefore performed off-line before releasing the system to real users. Gaussian processes (GP) for RL have recently been applied to dialogue systems. One advantage of GPs is that they compute an explicit measure of uncertainty in the value function estimates computed during learning. In this paper, a class of novel learning strategies is described which uses uncertainty to control exploration on-line. Comparisons between several exploration schemes show that significant improvements in learning speed can be obtained and that rapid and safe online optimisation is possible, even on a complex task.
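
    The exploration idea can be sketched in a few lines: given a GP posterior over returns, prefer actions whose mean-plus-uncertainty is highest. The toy below uses scikit-learn's GP regressor on made-up one-dimensional data as a stand-in for the paper's GP value estimates.

      # Toy uncertainty-driven exploration with a GP posterior; the data and
      # the 1-D action coding are invented stand-ins for GP value estimates.
      import numpy as np
      from sklearn.gaussian_process import GaussianProcessRegressor

      X = np.array([[0.0], [1.0], [2.0]])   # previously tried action codes
      y = np.array([0.2, 0.8, 0.5])         # returns observed for them
      gp = GaussianProcessRegressor().fit(X, y)

      candidates = np.linspace(0.0, 3.0, 7).reshape(-1, 1)
      mean, std = gp.predict(candidates, return_std=True)
      chosen = candidates[np.argmax(mean + 1.0 * std), 0]  # uncertainty bonus
      print(chosen)   # untried regions get picked while their std is high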

  • Bayesian Update of Dialogue State: A POMDP Framework for Spoken Dialogue Systems
    Computer Speech & Language, 2010
    Co-Authors: Blaise Thomson, Steve Young
    Abstract:

    This paper describes a statistically motivated framework for performing real-time dialogue state updates and policy learning in a spoken dialogue system. The framework is based on the partially observable Markov decision process (POMDP), which provides a well-founded statistical model of spoken dialogue management. However, exact belief state updates in a POMDP model are computationally intractable, so approximate methods must be used. This paper presents a tractable method based on the loopy belief propagation algorithm. Various simplifications are made which improve the efficiency significantly compared to the original algorithm, as well as compared to other POMDP-based dialogue state updating approaches. A second contribution of this paper is a method for learning in spoken dialogue systems which uses a component-based policy with the episodic Natural Actor Critic algorithm. The framework proposed in this paper was tested both in simulations and in a user trial. Both indicated that using Bayesian updates of the dialogue state significantly outperforms traditional definitions of the dialogue state. Policy learning worked effectively and the learned policy outperformed all others in simulations. In user trials the learned policy was also competitive, although its optimality was less conclusive. Overall, the Bayesian update of dialogue state framework was shown to be a feasible and effective approach to building real-world POMDP-based dialogue systems.
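
    The core of any such framework is a per-turn Bayesian re-weighting of the dialogue state. The sketch below shows the single-slot case, b'(s) proportional to p(o|s) b(s), with invented values; the paper's contribution is making this tractable across many connected slots via loopy belief propagation.

      # Minimal single-slot belief update, b'(s) ~ p(o|s) * b(s); values are
      # invented. The paper factorises this over many slots with loopy BP.
      def update_belief(belief: dict, asr_evidence: dict) -> dict:
          """Re-weight the belief by ASR evidence, then renormalise."""
          posterior = {v: p * asr_evidence.get(v, 0.01)
                       for v, p in belief.items()}
          z = sum(posterior.values())
          return {v: p / z for v, p in posterior.items()}

      belief = {"chinese": 1/3, "french": 1/3, "indian": 1/3}  # uniform prior
      belief = update_belief(belief, {"chinese": 0.7, "indian": 0.2})
      print(belief)   # mass shifts to "chinese" but hedges against ASR error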