The experts below are selected from a list of 151,788 experts worldwide, ranked by the ideXlab platform.
Wen-tau Yih - One of the best experts on this subject based on the ideXlab platform.
-
RECONSIDER: Improved Re-Ranking using Span-Focused Cross-Attention for Open-Domain Question Answering
North American Chapter of the Association for Computational Linguistics, 2021
Co-Authors: Srinivasan Iyer, Sewon Min, Yashar Mehdad, Wen-tau Yih
Abstract: State-of-the-art Machine Reading Comprehension (MRC) models for Open-Domain Question Answering (QA) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples. This training scheme possibly explains empirical observations that these models achieve a high recall amongst their top few predictions, but a low overall accuracy, motivating the need for answer re-ranking. We develop a successful re-ranking approach (RECONSIDER) for span-extraction tasks that improves upon the performance of MRC models, even beyond large-scale pre-training. RECONSIDER is trained on positive and negative examples extracted from high confidence MRC model predictions, and uses in-passage span annotations to perform span-focused re-ranking over a smaller candidate set. As a result, RECONSIDER learns to eliminate close false positives, achieving a new extractive state of the art on four QA tasks, with 45.5% Exact Match accuracy on Natural Questions with real user questions, and 61.7% on TriviaQA. We will release all related data, models, and code.
-
RECONSIDER: Re-Ranking using Span-Focused Cross-Attention for Open-Domain Question Answering
arXiv: Computation and Language, 2020
Co-Authors: Srinivasan Iyer, Sewon Min, Yashar Mehdad, Wen-tau Yih
Abstract: State-of-the-art Machine Reading Comprehension (MRC) models for Open-Domain Question Answering (QA) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples. This training scheme possibly explains empirical observations that these models achieve a high recall amongst their top few predictions, but a low overall accuracy, motivating the need for answer re-ranking. We develop a simple and effective re-ranking approach (RECONSIDER) for span-extraction tasks that improves upon the performance of large pre-trained MRC models. RECONSIDER is trained on positive and negative examples extracted from high confidence predictions of MRC models, and uses in-passage span annotations to perform span-focused re-ranking over a smaller candidate set. As a result, RECONSIDER learns to eliminate close false positive passages, and achieves a new state of the art on four QA tasks, including 45.5% Exact Match accuracy on Natural Questions with real user questions, and 61.7% on TriviaQA.
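The span-focused re-ranking idea described in these abstracts can be sketched in a few lines: mark the candidate answer span inside its passage, then score each (question, marked passage) pair over the small candidate set and keep the best. The marker tokens and the word-overlap scorer below are illustrative stand-ins for RECONSIDER's trained cross-attention model, not the paper's actual implementation.

```python
def mark_span(passage: str, span: str) -> str:
    """Wrap the candidate answer span in marker tokens so the scorer
    can focus on it (the 'in-passage span annotation')."""
    return passage.replace(span, f"[START] {span} [END]", 1)

def score(question: str, marked_passage: str) -> float:
    # Stand-in for a trained cross-encoder: fraction of question words
    # that also appear in the marked passage.
    q = set(question.lower().split())
    p = set(marked_passage.lower().split())
    return len(q & p) / max(len(q), 1)

def rerank(question, candidates):
    """candidates: (span, passage) pairs taken from a base MRC model's
    high-confidence predictions; returns the spans best-first."""
    scored = [(score(question, mark_span(p, s)), s) for s, p in candidates]
    return [s for _, s in sorted(scored, reverse=True)]

preds = [("1912", "The Titanic sank in 1912 after hitting an iceberg."),
         ("1985", "The wreck was located in 1985 by Robert Ballard.")]
print(rerank("When did the Titanic sink", preds))
```

A real system would replace `score` with a fine-tuned cross-encoder over the question and the span-annotated passage; the re-ranking loop itself stays the same.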
-
WikiQA: A Challenge Dataset for Open-Domain Question Answering
Empirical Methods in Natural Language Processing, 2015
Co-Authors: Yi Yang, Wen-tau Yih, Christopher Meek
Abstract: We describe the WIKIQA dataset, a new publicly available set of question and sentence pairs, collected and annotated for research on Open-Domain question answering. Most previous work on answer sentence selection focuses on a dataset created using the TREC-QA data, which includes editor-generated questions and candidate answer sentences selected by matching content words in the question. WIKIQA is constructed using a more natural process and is more than an order of magnitude larger than the previous dataset. In addition, the WIKIQA dataset also includes questions for which there are no correct sentences, enabling researchers to work on answer triggering, a critical component in any QA system. We compare several systems on the task of answer sentence selection on both datasets and also describe the performance of a system on the problem of answer triggering using the WIKIQA dataset.
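Answer triggering, as the abstract describes it, is answer sentence selection plus a "no answer" decision: pick the best candidate sentence, but return nothing if even the best one is not convincing. A minimal sketch, assuming a toy word-overlap scorer in place of a trained sentence-pair model:

```python
def overlap(question: str, sentence: str) -> float:
    # Toy relevance score: fraction of question words found in the sentence.
    q = set(question.lower().split())
    s = set(sentence.lower().split())
    return len(q & s) / max(len(q), 1)

def answer_triggering(question, sentences, scorer=overlap, threshold=0.4):
    """Return the best candidate sentence only if its score clears the
    threshold; otherwise return None (no correct answer sentence exists)."""
    best = max(sentences, key=lambda s: scorer(question, s))
    return best if scorer(question, best) >= threshold else None

sents = ["Glaciers form over many years from compacted snow.",
         "Antarctica is the coldest continent."]
print(answer_triggering("how do glaciers form", sents))        # triggered
print(answer_triggering("who invented the telephone", sents))  # None
```

The threshold value here is arbitrary; in practice it would be tuned on held-out data against a metric that penalizes both missed answers and spurious triggers.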
-
Open-Domain Question Answering via Semantic Enrichment
The Web Conference, 2015
Co-Authors: Huan Sun, Wen-tau Yih, Chen-Tse Tsai, Jingjing Liu, Ming-Wei Chang
Abstract: Most recent question answering (QA) systems answer a question by querying large-scale knowledge bases (KBs), after parsing and transforming the natural language question into a KB-executable form (e.g., a logical form). KBs are, however, far from complete, so the information required to answer a question may not always exist in them. In this paper, we develop a new QA system that mines answers directly from the Web, while employing KBs as a significant auxiliary resource to further boost QA performance. Specifically, to the best of our knowledge, we make the first attempt to link answer candidates to entities in Freebase during answer candidate generation. Several advantages follow: (1) redundancy among answer candidates is automatically reduced; (2) the types of an answer candidate can be determined from those of its corresponding entity in Freebase; (3) capitalizing on the rich information about entities in Freebase, we can develop semantic features for each answer candidate once it is linked. In particular, we construct answer-type features with two novel probabilistic models, which directly evaluate how appropriate an answer candidate's types are for a given question. Such semantic features turn out to play a significant role in selecting the true answers from the large answer candidate pool. Experimental results show that across two test datasets, our QA system achieves an 18%-54% improvement in F1, compared with various existing QA systems.
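The answer-type features described above can be illustrated with a toy sketch: link each candidate to a (tiny, hypothetical) KB, then score how well its entity types fit the expected answer type of the question. The KB entries and the wh-word lookup below are illustrative assumptions; the paper links to Freebase and learns two probabilistic answer-type models instead of using a hand-written table.

```python
# Tiny stand-in for Freebase entity typing (entries are illustrative).
KB = {"Barack Obama": {"person", "politician"},
      "Honolulu": {"city", "location"}}

# Expected answer types per question word, standing in for the paper's
# learned probabilistic answer-type models.
EXPECTED = {"who": {"person"}, "where": {"city", "location"}}

def type_score(question: str, candidate: str) -> float:
    """How well the candidate's KB types match the question's expected types."""
    expected = EXPECTED.get(question.lower().split()[0], set())
    types = KB.get(candidate, set())
    return len(types & expected) / max(len(expected), 1)

def best_answer(question, candidates):
    # The type feature would normally be combined with other signals;
    # here it alone decides the ranking.
    return max(candidates, key=lambda c: type_score(question, c))

print(best_answer("where was Obama born", ["Barack Obama", "Honolulu"]))
```

The sketch also shows the redundancy-reduction benefit implicitly: because candidates are KB entities rather than raw strings, surface variants ("Obama", "Barack Obama") collapse to one entry before scoring.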
Srinivasan Iyer - One of the best experts on this subject based on the ideXlab platform.
-
RECONSIDER: Improved Re-Ranking using Span-Focused Cross-Attention for Open-Domain Question Answering
North American Chapter of the Association for Computational Linguistics, 2021
Co-Authors: Srinivasan Iyer, Sewon Min, Yashar Mehdad, Wen-tau Yih
Abstract: State-of-the-art Machine Reading Comprehension (MRC) models for Open-Domain Question Answering (QA) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples. This training scheme possibly explains empirical observations that these models achieve a high recall amongst their top few predictions, but a low overall accuracy, motivating the need for answer re-ranking. We develop a successful re-ranking approach (RECONSIDER) for span-extraction tasks that improves upon the performance of MRC models, even beyond large-scale pre-training. RECONSIDER is trained on positive and negative examples extracted from high confidence MRC model predictions, and uses in-passage span annotations to perform span-focused re-ranking over a smaller candidate set. As a result, RECONSIDER learns to eliminate close false positives, achieving a new extractive state of the art on four QA tasks, with 45.5% Exact Match accuracy on Natural Questions with real user questions, and 61.7% on TriviaQA. We will release all related data, models, and code.
-
RECONSIDER: Re-Ranking using Span-Focused Cross-Attention for Open-Domain Question Answering
arXiv: Computation and Language, 2020
Co-Authors: Srinivasan Iyer, Sewon Min, Yashar Mehdad, Wen-tau Yih
Abstract: State-of-the-art Machine Reading Comprehension (MRC) models for Open-Domain Question Answering (QA) are typically trained for span selection using distantly supervised positive examples and heuristically retrieved negative examples. This training scheme possibly explains empirical observations that these models achieve a high recall amongst their top few predictions, but a low overall accuracy, motivating the need for answer re-ranking. We develop a simple and effective re-ranking approach (RECONSIDER) for span-extraction tasks that improves upon the performance of large pre-trained MRC models. RECONSIDER is trained on positive and negative examples extracted from high confidence predictions of MRC models, and uses in-passage span annotations to perform span-focused re-ranking over a smaller candidate set. As a result, RECONSIDER learns to eliminate close false positive passages, and achieves a new state of the art on four QA tasks, including 45.5% Exact Match accuracy on Natural Questions with real user questions, and 61.7% on TriviaQA.
-
JuICe: A Large-Scale Distantly Supervised Dataset for Open-Domain Context-Based Code Generation
arXiv: Learning, 2019
Co-Authors: Rajas Agashe, Srinivasan Iyer, Luke Zettlemoyer
Abstract: Interactive programming with interleaved code snippet cells and natural language markdown is recently gaining popularity in the form of Jupyter notebooks, which accelerate prototyping and collaboration. To study code generation conditioned on a long context history, we present JuICe, a corpus of 1.5 million examples with a curated test set of 3.7K instances based on online programming assignments. Compared with existing contextual code generation datasets, JuICe provides refined human-curated data, Open-Domain code, and an order of magnitude more training data. Using JuICe, we train models for two tasks: (1) generation of the API call sequence in a code cell, and (2) full code cell generation, both conditioned on the NL-Code history up to a particular code cell. Experiments using current baseline code generation models show that both context and distant supervision aid in generation, and that the dataset is challenging for current systems.
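Conditioning on the NL-Code history amounts to linearizing the preceding notebook cells into a single model input. A minimal sketch (the cell tags and truncation window below are illustrative assumptions, not JuICe's exact format):

```python
def linearize_context(cells, max_cells=3):
    """Flatten the trailing markdown/code cell history of a notebook
    into one conditioning string for a context-based code generation
    model; older cells beyond the window are dropped."""
    parts = [f"<{kind}> {text}" for kind, text in cells[-max_cells:]]
    return "\n".join(parts)

history = [("markdown", "Load the dataset"),
           ("code", "df = pd.read_csv('data.csv')"),
           ("markdown", "Plot a histogram of column A")]
print(linearize_context(history))
```

The model's target would then be the contents of the next code cell, so that both the natural language intent and the earlier code (e.g., the `df` variable) are visible at generation time.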
Yoshihiro Matsuo - One of the best experts on this subject based on the ideXlab platform.
-
Towards an Open-Domain conversational system fully based on natural language processing
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, 2014
Co-Authors: Ryuichiro Higashinaka, Hiroaki Sugiyama, Kenji Imamura, Nozomi Kobayashi, Toru Hirano, Chiaki Miyazaki, Toyomi Meguro, Toshiro Makino, Yoshihiro Matsuo
Abstract: This paper proposes an architecture for an Open-Domain conversational system and evaluates an implemented system. The proposed architecture is fully composed of modules based on natural language processing techniques. Experimental results using human subjects show that our architecture achieves significantly better naturalness than a retrieval-based baseline and that its naturalness is close to that of a rule-based system using 149K hand-crafted rules.
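The abstract does not name the individual modules, but a conversational architecture "fully composed of NLP modules" is typically a pipeline of understanding, dialogue control, and generation. A purely illustrative sketch of that module structure (every function here is a hypothetical stand-in, not the paper's design):

```python
def understand(utterance: str) -> dict:
    # Stand-in for NLP analysis modules (e.g., topic and intent extraction).
    return {"text": utterance, "is_question": utterance.endswith("?")}

def decide(state: dict) -> str:
    # Dialogue control: choose a dialogue act from the analysis.
    return "answer" if state["is_question"] else "acknowledge"

def generate(act: str, state: dict) -> str:
    # Surface realization for each dialogue act.
    if act == "answer":
        return "Good question! Let me think about that."
    return "Interesting, tell me more about that."

def respond(utterance: str) -> str:
    state = understand(utterance)
    return generate(decide(state), state)

print(respond("I went hiking yesterday"))
print(respond("Do you like hiking?"))
```

The point of such a pipeline is that each stage can be swapped or improved independently, in contrast to a monolithic retrieval baseline or a single bank of hand-crafted rules.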
Yoshua Bengio - One of the best experts on this subject based on the ideXlab platform.
-
A Large-Scale, Open-Domain, Mixed-Interface, Dialogue-Based ITS for STEM
Artificial Intelligence in Education, 2020
Co-Authors: Iulian Vlad Serban, Varun Gupta, Ekaterina Kochmar, Robert Belfer, Joelle Pineau, Aaron Courville, Laurent Charlin, Yoshua Bengio
Abstract: We present Korbit, a large-scale, Open-Domain, mixed-interface, dialogue-based intelligent tutoring system (ITS). Korbit uses machine learning, natural language processing and reinforcement learning to provide interactive, personalized learning online. Korbit has been designed to easily scale to thousands of subjects, by automating, standardizing and simplifying the content creation process. Unlike other ITS, a teacher can develop new learning modules for Korbit in a matter of hours. To facilitate learning across a wide range of STEM subjects, Korbit uses a mixed-interface, which includes videos, interactive dialogue-based exercises, question-answering, conceptual diagrams, mathematical exercises and gamification elements. Korbit has been built to scale to millions of students, by utilizing a state-of-the-art cloud-based micro-service architecture. Korbit launched its first course in 2019, and over 7,000 students have since enrolled. Although Korbit was designed to be Open-Domain and highly scalable, A/B testing experiments with real-world students demonstrate that both student learning outcomes and student motivation are substantially improved compared to typical online courses.
-
A Large-Scale, Open-Domain, Mixed-Interface, Dialogue-Based ITS for STEM
arXiv: Computers and Society, 2020
Co-Authors: Iulian Vlad Serban, Varun Gupta, Ekaterina Kochmar, Robert Belfer, Joelle Pineau, Aaron Courville, Laurent Charlin, Yoshua Bengio
Abstract: We present Korbit, a large-scale, Open-Domain, mixed-interface, dialogue-based intelligent tutoring system (ITS). Korbit uses machine learning, natural language processing and reinforcement learning to provide interactive, personalized learning online. Korbit has been designed to easily scale to thousands of subjects, by automating, standardizing and simplifying the content creation process. Unlike other ITS, a teacher can develop new learning modules for Korbit in a matter of hours. To facilitate learning across a wide range of STEM subjects, Korbit uses a mixed-interface, which includes videos, interactive dialogue-based exercises, question-answering, conceptual diagrams, mathematical exercises and gamification elements. Korbit has been built to scale to millions of students, by utilizing a state-of-the-art cloud-based micro-service architecture. Korbit launched its first course in 2019 on machine learning, and since then over 7,000 students have enrolled. Although Korbit was designed to be Open-Domain and highly scalable, A/B testing experiments with real-world students demonstrate that both student learning outcomes and student motivation are substantially improved compared to typical online courses.
Oliver Lemon - One of the best experts on this subject based on the ideXlab platform.
-
It's Good to Chat? Evaluation and Design Guidelines for Combining Open-Domain Social Conversation with Task-Based Dialogue in Intelligent Buildings
Intelligent Virtual Agents, 2020
Co-Authors: Nancie Gunson, Weronika Sieinska, Christopher Walsh, Christian Dondrup, Oliver Lemon
Abstract: We present and evaluate a deployed conversational AI system that acts as a host of a working public building on a university campus. The system combines Open-Domain social chat with task-based conversation regarding navigation in the building, live resource updates (e.g. available computers), and events in the building. We investigated the impact of Open-Domain social chat on task completion and user preferences by comparing the combined system with a task-only version. We find that there is no significant difference in task completion or several aspects of user preference between the two systems, but that users would be significantly happier to talk to the task-only system in the future. This suggests that the "walk-up" public setting and workplace nature of the environment creates a markedly different use case to the in-home, and more individual and private "companion/assistant" setting which is commonly assumed for systems like Alexa. We discuss the implications for the design of conversational systems in other public settings.