Personal Assistant

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies

The Experts below are selected from a list of 360 Experts worldwide ranked by ideXlab platform

Lingjia Tang - One of the best experts on this subject based on the ideXlab platform.

  • Baymax: QoS Awareness and Increased Utilization for Non-Preemptive Accelerators in Warehouse Scale Computers
    Architectural Support for Programming Languages and Operating Systems, 2016
    Co-Authors: Quan Chen, Jason Mars, Hailong Yang, Lingjia Tang
    Abstract:

    Modern warehouse-scale computers (WSCs) are being outfitted with accelerators to provide the significant compute required by emerging intelligent Personal Assistant (IPA) workloads such as voice recognition, image classification, and natural language processing. It is well known that the diurnal user access pattern of user-facing services provides a strong incentive to co-locate applications for better accelerator utilization and efficiency, and prior work has focused on enabling co-location on multicore processors. However, interference when co-locating applications on non-preemptive accelerators is fundamentally different from contention on multicore CPUs and introduces a new set of challenges for reducing QoS violations. To address this open problem, we first identify the underlying causes of QoS violations in accelerator-outfitted servers. Our experiments show that queuing delay for the compute resources and PCI-e bandwidth contention during data transfer are the two main factors that contribute to the long tails of user-facing applications. We then present Baymax, a runtime system that orchestrates the execution of compute tasks from different applications and mitigates PCI-e bandwidth contention to deliver the required QoS for user-facing applications and increase accelerator utilization. Using DjiNN, a deep neural network service, Sirius, an end-to-end IPA workload, and traditional applications on an Nvidia K40 GPU, our evaluation shows that Baymax improves accelerator utilization by 91.3% while achieving the desired 99%-ile latency target for user-facing applications. In fact, Baymax reduces the 99%-ile latency of user-facing applications by up to 195x over default execution.
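    The core difficulty the abstract describes can be sketched as follows: on a non-preemptive accelerator, a launched kernel runs to completion, so the queuing delay seen by a latency-critical task is the sum of the predicted durations of everything ahead of it. A minimal admission-control sketch in that spirit (not the actual Baymax implementation; the task names, durations, and QoS target below are invented for illustration) might look like this:

    ```python
    # Hedged sketch: admit best-effort tasks onto a non-preemptive accelerator
    # only when doing so cannot push a user-facing task past its QoS target.
    # All durations and task names are illustrative assumptions.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Task:
        name: str
        duration_ms: float        # predicted execution time on the accelerator
        user_facing: bool = False

    @dataclass
    class AcceleratorQueue:
        """Models a non-preemptive accelerator: once a kernel starts it runs
        to completion, so queuing delay is the sum of durations ahead."""
        pending: List[Task] = field(default_factory=list)

        def queuing_delay_ms(self) -> float:
            return sum(t.duration_ms for t in self.pending)

        def admit(self, task: Task, qos_target_ms: float) -> bool:
            # User-facing tasks are always enqueued; a batch task is admitted
            # only if total queued work would still fit under the QoS target.
            if task.user_facing or self.queuing_delay_ms() + task.duration_ms <= qos_target_ms:
                self.pending.append(task)
                return True
            return False

    q = AcceleratorQueue()
    q.admit(Task("speech-recognition", 20.0, user_facing=True), qos_target_ms=50.0)
    ok = q.admit(Task("batch-training", 25.0), qos_target_ms=50.0)       # 20 + 25 <= 50: admitted
    blocked = q.admit(Task("batch-analytics", 10.0), qos_target_ms=50.0)  # 45 + 10 > 50: rejected
    print(ok, blocked)  # True False
    ```

    The same duration-prediction idea would also apply to PCI-e transfers, the second long-tail source the abstract identifies, by treating the bus as another non-preemptive queue.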

  • Sirius: An Open End-to-End Voice and Vision Personal Assistant and Its Implications for Future Warehouse-Scale Computers
    Architectural Support for Programming Languages and Operating Systems, 2015
    Co-Authors: Johann Hauswald, Michael A Laurenzano, Yunqi Zhang, Austin Rovinski, Arjun Khurana, Ronald G Dreslinski, Trevor Mudge, Vinicius Petrucci, Lingjia Tang, Jason Mars
    Abstract:

    As user demand scales for intelligent Personal Assistants (IPAs) such as Apple's Siri, Google's Google Now, and Microsoft's Cortana, we are approaching the computational limits of current datacenter architectures. It is an open question how future server architectures should evolve to enable this emerging class of applications, and the lack of an open-source IPA workload is an obstacle in addressing this question. In this paper, we present the design of Sirius, an open end-to-end IPA web-service application that accepts queries in the form of voice and images, and responds with natural language. We then use this workload to investigate the implications of four points in the design space of future accelerator-based server architectures spanning traditional CPUs, GPUs, manycore throughput co-processors, and FPGAs. To investigate future server designs for Sirius, we decompose Sirius into a suite of 7 benchmarks (Sirius Suite) comprising the computationally intensive bottlenecks of Sirius. We port Sirius Suite to a spectrum of accelerator platforms and use the performance and power trade-offs across these platforms to perform a total cost of ownership (TCO) analysis of various server design points. In our study, we find that accelerators are critical for the future scalability of IPA services. Our results show that GPU- and FPGA-accelerated servers improve the query latency on average by 10x and 16x, respectively. For a given throughput, GPU- and FPGA-accelerated servers can reduce the TCO of datacenters by 2.6x and 1.4x, respectively.
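    The shape of the TCO analysis the abstract describes can be illustrated with back-of-the-envelope arithmetic: for a fixed throughput target, a faster (accelerated) server handles more queries, so fewer servers are provisioned, and the TCO comparison is the ratio of fleet costs. The dollar figures and per-server throughputs below are made-up placeholders, not numbers from the paper; only the method is being shown.

    ```python
    # Hedged sketch of a fixed-throughput TCO comparison.
    # All costs and per-server QPS values are invented placeholders.
    import math

    def fleet_tco(target_qps: float, qps_per_server: float, cost_per_server: float) -> float:
        """Servers needed to meet the throughput target, times cost per server."""
        servers = math.ceil(target_qps / qps_per_server)
        return servers * cost_per_server

    # CPU baseline vs. a hypothetical GPU server that is ~10x faster but pricier.
    baseline = fleet_tco(10_000, qps_per_server=10, cost_per_server=5_000)
    gpu      = fleet_tco(10_000, qps_per_server=100, cost_per_server=12_000)
    print(baseline, gpu)  # 5000000 1200000
    ```

    With these placeholder numbers the accelerated fleet wins despite the higher per-server cost; the paper's measured 2.6x (GPU) and 1.4x (FPGA) reductions come from the same kind of calculation with real performance and power data.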

Jason Mars - One of the best experts on this subject based on the ideXlab platform.

  • Baymax: QoS Awareness and Increased Utilization for Non-Preemptive Accelerators in Warehouse Scale Computers
    Architectural Support for Programming Languages and Operating Systems, 2016
    Co-Authors: Quan Chen, Jason Mars, Hailong Yang, Lingjia Tang
    Abstract:

    Modern warehouse-scale computers (WSCs) are being outfitted with accelerators to provide the significant compute required by emerging intelligent Personal Assistant (IPA) workloads such as voice recognition, image classification, and natural language processing. It is well known that the diurnal user access pattern of user-facing services provides a strong incentive to co-locate applications for better accelerator utilization and efficiency, and prior work has focused on enabling co-location on multicore processors. However, interference when co-locating applications on non-preemptive accelerators is fundamentally different from contention on multicore CPUs and introduces a new set of challenges for reducing QoS violations. To address this open problem, we first identify the underlying causes of QoS violations in accelerator-outfitted servers. Our experiments show that queuing delay for the compute resources and PCI-e bandwidth contention during data transfer are the two main factors that contribute to the long tails of user-facing applications. We then present Baymax, a runtime system that orchestrates the execution of compute tasks from different applications and mitigates PCI-e bandwidth contention to deliver the required QoS for user-facing applications and increase accelerator utilization. Using DjiNN, a deep neural network service, Sirius, an end-to-end IPA workload, and traditional applications on an Nvidia K40 GPU, our evaluation shows that Baymax improves accelerator utilization by 91.3% while achieving the desired 99%-ile latency target for user-facing applications. In fact, Baymax reduces the 99%-ile latency of user-facing applications by up to 195x over default execution.

  • Sirius: An Open End-to-End Voice and Vision Personal Assistant and Its Implications for Future Warehouse-Scale Computers
    Architectural Support for Programming Languages and Operating Systems, 2015
    Co-Authors: Johann Hauswald, Michael A Laurenzano, Yunqi Zhang, Austin Rovinski, Arjun Khurana, Ronald G Dreslinski, Trevor Mudge, Vinicius Petrucci, Lingjia Tang, Jason Mars
    Abstract:

    As user demand scales for intelligent Personal Assistants (IPAs) such as Apple's Siri, Google's Google Now, and Microsoft's Cortana, we are approaching the computational limits of current datacenter architectures. It is an open question how future server architectures should evolve to enable this emerging class of applications, and the lack of an open-source IPA workload is an obstacle in addressing this question. In this paper, we present the design of Sirius, an open end-to-end IPA web-service application that accepts queries in the form of voice and images, and responds with natural language. We then use this workload to investigate the implications of four points in the design space of future accelerator-based server architectures spanning traditional CPUs, GPUs, manycore throughput co-processors, and FPGAs. To investigate future server designs for Sirius, we decompose Sirius into a suite of 7 benchmarks (Sirius Suite) comprising the computationally intensive bottlenecks of Sirius. We port Sirius Suite to a spectrum of accelerator platforms and use the performance and power trade-offs across these platforms to perform a total cost of ownership (TCO) analysis of various server design points. In our study, we find that accelerators are critical for the future scalability of IPA services. Our results show that GPU- and FPGA-accelerated servers improve the query latency on average by 10x and 16x, respectively. For a given throughput, GPU- and FPGA-accelerated servers can reduce the TCO of datacenters by 2.6x and 1.4x, respectively.

Melinda Gervasio - One of the best experts on this subject based on the ideXlab platform.

  • An Intelligent Personal Assistant for Task and Time Management
    AI Magazine, 2007
    Co-Authors: Karen Myers, Pauline Berry, Ken Conley, Jim Blythe, Melinda Gervasio
    Abstract:

    We describe an intelligent Personal Assistant that has been developed to aid a busy knowledge worker in managing time commitments and performing tasks. The design of the system was motivated by the complementary objectives of (1) relieving the user of routine tasks, thus allowing her to focus on tasks that critically require human problem-solving skills, and (2) intervening in situations where cognitive overload leads to oversights or mistakes by the user. The system draws on a diverse set of AI technologies that are linked within a Belief-Desire-Intention (BDI) agent system. Although the system provides a number of automated functions, the overall framework is highly user centric in its support for human needs, responsiveness to human inputs, and adaptivity to user working style and preferences.
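    The Belief-Desire-Intention architecture the abstract refers to can be sketched as a simple deliberation loop: the agent holds beliefs about the world, has desires (goals), and commits to intentions by selecting plans whose preconditions its beliefs satisfy. The beliefs, desires, and plan rules below are invented examples for illustration, not the system's actual knowledge base.

    ```python
    # Hedged sketch of a minimal BDI deliberation step.
    # Beliefs, desires, and the plan library are illustrative assumptions.

    beliefs = {"meeting_at_3pm": True, "report_due_today": True}
    desires = ["attend_meeting", "finish_report"]

    # Plan library: desire -> (precondition belief, action to intend)
    plans = {
        "attend_meeting": ("meeting_at_3pm", "block 3pm on calendar"),
        "finish_report": ("report_due_today", "remind user at 1pm"),
    }

    def deliberate(beliefs: dict, desires: list) -> list:
        """Commit to intentions: actions for desires whose preconditions hold."""
        intentions = []
        for desire in desires:
            precondition, action = plans[desire]
            if beliefs.get(precondition):
                intentions.append(action)
        return intentions

    print(deliberate(beliefs, desires))
    # ['block 3pm on calendar', 'remind user at 1pm']
    ```

    A real BDI system re-runs this loop as beliefs are updated by perception, which is how a time-management assistant can intervene when the user's situation changes.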

  • Deploying a Personalized Time Management Agent
    Adaptive Agents and Multi-Agents Systems, 2006
    Co-Authors: Pauline M Berry, Melinda Gervasio, Bart Peintner, Ken Conley, Tomas E Uribe, Neil Yorke-Smith
    Abstract:

    We report on our ongoing practical experience in designing, implementing, and deploying PTIME, a Personalized agent for time management and meeting scheduling in an open, multi-agent environment. In developing PTIME as part of a larger assistive agent called CALO, we have faced numerous challenges, including usability, multi-agent coordination, scalable constraint reasoning, robust execution, and unobtrusive learning. Our research advances basic solutions to the fundamental problems; however, integrating PTIME into a deployed system has raised other important issues for the successful adoption of new technology. As a Personal Assistant, PTIME must integrate easily into a user's real environment, support her normal workflow, respect her authority and privacy, provide natural user interfaces, and handle the issues that arise with deploying such a system in an open environment.

Karen Myers - One of the best experts on this subject based on the ideXlab platform.

  • An Intelligent Personal Assistant for Task and Time Management
    AI Magazine, 2007
    Co-Authors: Karen Myers, Pauline Berry, Ken Conley, Jim Blythe, Melinda Gervasio
    Abstract:

    We describe an intelligent Personal Assistant that has been developed to aid a busy knowledge worker in managing time commitments and performing tasks. The design of the system was motivated by the complementary objectives of (1) relieving the user of routine tasks, thus allowing her to focus on tasks that critically require human problem-solving skills, and (2) intervening in situations where cognitive overload leads to oversights or mistakes by the user. The system draws on a diverse set of AI technologies that are linked within a Belief-Desire-Intention (BDI) agent system. Although the system provides a number of automated functions, the overall framework is highly user centric in its support for human needs, responsiveness to human inputs, and adaptivity to user working style and preferences.

Quan Chen - One of the best experts on this subject based on the ideXlab platform.

  • Baymax: QoS Awareness and Increased Utilization for Non-Preemptive Accelerators in Warehouse Scale Computers
    Architectural Support for Programming Languages and Operating Systems, 2016
    Co-Authors: Quan Chen, Jason Mars, Hailong Yang, Lingjia Tang
    Abstract:

    Modern warehouse-scale computers (WSCs) are being outfitted with accelerators to provide the significant compute required by emerging intelligent Personal Assistant (IPA) workloads such as voice recognition, image classification, and natural language processing. It is well known that the diurnal user access pattern of user-facing services provides a strong incentive to co-locate applications for better accelerator utilization and efficiency, and prior work has focused on enabling co-location on multicore processors. However, interference when co-locating applications on non-preemptive accelerators is fundamentally different from contention on multicore CPUs and introduces a new set of challenges for reducing QoS violations. To address this open problem, we first identify the underlying causes of QoS violations in accelerator-outfitted servers. Our experiments show that queuing delay for the compute resources and PCI-e bandwidth contention during data transfer are the two main factors that contribute to the long tails of user-facing applications. We then present Baymax, a runtime system that orchestrates the execution of compute tasks from different applications and mitigates PCI-e bandwidth contention to deliver the required QoS for user-facing applications and increase accelerator utilization. Using DjiNN, a deep neural network service, Sirius, an end-to-end IPA workload, and traditional applications on an Nvidia K40 GPU, our evaluation shows that Baymax improves accelerator utilization by 91.3% while achieving the desired 99%-ile latency target for user-facing applications. In fact, Baymax reduces the 99%-ile latency of user-facing applications by up to 195x over default execution.