Action Policy

14,000,000 Leading Edge Experts on the ideXlab platform

Scan Science and Technology

Contact Leading Edge Experts & Companies


The Experts below are selected from a list of 5316 Experts worldwide, ranked by the ideXlab platform

Qingyang Chen - One of the best experts on this subject based on the ideXlab platform.

  • Self-Learning Cruise Control Using Kernel-Based Least Squares Policy Iteration
    IEEE Transactions on Control Systems Technology, 2014
    Co-Authors: Jian Wang, Xin Xu, Qingyang Chen
    Abstract:

    This paper presents a novel learning-based cruise controller for autonomous land vehicles (ALVs) with unknown dynamics and external disturbances. The learning controller consists of a time-varying proportional-integral (PI) module and an actor-critic learning control module with kernel machines. The learning objective for the cruise control is to make the vehicle's longitudinal velocity follow a smoothed spline-based speed profile with the smallest possible errors. The parameters in the PI module are adaptively tuned based on the vehicle's state and the Action Policy of the learning control module. Based on the state transition data of the vehicle controlled by various initial policies, the Action Policy of the learning control module is optimized offline by kernel-based least squares Policy iteration (KLSPI). The effectiveness of the proposed controller was tested on an ALV platform during long-distance driving in urban traffic and autonomous driving on off-road terrain. The experimental results of the cruise control show that the learning control method can realize data-driven controller design and optimization based on KLSPI and that the controller's performance is adaptive to different road conditions.
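To make the KLSPI step concrete, the following minimal Python sketch runs least-squares Policy iteration with Gaussian kernel features on an illustrative 1-D velocity-tracking problem. The dynamics, reward, kernel centers, and action set below are assumptions chosen for illustration, not the paper's ALV model or kernel sparsification scheme.

```python
import numpy as np

# Minimal sketch of least-squares Policy iteration with Gaussian kernel
# features, loosely following the KLSPI idea. The 1-D error dynamics
# (err' = err + 0.2*a + noise) and quadratic tracking cost are assumptions.

rng = np.random.default_rng(0)
ACTIONS = np.array([-1.0, 0.0, 1.0])      # acceleration commands (assumed)
CENTERS = np.linspace(-3.0, 3.0, 9)       # kernel centers over velocity error

def features(err, a_idx):
    """Kernel feature vector for one (state, action) pair, one block per action."""
    phi = np.zeros(len(CENTERS) * len(ACTIONS))
    block = np.exp(-(err - CENTERS) ** 2 / 0.5)
    phi[a_idx * len(CENTERS):(a_idx + 1) * len(CENTERS)] = block
    return phi

def greedy(w, err):
    return int(np.argmax([features(err, i) @ w for i in range(len(ACTIONS))]))

# Offline data: transitions collected under a random behaviour policy
samples = []
err = 2.0
for _ in range(2000):
    a_idx = rng.integers(len(ACTIONS))
    nxt = err + 0.2 * ACTIONS[a_idx] + 0.05 * rng.standard_normal()
    reward = -err ** 2                    # penalize squared tracking error
    samples.append((err, a_idx, reward, nxt))
    err = nxt if abs(nxt) < 3 else rng.uniform(-2, 2)

# Policy iteration: LSTD-Q evaluation followed by greedy improvement
gamma, dim = 0.9, len(CENTERS) * len(ACTIONS)
w = np.zeros(dim)
for _ in range(10):
    A = np.eye(dim) * 1e-3                # small ridge term for stability
    b = np.zeros(dim)
    for s, a_idx, r, s2 in samples:
        phi = features(s, a_idx)
        phi2 = features(s2, greedy(w, s2))
        A += np.outer(phi, phi - gamma * phi2)
        b += phi * r
    w = np.linalg.solve(A, b)

# The learned policy should decelerate when the error is positive
print(ACTIONS[greedy(w, 2.0)], ACTIONS[greedy(w, -2.0)])
```

The kernel feature blocks stand in for the kernel dictionary that KLSPI builds from data; the offline loop mirrors the paper's use of pre-collected state transitions rather than online interaction.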

Dusit Niyato - One of the best experts on this subject based on the ideXlab platform.

  • Cognitive Radio Network Throughput Maximization with Deep Reinforcement Learning
    2019 IEEE 90th Vehicular Technology Conference (VTC2019-Fall), 2019
    Co-Authors: Yang Zhang, Dusit Niyato
    Abstract:

    Radio Frequency powered Cognitive Radio Networks (RF-CRN) are likely to be the eyes and ears of upcoming modern networks such as the Internet of Things (IoT), requiring increased decentralization and autonomous operation. To be considered autonomous, the RF-powered network entities need to make decisions locally to maximize the network throughput under the uncertainty of any network environment. However, in complex and large-scale networks, the state and Action spaces are usually large, and existing tabular reinforcement learning techniques are unable to find the optimal state-Action Policy quickly. In this paper, deep reinforcement learning is proposed to overcome these shortcomings and allow a wireless gateway to derive an optimal Policy that maximizes network throughput. When benchmarked against advanced DQN techniques, our proposed DQN configuration offers a performance speedup of up to 1.8× with good overall performance.
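The idea of replacing a tabular learner with a learned Q-function can be sketched on a toy RF-powered node that each slot either harvests energy or transmits a packet. Everything below (battery levels, rewards, the two-layer network with manual SGD) is an illustrative assumption, not the paper's environment or DQN configuration:

```python
import numpy as np

# Toy sketch: an RF-powered node chooses HARVEST or TRANSMIT each slot.
# A tiny two-layer Q-network with manual gradients stands in for a deep
# Q-network; replay buffers and target networks are omitted for brevity.

rng = np.random.default_rng(1)
HARVEST, TRANSMIT = 0, 1

def step(battery, action):
    if action == HARVEST:
        return min(battery + 1, 3), 0.0   # charge one unit, no throughput
    if battery >= 2:
        return battery - 2, 1.0           # successful transmission
    return battery, -0.1                  # failed: not enough energy

def encode(battery):                      # one-hot battery level 0..3
    x = np.zeros(4); x[battery] = 1.0; return x

W1 = rng.normal(0, 0.1, (8, 4)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.1, (2, 8)); b2 = np.zeros(2)

def q_values(x):
    h = np.maximum(0, W1 @ x + b1)        # ReLU hidden layer
    return W2 @ h + b2, h

gamma, lr, eps = 0.95, 0.05, 0.2
battery = 0
for t in range(5000):
    x = encode(battery)
    q, h = q_values(x)
    a = rng.integers(2) if rng.random() < eps else int(np.argmax(q))
    nxt, r = step(battery, a)
    q2, _ = q_values(encode(nxt))
    # Semi-gradient of 0.5*(q[a] - target)^2 for the chosen action only
    delta = q[a] - (r + gamma * q2.max())
    gW2 = np.zeros_like(W2); gW2[a] = delta * h
    gb2 = np.zeros_like(b2); gb2[a] = delta
    dh = delta * W2[a] * (h > 0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * np.outer(dh, x); b1 -= lr * dh
    battery = nxt

# With a full battery the learned policy should prefer transmitting
print(int(np.argmax(q_values(encode(3))[0])))
```

With one-hot states this is barely beyond tabular; the payoff of the function approximator appears when the state space is too large to enumerate, which is the paper's motivation for moving to deep networks.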

Jian Wang - One of the best experts on this subject based on the ideXlab platform.

  • Self-Learning Cruise Control Using Kernel-Based Least Squares Policy Iteration
    IEEE Transactions on Control Systems Technology, 2014
    Co-Authors: Jian Wang, Xin Xu, Qingyang Chen
    Abstract:

    This paper presents a novel learning-based cruise controller for autonomous land vehicles (ALVs) with unknown dynamics and external disturbances. The learning controller consists of a time-varying proportional-integral (PI) module and an actor-critic learning control module with kernel machines. The learning objective for the cruise control is to make the vehicle's longitudinal velocity follow a smoothed spline-based speed profile with the smallest possible errors. The parameters in the PI module are adaptively tuned based on the vehicle's state and the Action Policy of the learning control module. Based on the state transition data of the vehicle controlled by various initial policies, the Action Policy of the learning control module is optimized by kernel-based least squares Policy iteration (KLSPI) in an offline way. The effectiveness of the proposed controller was tested on an ALV platform during long-distance driving in urban traffic and autonomous driving on off-road terrain. The experimental results of the cruise control show that the learning control method can realize data-driven controller design and optimization based on KLSPI and that the controller's performance is adaptive to different road conditions.

Qian Zhang - One of the best experts on this subject based on the ideXlab platform.

  • Reducing Electricity Cost of Smart Appliances via Energy Buffering Framework in Smart Grid
    IEEE Transactions on Parallel and Distributed Systems, 2012
    Co-Authors: Tao Jiang, Qian Zhang
    Abstract:

    To reduce the long-term electricity cost of smart appliances (SAs) with deferrable operation times in the smart grid, we propose a novel energy buffering framework that intelligently schedules distributed energy storage (DES) for the cost reduction of SAs. The proposed energy buffering framework determines the Action Policy (e.g., charging or discharging) and the power allocation Policy of the DES so as to provide DES power to the proper SAs at the proper time at a lower price than that of the utility grid, reducing the long-term financial cost of the SAs. Specifically, we first formulate the optimal decision problem in the energy buffering framework as a discounted-cost Markov decision process (MDP) over an infinite horizon. Then, we propose an optimal scheme for the energy buffering framework that solves the discounted-cost MDP with an online learning approach. Extensive simulation results show that the proposed optimal scheme can significantly reduce the long-term financial cost compared with the baseline schemes and the myopic scheme.
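The structure of such a discounted-cost MDP can be illustrated with a tiny charge/discharge model solved by value iteration; note the paper itself uses an online learning scheme, and the battery levels, two-price model, and unit demand here are assumptions for illustration only:

```python
import numpy as np

# Illustrative discounted-cost MDP: state = (storage level, current price),
# actions = buy from grid / charge storage / discharge storage. Solved by
# value iteration (an offline stand-in for the paper's online learning).

LEVELS = 4                       # storage levels 0..3 units (assumed)
PRICES = np.array([0.2, 1.0])    # low/high grid price, i.i.d. equally likely
DEMAND = 1                       # one unit of appliance demand per slot
gamma = 0.95

def cost_and_next(level, p_idx, action):
    price = PRICES[p_idx]
    if action == 1 and level < LEVELS - 1:
        return price * (DEMAND + 1), level + 1   # buy demand + 1 unit to store
    if action == 2 and level > 0:
        return 0.0, level - 1                    # serve demand from storage
    return price * DEMAND, level                 # buy demand from the grid

V = np.zeros((LEVELS, len(PRICES)))
for _ in range(500):
    newV = np.zeros_like(V)
    for lvl in range(LEVELS):
        for p in range(len(PRICES)):
            costs = []
            for a in range(3):
                c, nxt = cost_and_next(lvl, p, a)
                costs.append(c + gamma * V[nxt].mean())  # next price is i.i.d.
            newV[lvl, p] = min(costs)
    V = newV

def best_action(lvl, p):
    return int(np.argmin([cost_and_next(lvl, p, a)[0]
                          + gamma * V[cost_and_next(lvl, p, a)[1]].mean()
                          for a in range(3)]))

# Expected shape of the Action Policy: charge when the price is low,
# discharge stored energy when the price is high
print(best_action(1, 0), best_action(1, 1))
```

Even this small instance reproduces the buffering intuition: storage is filled at the low price and drained at the high one, which is exactly the gap the framework exploits to undercut the utility-grid price.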

Paulo M. Engel - One of the best experts on this subject based on the ideXlab platform.

  • Using the GTSOM network for mobile robot navigation with reinforcement learning
    2009 International Joint Conference on Neural Networks, 2009
    Co-Authors: Mauricio Menegaz, Paulo M. Engel
    Abstract:

    This paper describes a model for an autonomous robotic agent that is capable of mapping its environment, creating a state representation, and learning how to execute simple tasks using this representation. The multi-level architecture developed is composed of three parts. The execution level is responsible for interaction with the environment. The clustering level, which maps the input received from the sensor space into a compact representation, was implemented using a growing self-organizing neural network combined with a grid map. Finally, the planning level uses the Q-learning algorithm to learn the Action Policy needed to achieve the goal. The model was implemented in software and tested in an experiment that consists of finding a path through a maze. Results show that it can partition the state space in a meaningful and efficient way and learn how to execute the given task.
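The planning level alone can be sketched as tabular Q-learning on a small maze. Here the clustering level (the growing self-organizing network over sensor readings) is replaced by direct grid coordinates, and the maze layout, rewards, and learning constants are assumptions for illustration:

```python
import numpy as np

# Planning-level sketch: tabular Q-learning finds a path in a tiny maze.
# Direct (row, col) coordinates stand in for the SOM-clustered states.

MAZE = np.array([[0, 0, 0, 1],
                 [1, 1, 0, 1],
                 [0, 0, 0, 0],
                 [0, 1, 1, 0]])               # 1 = wall (assumed layout)
START, GOAL = (0, 0), (3, 3)
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

rng = np.random.default_rng(2)
Q = np.zeros((4, 4, 4))                      # Q[row, col, action]
alpha, gamma, eps = 0.5, 0.9, 0.2

def step(pos, a):
    r, c = pos[0] + MOVES[a][0], pos[1] + MOVES[a][1]
    if not (0 <= r < 4 and 0 <= c < 4) or MAZE[r, c]:
        return pos, -1.0                     # bumped a wall: stay put
    if (r, c) == GOAL:
        return (r, c), 10.0
    return (r, c), -0.1                      # step cost favors short paths

for episode in range(500):
    pos = START
    for _ in range(50):
        a = rng.integers(4) if rng.random() < eps else int(np.argmax(Q[pos]))
        nxt, reward = step(pos, a)
        Q[pos][a] += alpha * (reward + gamma * Q[nxt].max() - Q[pos][a])
        pos = nxt
        if pos == GOAL:
            break

# Greedy rollout of the learned Action Policy
pos, path = START, [START]
while pos != GOAL and len(path) < 20:
    pos, _ = step(pos, int(np.argmax(Q[pos])))
    path.append(pos)
print(path)
```

In the paper's architecture the key difference is that the Q-table is indexed by the compact states discovered by the growing network, so the planner never sees raw sensor space directly.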