Costate Equation

The experts below are selected from a list of 30 experts worldwide, ranked by the ideXlab platform.

Henri P. Gavin - One of the best experts on this subject based on the ideXlab platform.

  • Approximate solutions to nonlinearly-constrained optimal control problems
    Proceedings of the 2011 American Control Conference, 2011
    Co-Authors: Philip S. Harvey, Henri P. Gavin
    Abstract:

    The use of variational methods in optimal control problems involves solving a two-point boundary-value problem (for states and costates) and satisfying an optimality condition. For problems with quadratic integral cost, linear state dynamics, and unconstrained controls, the costate equations are also linear. Adjoining control constraints to the objective function introduces nonlinearity into the costate equation, and iterative numerical methods are required to converge on the optimal control trajectory; the nonlinear costate terms arise at the times when the control constraints are active. The numerical methodology proposed in this paper converges on an approximately optimal solution from a feasible, suboptimal initial control trajectory. In each iteration the control trajectory moves toward the unconstrained optimum while remaining feasible, and the state and costate equations remain linear. The method is applied to a multi-input system that minimizes the response of a vibration isolation system by adjusting only the damping characteristics of a variable damping device.
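
    For reference, the linear-quadratic case described above can be written out explicitly. This is the standard unconstrained derivation, not anything specific to the paper: with dynamics \dot{x} = A x + B u, cost J = \tfrac{1}{2}\int_0^T (x^\top Q x + u^\top R u)\,dt, and Hamiltonian H = \tfrac{1}{2}(x^\top Q x + u^\top R u) + \lambda^\top (A x + B u), the necessary conditions are

        \dot{x} = \partial H / \partial \lambda = A x + B u
        \dot{\lambda} = -\partial H / \partial x = -Q x - A^\top \lambda
        0 = \partial H / \partial u = R u + B^\top \lambda \;\implies\; u^* = -R^{-1} B^\top \lambda

    Both the state and costate equations are linear in (x, \lambda). Adjoining an active control constraint to H adds a multiplier term whose presence depends on the trajectory itself, which is the source of the nonlinearity the paper's iteration works around.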

Philip S. Harvey - One of the best experts on this subject based on the ideXlab platform.

Liu San-yang - One of the best experts on this subject based on the ideXlab platform.

Douglas Tweed - One of the best experts on this subject based on the ideXlab platform.

  • Lagrange policy gradient.
    arXiv: Learning, 2017
    Co-Authors: Bita Behrouzi, Douglas Tweed
    Abstract:

    Most algorithms for reinforcement learning work by estimating action-value functions. Here we present a method that uses Lagrange multipliers, the costate equation, and multilayer neural networks to compute policy gradients. We show that this method can find solutions to time-optimal control problems, driving linear mechanical systems quickly to a target configuration. On these tasks its performance is comparable to that of deep deterministic policy gradient, a recent action-value method.
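
    To make the costate-based gradient concrete, here is a minimal Python sketch under simplifying assumptions: a discretized double integrator, a quadratic running cost, and a linear policy u = -Kx in place of the paper's multilayer network. The dynamics, cost weights, and horizon are illustrative, and this is not the authors' implementation; it only shows how a backward costate (adjoint) pass yields an exact policy gradient.

        import numpy as np

        # Discretized double integrator: x = [position, velocity], scalar control u.
        h = 0.01                                # time step (illustrative)
        A = np.array([[0.0, 1.0],
                      [0.0, 0.0]])
        B = np.array([[0.0],
                      [1.0]])
        Q = np.eye(2)                           # state cost weight (assumed)
        R = 0.1 * np.eye(1)                     # control cost weight (assumed)
        T = 500                                 # horizon in steps
        x0 = np.array([1.0, 0.0])               # start away from the origin

        def cost_and_gradient(K):
            """Simulate u = -K x forward, then run the discrete costate
            (adjoint) recursion backward to get J and dJ/dK."""
            xs = [x0]
            for _ in range(T):
                x = xs[-1]
                xs.append(x + h * (A @ x - B @ (K @ x)))
            M = np.eye(2) + h * (A - B @ K)     # Jacobian of x_{t+1} w.r.t. x_t
            lam = np.zeros(2)                   # terminal costate (no terminal cost)
            grad = np.zeros_like(K)
            J = 0.0
            for t in reversed(range(T)):
                x = xs[t]
                J += h * (x @ Q @ x + x @ K.T @ R @ K @ x)   # running cost, u = -Kx
                # Direct dependence of cost and dynamics on K; lam holds lambda_{t+1}.
                grad += 2.0 * h * (R @ K @ np.outer(x, x)) - h * np.outer(B.T @ lam, x)
                # Costate recursion: lambda_t = d(cost_t)/dx_t + M^T lambda_{t+1}.
                lam = 2.0 * h * (Q + K.T @ R @ K) @ x + M.T @ lam
            return J, grad

        # One gradient step on the policy gains; repeat to improve the policy.
        K = np.array([[0.5, 0.5]])
        J, dJdK = cost_and_gradient(K)
        K = K - 0.01 * dJdK

    A finite-difference check on dJdK confirms the recursion; with a neural-network policy the same backward pass would run through the network's Jacobians instead of K.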

  • Costate-focused models for reinforcement learning
    arXiv: Learning, 2017
    Co-Authors: Bita Behrouzi, Douglas Tweed
    Abstract:

    Many recent algorithms for reinforcement learning are model-free and founded on the Bellman equation. Here we present a method founded on the costate equation and models of the state dynamics. We use the costate -- the gradient of cost with respect to state -- to improve the policy and also to "focus" the model, training it to detect and mimic those features of the environment that are most relevant to its task. We show that this method can handle difficult time-optimal control problems, driving deterministic or stochastic mechanical systems quickly to a target. On these tasks it works well compared to deep deterministic policy gradient, a recent Bellman method. And because it creates a model, the costate method can also learn from mental practice.
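
    The recursion behind this is the standard discrete-time adjoint; writing \hat{f}_\theta for a learned dynamics model and \ell for the per-step cost (notation assumed here, not taken from the paper), the costate \lambda_t \equiv \partial J / \partial x_t satisfies

        \lambda_t = \frac{\partial \ell(x_t, u_t)}{\partial x_t} + \left( \frac{\partial \hat{f}_\theta(x_t, u_t)}{\partial x_t} \right)^{\top} \lambda_{t+1}

    The policy improvement signal then flows through the control channel, \partial \ell / \partial u_t + (\partial \hat{f}_\theta / \partial u_t)^{\top} \lambda_{t+1}, so the model only needs to be accurate in the directions the costate picks out -- consistent with the "focusing" described in the abstract.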

Li Bingjie - One of the best experts on this subject based on the ideXlab platform.