RL Prescribed Performance · Fault-Tolerant Control


UNDER REVIEW · IEEE TCYBE co-author with G. Narayanan et al. · KNU


A reinforcement-learning-based neuro-optimal control scheme for robot manipulators under composite actuator faults. It guarantees prescribed-performance, predefined-time tracking via a filtered performance function and an actor–critic–identifier framework, and it is robust to total or partial loss of effectiveness and to abrupt joint faults.

Reinforcement Learning · Neuro-Optimal Control · Fault Tolerance · Prescribed Performance Function · Predefined-Time Tracking · Hamilton–Jacobi–Bellman · Actor–Critic–Identifier · 2-Link Manipulator

TL;DR

Robotic manipulators in long-term, high-precision operation are vulnerable to actuator faults — total loss of effectiveness (TLOE) and partial loss of effectiveness (PLOE). This paper develops an RL-based prescribed-performance neuro-optimal fault-tolerant controller that:

  • Guarantees predefined-time tracking independent of initial conditions

  • Compensates for abrupt shifts in joint dynamics via a robust mechanism

  • Minimizes Hamilton–Jacobi–Bellman (HJB) objective functions using an actor–critic–identifier RL framework

  • Is validated on a 2-link manipulator through comparative simulation studies

Problem

Robotic manipulators under unknown nonlinear dynamics, time-varying disturbances, and joint-level actuator/system faults face three combined challenges:

  • Reliability — long-term operation degrades actuator behavior

  • Robustness — PLOE and TLOE silently erode tracking

  • Precision — high-precision tasks have no slack for fault-induced errors

Most existing fault-tolerant control (FTC) schemes either have settling times that depend on the initial conditions or lack explicit uncertainty handling for composite faults.
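To make the two fault types concrete, here is a minimal sketch of a multiplicative loss-of-effectiveness model, a form commonly used in the FTC literature. It is an assumption for illustration and not necessarily the fault model used in the paper; the class `ActuatorFault`, the helper `delivered_torques`, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActuatorFault:
    """Hypothetical loss-of-effectiveness fault on a single joint actuator."""

    onset: float          # time in seconds at which the fault appears
    effectiveness: float  # 1.0 healthy, strictly between 0 and 1 PLOE, 0.0 TLOE

    def __post_init__(self):
        if not 0.0 <= self.effectiveness <= 1.0:
            raise ValueError("effectiveness must lie in [0, 1]")

    def mode(self) -> str:
        """Name the fault type implied by the effectiveness factor."""
        if self.effectiveness == 0.0:
            return "TLOE"
        if self.effectiveness < 1.0:
            return "PLOE"
        return "healthy"

    def apply(self, t: float, u_cmd: float) -> float:
        """Return the torque actually delivered for a commanded torque."""
        # Before the onset time the actuator is assumed fully healthy.
        rho = self.effectiveness if t >= self.onset else 1.0
        return rho * u_cmd


def delivered_torques(t, u_cmd, faults):
    """Apply one fault per joint to a list of commanded torques."""
    return [fault.apply(t, u) for fault, u in zip(faults, u_cmd)]


if __name__ == "__main__":
    # Assumed scenario: joint 1 suffers PLOE at 2 s, joint 2 suffers TLOE at 5 s.
    faults = [ActuatorFault(onset=2.0, effectiveness=0.4),
              ActuatorFault(onset=5.0, effectiveness=0.0)]
    for t in (1.0, 3.0, 6.0):
        print(t, delivered_torques(t, [10.0, 5.0], faults))
```

A composite fault sequence in this sketch is simply a different onset time and effectiveness factor per joint; the controller sees only the commanded torque and the resulting tracking error.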

Key Contributions

  1. Composite-fault model: the control law explicitly accounts for actuator TLOE and PLOE and compensates for abrupt shifts in joint dynamics.

  2. PPF with a filtered variable: a prescribed performance function (PPF) augmented with a filtered variable enables predefined-time tracking. An error transformation converts the constrained tracking problem into an unconstrained one. Unlike prior work, both the PPF initial condition and the transformation parameter are independent of the initial tracking error.

  3. Reliable control mechanism: reduces the impact of actuator faults while compensating for neural-network approximation errors. HJB-associated objective functions are minimized through an RL-based identifier–critic–actor framework.

  4. Empirical validation: simulations on an actual two-link manipulator model show that the proposed strategy outperforms existing baselines.
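The idea behind the second contribution can be sketched in a few lines. The polynomial envelope, the shifting ramp used here as a stand-in for the filtered variable, and the `atanh` transformation are all assumptions chosen for illustration; the paper's exact functions are not given on this page, and every name and default value below is hypothetical.

```python
import math


def performance_bound(t, rho0=1.0, rho_inf=0.05, t_pre=2.0, k=2):
    """Envelope that shrinks from rho0 to rho_inf exactly at t_pre, then stays."""
    if t >= t_pre:
        return rho_inf
    return (rho0 - rho_inf) * ((t_pre - t) / t_pre) ** k + rho_inf


def shift(t, t_shift=0.5, k=2):
    """Smooth ramp from 0 at t = 0 to 1 at t = t_shift (assumed form)."""
    if t >= t_shift:
        return 1.0
    return 1.0 - ((t_shift - t) / t_shift) ** k


def transformed_error(t, e, **ppf_kwargs):
    """Map the constrained, shifted error to an unconstrained variable."""
    # The shifted error is zero at t = 0 for any initial tracking error e(0),
    # so it always starts inside the envelope, whatever rho0 is.
    e_shifted = shift(t) * e
    ratio = e_shifted / performance_bound(t, **ppf_kwargs)
    if abs(ratio) >= 1.0:
        raise ValueError("shifted error left the prescribed envelope")
    # atanh maps (-1, 1) onto the whole real line, which removes the constraint.
    return math.atanh(ratio)


if __name__ == "__main__":
    # A large initial error is still admissible at t = 0 ...
    print(transformed_error(0.0, 100.0))
    # ... and after t_pre the bound on the shifted error is a bound on e itself.
    print(transformed_error(3.0, 0.01))
```

In this sketch the envelope parameters never need to be retuned for a new initial error, which is the property the contribution claims. Keeping the transformed variable bounded for all time is the controller's job and is not shown here.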

Method

The control scheme combines four key pieces:

  • Neuro-optimal control law: a neural-network-parameterized policy derived from an approximation of the optimal value function, which solves the HJB equation.

  • Prescribed Performance Function (PPF): keeps the tracking error inside user-defined envelopes over the entire trajectory.

  • Filtered variable: decouples the PPF design from the initial tracking error, so the settling time can be predefined regardless of initial conditions.

  • Actor–Critic–Identifier (ACI) RL: three neural networks trained jointly. The actor produces the control action, the critic estimates value-function residuals, and the identifier learns the unknown system dynamics and faults online.
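For readers unfamiliar with HJB-based optimal control, the following is the generic textbook form that actor–critic–identifier schemes typically build on. It assumes control-affine error dynamics and a quadratic cost; the symbols $z$, $F$, $G$, $Q$, $R$, the basis functions $\sigma$, $\phi$, and the weight estimates $\hat{W}_f$, $\hat{W}_c$, $\hat{W}_a$ are illustrative and may differ from the paper's formulation.

```latex
% Assumed control-affine error dynamics and quadratic cost (illustrative only)
\dot{z} = F(z) + G(z)\,u, \qquad
V(z(t)) = \int_{t}^{\infty} \bigl( z^{\top} Q z + u^{\top} R u \bigr)\, d\tau

% HJB equation and the control that minimizes its right-hand side
0 = \min_{u} \Bigl[ z^{\top} Q z + u^{\top} R u
    + \nabla V^{*}(z)^{\top} \bigl( F(z) + G(z)\,u \bigr) \Bigr],
\qquad
u^{*} = -\tfrac{1}{2} R^{-1} G(z)^{\top} \nabla V^{*}(z)

% Neural-network approximations: identifier, critic, actor
\hat{F}(z) = \hat{W}_{f}^{\top} \sigma(z), \qquad
\hat{V}(z) = \hat{W}_{c}^{\top} \phi(z), \qquad
\hat{u}(z) = -\tfrac{1}{2} R^{-1} G(z)^{\top} \nabla\phi(z)^{\top} \hat{W}_{a}

% Bellman residual that the critic and actor updates drive toward zero
\delta = z^{\top} Q z + \hat{u}^{\top} R \hat{u}
    + \nabla \hat{V}(z)^{\top} \bigl( \hat{F}(z) + G(z)\,\hat{u} \bigr)
```

The expression for $u^{*}$ follows from setting the derivative of the bracketed term with respect to $u$ to zero, $2Ru + G(z)^{\top}\nabla V^{*}(z) = 0$. Because $F$ is unknown and changes under faults, the identifier's estimate $\hat{F}$ stands in for it in the residual.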

Results

Validated on a 2-link manipulator under multiple fault scenarios:

  • Normal operation — matches/beats baselines while learning the dynamics online

  • PLOE faults — maintains tracking within prescribed bounds

  • TLOE faults — recovers gracefully, no catastrophic drift

  • Composite fault sequences — robust to transitions between fault modes

My Contribution

Co-author with G. Narayanan, Sangtae Ahn, and Sangmoon Lee at the School of Electronic & Electrical Engineering, Kyungpook National University. I contributed to the RL-FTC architecture design, simulation infrastructure on the 2-link manipulator model, comparative study setup, and result analysis across normal / PLOE / TLOE / composite-fault scenarios.