Underactuated Robotics MIT 6.832
Schedule: http://underactuated.csail.mit.edu/Spring2020/schedule.html
Text: http://underactuated.csail.mit.edu/index.html
Lectures (2020): https://www.youtube.com/playlist?list=PLkx8KyIQkMfX1WpWYqtep7TOmboZeDtev
Lectures (2019): https://www.youtube.com/playlist?list=PLkx8KyIQkMfVG-tWyV3CcQbon0Mh5zYaj
Lecture: https://youtu.be/lMhs3TbjMl0
Text: http://underactuated.mit.edu/acrobot.html
- deriving equations of motion for 3 "canonically underactuated systems" (acrobot, cart-pole, quadrotor)
- can we linearize the nonlinear dynamics at a fixed point and use LQR?
- controllability: A control system is called controllable if it is possible to construct an unconstrained input signal which will move the system from any initial state to any final state in a finite interval of time
- underactuated does not mean non-controllable!
- eigenvalue analysis of the linearization for the pendulum
- phase portrait of the linear system approximates the true phase portrait extremely well (locally)
- demo: simulations using LQR on the cart-pole and the quadrotor; stabilizes the system very well
- half of the problem still remains: how do we get the system to the area around the fixed point if we don't start there?
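The LQR balancing idea above can be sketched numerically. This is a minimal example, not the course's code: a pendulum linearized about its upright fixed point (illustrative parameter values), with the Riccati equation solved via SciPy.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Pendulum parameters (illustrative values, not from the course)
m, l, g, b = 1.0, 1.0, 9.81, 0.1

# Linearization about the upright fixed point theta = pi:
# x = [theta - pi, theta_dot],  x_dot = A x + B u
A = np.array([[0.0, 1.0],
              [g / l, -b / (m * l**2)]])
B = np.array([[0.0],
              [1.0 / (m * l**2)]])

Q = np.eye(2)          # state cost
R = np.array([[1.0]])  # input cost

# Solve the continuous-time algebraic Riccati equation for S,
# then form the optimal feedback gain u = -K x
S = solve_continuous_are(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ S

# All closed-loop eigenvalues should have negative real part
print(np.linalg.eigvals(A - B @ K))
```

The same recipe applies to the cart-pole and quadrotor demos: linearize at the fixed point, solve the Riccati equation, and feed back `-K x` near that point.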
Good overview of linear systems and eigenvalue analysis:
Lecture: https://youtu.be/sL5KRBjrk6I
Text: http://underactuated.mit.edu/acrobot.html
- energy shaping controller for the simple pendulum
- controller is attractive but not stable
- intuition: E will be less than E_desired until the pendulum reaches the upright fixed point, so u will be proportional to theta_dot. This means it will swing to one side until theta_dot = 0, then pump energy in through the control torque in the direction of increasing theta_dot as it swings back down.
- after it reaches the fixed point, if it is perturbed it will swing all the way down and around the other side to get back to the fixed point. This is not the desired behaviour of a stable system, so we need LQR to balance when we're in the vicinity of the fixed point
- use collocated partial feedback linearization for the cart-pole to simplify the dynamics. Then, design an energy shaping controller to stabilize the pendulum, with added PD terms to stabilize the cart itself (still switch to LQR at the top)
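The energy-shaping intuition above can be sketched for the simple pendulum. This is an illustrative simulation, not the course's code: the gain `k`, the parameter values, and the semi-implicit Euler integrator are all assumptions.

```python
import numpy as np

# Energy-shaping swing-up for the frictionless simple pendulum (sketch;
# parameters are illustrative, not taken from the course notes)
m, l, g = 1.0, 1.0, 9.81
k = 0.1                             # energy-shaping gain (assumed)

def energy(theta, theta_dot):
    # Total energy with theta = 0 at the bottom
    return 0.5 * m * l**2 * theta_dot**2 - m * g * l * np.cos(theta)

E_desired = energy(np.pi, 0.0)      # energy of the upright fixed point

def control(theta, theta_dot):
    # u is proportional to theta_dot and the energy error, so
    # E_dot = u * theta_dot = -k * theta_dot^2 * (E - E_desired):
    # energy is pumped in until E reaches E_desired
    return -k * theta_dot * (energy(theta, theta_dot) - E_desired)

# Semi-implicit Euler simulation, starting near the bottom
theta, theta_dot, dt = 0.1, 0.0, 1e-3
for _ in range(20000):
    u = control(theta, theta_dot)
    theta_ddot = (u - m * g * l * np.sin(theta)) / (m * l**2)
    theta_dot += dt * theta_ddot
    theta += dt * theta_dot

print(abs(energy(theta, theta_dot) - E_desired))  # energy error shrinks
```

As the notes say, this regulates energy rather than state, so a switch to LQR near the top is still needed to actually balance.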
Lecture: https://youtu.be/p0Dd0pUjpc8
Text: http://underactuated.mit.edu/lyapunov.html#section1
- how do we prove stability for nonlinear systems in general? Up to now we've just been looking at the 2D phase plots. We need a more rigorous method.
- example with the damped pendulum: looking at the terms for energy and its derivative, it is clear that energy will never increase. When the derivative is zero, only the fixed points make up an invariant set. So as time goes to infinity, the system will come to rest at a fixed point.
- Lyapunov's Direct Method: come up with a function V(x) that satisfies certain properties to prove stability
- intuition for delta/epsilon definition of stability in the sense of Lyapunov: choose a sufficiently small delta so that the sublevel-set of V(x) for the largest value that V(x) takes in the delta ball is completely contained in the epsilon ball
- LaSalle's theorem: lets us prove asymptotic stability even for systems where V(x) is only negative-semidefinite (like the pendulum)
- compare Lyapunov definition with HJB equation: HJB requires solving a complicated PDE, Lyapunov just requires satisfying an inequality (much easier). Also, if we have an optimal cost-to-go function, that also serves as a Lyapunov function if the cost function is positive-semidefinite
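The damped-pendulum example above can be checked numerically. This sketch (illustrative parameters) uses total energy as the Lyapunov candidate and verifies that its time derivative is never positive:

```python
import numpy as np

# Damped pendulum: m*l^2*theta_ddot = -b*theta_dot - m*g*l*sin(theta).
# Candidate Lyapunov function V = total energy (shifted so V(0,0) = 0);
# along trajectories V_dot = -b * theta_dot^2 <= 0.
m, l, g, b = 1.0, 1.0, 9.81, 0.5   # illustrative parameters

def V(theta, theta_dot):
    return 0.5 * m * l**2 * theta_dot**2 + m * g * l * (1 - np.cos(theta))

def V_dot(theta, theta_dot):
    # Chain rule along the dynamics
    theta_ddot = (-b * theta_dot - m * g * l * np.sin(theta)) / (m * l**2)
    return (m * g * l * np.sin(theta)) * theta_dot \
         + (m * l**2 * theta_dot) * theta_ddot

# Sample many states and confirm V_dot <= 0 everywhere (up to roundoff)
rng = np.random.default_rng(0)
thetas = rng.uniform(-np.pi, np.pi, 1000)
theta_dots = rng.uniform(-10, 10, 1000)
print(np.all(V_dot(thetas, theta_dots) <= 1e-9))
```

Since V_dot is only zero when theta_dot = 0, this is exactly the negative-semidefinite case where LaSalle's theorem is needed to conclude convergence to the fixed points.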
Lecture: https://youtu.be/G0wL4pTjp7s
Text: http://underactuated.mit.edu/lyapunov.html#section2
- how do we verify that the Lyapunov conditions hold for all x?
- and how would we even find a Lyapunov function in the first place for an arbitrary nonlinear system?
- idea: just sample a lot of points
- hard to generalize; let's try optimization instead
- for linear systems, can specify the problem using semi-definite programming (SDP), although it's overkill since for linear systems we can just check eigenvalues
- we can extend this idea from quadratic functions using SDP to polynomial functions using sum-of-squares optimization (SOS). This will let us work with more general nonlinear systems
- important to note that there is a gap between the SOS polynomials and the positive polynomials, but this is still a very useful approach in practice
- this approach is actually quite recent (~20 years)
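For the linear case, the SDP feasibility problem mentioned above can be solved in closed form via the Lyapunov equation. A small sketch (illustrative `A` matrix, not from the course):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# For linear dynamics x_dot = A x, a quadratic candidate V(x) = x^T P x
# certifies asymptotic stability iff A^T P + P A = -Q has a positive-
# definite solution P for some Q > 0. This is the feasibility problem
# an SDP would solve; here it reduces to a linear matrix equation.
A = np.array([[-1.0,  2.0],
              [ 0.0, -3.0]])   # stable: eigenvalues -1, -3
Q = np.eye(2)

# SciPy solves  a X + X a^T = q,  so pass a = A^T and q = -Q
P = solve_continuous_lyapunov(A.T, -Q)

# P positive definite => V(x) = x^T P x is a valid Lyapunov function
print(np.linalg.eigvalsh(P))
```

SOS optimization generalizes exactly this: instead of a quadratic form `x^T P x`, the candidate is a polynomial whose coefficients are constrained so that it (and minus its derivative) decompose as sums of squares.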
Lecture: https://youtu.be/fwvaXfl9fFM
Text: http://underactuated.mit.edu/lyapunov.html#section2
- we've seen how to find Lyapunov functions using SOS optimization for polynomials, but what about the other nonlinear systems we've looked at? Aren't they non-polynomial?
- it turns out that pretty much all rigid body dynamics are polynomial (the exception being helical joints)
- in practice, proving global stability is often not possible (or even desired)
- using the S-procedure (similar to Lagrange multipliers), we can express a different SOS problem to prove stability for a region
- we can abstract it to search for the largest sublevel set instead of just verifying one
- we can also parameterize the dynamics to verify a minimum region of attraction of a range of uncertainty (robustness)
- if we use the cost-to-go function, J(x), from LQR, we can now find the region around the fixed point where the LQR controller stabilizes the nonlinear system. We would then know exactly when to switch from the energy shaping controller to the LQR controller for the pendulum, for example.
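The region-of-attraction idea can be sketched without an SOS solver by falling back on sampling (the earlier "just sample a lot of points" idea, so no certificate, unlike SOS). This is an illustrative stand-in: take V(x) = x^T S x from LQR about the upright pendulum and shrink the sublevel set until V_dot < 0 at every sample inside it.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# LQR about the upright pendulum, x = [theta - pi, theta_dot]
# (illustrative parameters, not from the course)
m, l, g, b = 1.0, 1.0, 9.81, 0.1
A = np.array([[0.0, 1.0], [g / l, -b / (m * l**2)]])
B = np.array([[0.0], [1.0 / (m * l**2)]])
S = solve_continuous_are(A, B, np.eye(2), np.array([[1.0]]))
K = B.T @ S   # with R = I, the LQR gain is K = B^T S

def V_dot(x):
    # d/dt (x^T S x) along the NONLINEAR closed-loop dynamics
    u = (-K @ x).item()
    theta_ddot = (u - b * x[1] + m * g * l * np.sin(x[0])) / (m * l**2)
    xdot = np.array([x[1], theta_ddot])
    return 2.0 * x @ S @ xdot

# Shrink rho until V_dot < 0 at every sampled state with V(x) <= rho
rng = np.random.default_rng(0)
samples = rng.uniform(-2, 2, size=(5000, 2))
rho = 10.0
for x in samples:
    if x @ S @ x <= rho and V_dot(x) >= 0:
        rho = 0.99 * (x @ S @ x)
print(rho)   # estimated size of the verified sublevel set
```

The S-procedure/SOS version replaces the sampling loop with a single convex program that proves V_dot < 0 on the whole sublevel set, not just at samples.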
Lecture: https://youtu.be/ZTizHbj339w
Text: http://underactuated.mit.edu/trajopt.html
- Dynamic programming and LQR both aim to find a global "policy" over all of state space. If we restrict ourselves to searching for a single trajectory from an initial condition we can make things easier
- Direct transcription: treat x(t) and u(t) both as decision variables, subject to constraints on our dynamics (and possibly other constraints)
- Direct shooting: only treat u(t) as a decision variable. Fewer variables, but potentially worse numerically
- For linear systems these will be convex optimization problems. For nonlinear they are nonconvex.
- For direct transcription, we don't actually need to integrate. Direct collocation allows us to get a solution with fewer function evaluations.
- Direct collocation: treat x(t) and u(t) as decision variables again, but make u(t) piecewise first-order (linear) and x(t) piecewise third-order (cubic). Then the only dynamics constraint is that the derivative of each cubic spline at its midpoint matches the dynamics there
- Nonlinear programming solvers: SNOPT, IPOPT, https://alphaville.github.io/optimization-engine/, https://jump.dev/
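Direct transcription can be sketched with a generic NLP solver. This is a minimal illustrative example (double integrator q_ddot = u as a stand-in for the course's systems; knot count, time step, and cost are assumptions): states and inputs are decision variables, and Euler integration defects are equality constraints.

```python
import numpy as np
from scipy.optimize import minimize

# Direct transcription for the double integrator q_ddot = u.
# Decision variables: x[k] = [q, q_dot] at N knots, u[k] at N-1 knots.
# For this linear system the problem is convex, as noted above.
N, h = 21, 0.05                      # knot points and time step (assumed)

def unpack(z):
    x = z[: 2 * N].reshape(N, 2)
    u = z[2 * N :]
    return x, u

def cost(z):
    _, u = unpack(z)
    return h * np.sum(u**2)          # integrated control effort

def defects(z):
    x, u = unpack(z)
    # Euler defects: x[k+1] - x[k] - h * f(x[k], u[k]) = 0
    f = np.column_stack([x[:-1, 1], u])
    return (x[1:] - x[:-1] - h * f).ravel()

def boundary(z):
    x, _ = unpack(z)
    # start at rest at q = 0, end at rest at q = 1
    return np.concatenate([x[0] - [0.0, 0.0], x[-1] - [1.0, 0.0]])

z0 = np.zeros(2 * N + (N - 1))
res = minimize(cost, z0, method="SLSQP",
               constraints=[{"type": "eq", "fun": defects},
                            {"type": "eq", "fun": boundary}])
x_opt, u_opt = unpack(res.x)
print(res.success, x_opt[-1])
```

Direct collocation would replace the Euler defects with midpoint constraints on cubic state splines, cutting the number of dynamics evaluations needed for the same accuracy.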