Control System Techniques Visualizer

Learn how various control methods keep an unstable inverted pendulum system balanced.

Simulation parameters — Cart: 2 kg · Pendulum: 1 kg · Rod: 1.5 m · Gravity: 9.81 m/s² · Friction: 1.5 N·s/m
PID Control

How PID Works

F = Kp·θ + Ki·∫θdt + Kd·dθ/dt

P (Proportional) — Force proportional to angle. Like a spring pulling toward upright. Alone it oscillates.
I (Integral) — Force proportional to accumulated error over time. Fixes persistent offset but can cause slow oscillation (windup).
D (Derivative) — Force proportional to rate of change. Like a damper that resists fast motion. Kills overshoot.

Tuning recipe: Start with Ki=Kd=0. Raise Kp until oscillating. Raise Kd to stop oscillation. Add Ki only if there is persistent offset.

Limitation: PID is SISO (Single Input, Single Output) — it only sees angle error. The cart drifts freely because PID has no position feedback. For multi-variable control, use LQR.
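The PID law above fits in a few lines. Here is a sketch on a toy linearized pendulum (θ̈ = ω²·θ − F, one state pair, no cart); the instability constant, initial tilt, and step size are illustrative, not the simulator's model:

```python
def simulate_pid(kp=60.0, ki=8.0, kd=20.0, dt=0.01, steps=3000):
    """PID on a toy unstable plant: theta_ddot = w2*theta - F.
    w2 and the 0.2 rad initial tilt are illustrative values."""
    theta, theta_dot, integral = 0.2, 0.0, 0.0   # start tilted 0.2 rad
    w2 = 6.5                                     # instability: errors grow on their own
    for _ in range(steps):
        integral += theta * dt
        force = kp * theta + ki * integral + kd * theta_dot  # F = Kp·θ + Ki·∫θ + Kd·θ̇
        theta_dot += (w2 * theta - force) * dt   # Euler integration step
        theta += theta_dot * dt
    return theta

final_theta = simulate_pid()   # settles near upright (theta close to 0)
```

Setting ki = kd = 0 in the same loop reproduces the tuning recipe's starting point: pure P control oscillates instead of settling.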
LQR (Optimal)

How LQR Works

F = −K₁·x − K₂·ẋ + K₃·θ + K₄·θ̇

LQR (Linear-Quadratic Regulator) uses all 4 state variables simultaneously. You set Q (how much to penalize state errors) and R (how much to penalize force), and the Riccati equation computes the mathematically optimal gains K.

Qθ — Higher = more aggressive angle correction.
Qx — Higher = keeps cart centered (unlike PID which ignores cart position).
R — Higher = gentler, more energy-efficient control.
Fmax — Hard force limit. LQR clamps after computing, unlike MPC which plans within limits.

"Optimal" means: no other linear controller can achieve lower cost J = ∫(x'Qx + u'Ru)dt for this Q and R. The catch: assumes perfect state measurement and linear dynamics (linearized near θ=0).
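To see how Q and R trade off, the scalar case can be worked end to end: for a one-state plant ẋ = a·x + b·u (not the 4-state pendulum), the Riccati equation collapses to a quadratic you can solve by hand. A minimal sketch:

```python
import math

def lqr_scalar(a, b, q, r):
    """Scalar LQR: for xdot = a*x + b*u and J = ∫(q·x² + r·u²)dt, the
    algebraic Riccati equation 2·a·p − (b·p)²/r + q = 0 has one positive
    root p, and the optimal gain is K = b·p/r."""
    p = r * (a + math.sqrt(a * a + q * b * b / r)) / (b * b)
    return b * p / r

K_soft = lqr_scalar(a=2.0, b=1.0, q=1.0, r=1.0)     # mild state penalty
K_hard = lqr_scalar(a=2.0, b=1.0, q=100.0, r=1.0)   # aggressive state penalty
# Closed-loop pole: a − b·K = −sqrt(a² + q·b²/r), stable for any q, r > 0;
# raising q (or lowering r) pushes it further left: faster, more forceful.
```

The closed-loop pole expression makes the "optimal and always stable" claim concrete: the square root is strictly larger than |a|, so the loop is stable no matter how unstable the plant.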
LQR (Kalman Filter)

How Kalman Filter + LQR Works

Problem: Sensors are noisy. LQR needs clean state data.
Solution: The Kalman filter estimates the true state by blending model predictions with noisy measurements.

Sensor σ — How noisy the simulated sensors are. High = measurements jump wildly.
Process σ — Model uncertainty. High = the filter trusts its model less, so measurements dominate the blend.

EKF (Extended Kalman Filter) — Linearizes at each step using the Jacobian. Fast, works for small angles.
UKF (Unscented Kalman Filter) — Passes sigma points through real nonlinear equations. More accurate at large angles, 2-3× the cost.

The filter outputs a clean state estimate → LQR controls that estimate. Push the pendulum hard and watch the green estimate line lag behind the blue true state — that lag is the cost of noisy sensors.
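The blend step is easiest to see in one dimension. Below is a minimal scalar Kalman filter sketch (constant-state model, not the app's 4-state EKF); q and r play the role of the Process σ and Sensor σ sliders, squared:

```python
import random

def kalman_scalar(measurements, q=0.01 ** 2, r=0.10 ** 2):
    """Scalar Kalman filter: blend a 'nothing changed' model prediction
    with each noisy measurement, weighted by their variances."""
    x, p = 0.0, 1.0                 # state estimate and its variance
    for z in measurements:
        p = p + q                   # predict: model uncertainty grows by q
        k = p / (p + r)             # gain: how much to trust this measurement
        x = x + k * (z - x)         # update: move toward z by fraction k
        p = (1 - k) * p             # uncertainty shrinks after the update
    return x

random.seed(0)
true_theta = 0.3                    # pretend the pendulum sits at 0.3 rad
noisy = [true_theta + random.gauss(0, 0.10) for _ in range(200)]
estimate = kalman_scalar(noisy)     # settles close to 0.3 despite the noise
```

Raising q in this sketch makes k larger, so the estimate chases each noisy reading; that is the Process σ slider's behavior in miniature.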
LQR (Particle Filter)

How Particle Filter + LQR Works

Idea: Instead of one Gaussian estimate (Kalman), use hundreds of weighted samples (particles).

Particles — More = smoother estimate but more computation.
Sensor/Process σ — Same role as Kalman tab.
Neff — Effective sample size. When it drops, one particle dominates (degeneracy). Filter resamples to fix this.

The scatter plot shows particles in θ vs θ̇ space. Tight cluster = confident. Spread = uncertain. Unlike Kalman, PF can represent multi-modal beliefs (two possible locations simultaneously).

LQR controls the weighted average of all particles. Same controller, different estimator.
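One reweight-and-resample cycle can be sketched in a few lines (scalar state, Gaussian likelihood; the resampling threshold and σ are illustrative):

```python
import math
import random

def pf_update(particles, weights, z, sensor_sigma=0.12):
    """Bootstrap particle filter measurement update: reweight every particle
    by how well it explains measurement z, then resample if Neff collapses."""
    w = [wi * math.exp(-0.5 * ((z - xi) / sensor_sigma) ** 2)
         for xi, wi in zip(particles, weights)]
    total = sum(w)
    w = [wi / total for wi in w]
    neff = 1.0 / sum(wi * wi for wi in w)        # effective sample size
    if neff < len(particles) / 2:                # degeneracy: resample
        particles = random.choices(particles, weights=w, k=len(particles))
        w = [1.0 / len(particles)] * len(particles)
    return particles, w, neff

random.seed(1)
pts = [random.uniform(-1.0, 1.0) for _ in range(200)]   # broad initial belief
wts = [1.0 / 200] * 200
for z in (0.30, 0.28, 0.31, 0.29):                      # noisy angle readings
    pts, wts, neff = pf_update(pts, wts, z)
estimate = sum(x * w for x, w in zip(pts, wts))         # weighted mean → LQR
```

After a few consistent measurements, the surviving particles cluster near the true value, which is exactly the "tight cluster = confident" behavior in the scatter plot.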
Bang-Bang (Sliding Mode)

How Bang-Bang Works

s = θ̇ + λ·θ + X_gain·x
F = F_max · sign(s)

Fmax — Maximum force. Always applied at full strength, one direction or the other.
λ (lambda) — How much angle contributes vs velocity. High = sensitive to small tilts.
Deadband — Quiet zone near s=0 where F=0. Prevents chattering. Set to 0 to see raw switching.
X gain — Adds cart position to sliding surface so cart returns to center.

The force graph shows the signature pattern: solid blocks of +F and −F with rapid switching. No proportional response — just full throttle one way or the other. Extremely robust to model uncertainty but wastes energy and wears actuators.
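The entire controller fits in a few lines, which is the point of sliding mode: there is nothing beyond the switch. A sketch with illustrative defaults mirroring the sliders above:

```python
def bang_bang(theta, theta_dot, x, f_max=4.0, lam=2.0, x_gain=0.5, deadband=0.02):
    """Sliding-mode switch: full force one way or the other, decided only by
    which side of the sliding surface s = 0 the state sits on."""
    s = theta_dot + lam * theta + x_gain * x     # sliding surface
    if abs(s) < deadband:                        # quiet zone: prevents chattering
        return 0.0
    return f_max if s > 0 else -f_max

# Tilted right, not moving, cart centered: full push one way.
# A tiny tilt inside the deadband: no force at all.
```

There is no proportional middle ground, so the force history is the solid ±F blocks described above.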
Model Predictive Control (MPC)

How MPC Works

Each frame: generate N candidate force sequences → simulate each over H steps into the future → score by cost function → apply only the first force of the best sequence → repeat.

Horizon — How many steps ahead to plan. Longer = better plans, more computation.
Wθ, Wx, WF — Cost weights for angle error, cart drift, and force effort. Same role as Q,R in LQR but evaluated over a trajectory, not just one instant.
Candidates — Number of random sequences tested. More = better chance of finding a good plan.

The planned force sequence is shown as colored bars below the pendulum. Purple dots on the pendulum show where the best plan predicts the tip will be.

MPC naturally handles constraints (Fmax baked into generation), works with full nonlinear physics (no linearization), and can plan around known future events.
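The plan-score-apply loop can be sketched as random shooting on a toy linearized pendulum (θ̈ = ω²·θ − F; the plant constant, dt, and cost weights are illustrative stand-ins for the sliders, and cart position is dropped for brevity):

```python
import random

def mpc_step(theta, theta_dot, horizon=12, candidates=120,
             f_max=80.0, w_theta=100.0, w_f=0.5, dt=0.02):
    """One MPC frame by random shooting: sample candidate force sequences,
    roll each forward through the model, keep the cheapest, and return
    only its first force."""
    w2 = 6.5                                   # toy instability constant
    best_cost, best_first = float("inf"), 0.0
    for _ in range(candidates):
        seq = [random.uniform(-f_max, f_max) for _ in range(horizon)]
        th, thd, cost = theta, theta_dot, 0.0
        for f in seq:                          # simulate H steps ahead
            thd += (w2 * th - f) * dt
            th += thd * dt
            cost += w_theta * th * th + w_f * f * f
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first                          # apply, then replan next frame

random.seed(2)
force = mpc_step(theta=0.2, theta_dot=0.0)
```

Note how Fmax is baked into the candidate generation itself: the returned force can never exceed the limit, which is the constraint-handling advantage over LQR's clamp-after-computing.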
Fuzzy Logic Controller

How Fuzzy Logic Works

Step 1 — Fuzzify: Convert crisp θ and θ̇ into partial memberships across 5 categories (NB, NS, ZE, PS, PB: Negative Big, Negative Small, Zero, Positive Small, Positive Big) using triangular functions.
Step 2 — Rules: A 5×5 table maps every (θ category, θ̇ category) pair to a force category. Multiple rules fire simultaneously with different strengths.
Step 3 — Defuzzify: Blend all active rule outputs using their fire strengths as weights. The centroid gives a single crisp force value.

θ width / θ̇ width — Controls how wide the membership triangles are. Narrow = sharp switching (like bang-bang). Wide = gentle blending (like PID).
X gain — Adds cart position bias so cart returns to center.

No equations, no model — just expert intuition encoded as IF-THEN rules. The rule table and membership plots above update in real time.
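A one-input slice of the three steps (θ only, so a 5-entry rule list instead of the full 5×5 θ/θ̇ table; the set centers and output forces are illustrative):

```python
def tri(x, center, width):
    """Triangular membership: 1 at the center, falling to 0 at center ± width."""
    return max(0.0, 1.0 - abs(x - center) / width)

def fuzzy_force(theta, theta_width=0.15, f_max=100.0):
    """Fuzzify theta into NB..PB, fire one rule per set, defuzzify by the
    weighted average of the rules' output forces."""
    centers = [-2 * theta_width, -theta_width, 0.0, theta_width, 2 * theta_width]
    outputs = [-f_max, -f_max / 2, 0.0, f_max / 2, f_max]     # rule consequents
    fire = [tri(theta, c, theta_width) for c in centers]      # steps 1 + 2
    total = sum(fire)
    if total == 0.0:                                          # beyond all sets
        return f_max if theta > 0 else -f_max
    return sum(s * o for s, o in zip(fire, outputs)) / total  # step 3

# fuzzy_force(0.0) is 0; small tilts blend ZE and PS; large tilts saturate.
```

Shrinking theta_width here reproduces the slider behavior described above: narrow triangles barely overlap, so the output jumps between rule forces like bang-bang; wide triangles overlap heavily and blend smoothly.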
Dynamic Inversion (Feedback Linearization): Uses knowledge of the nonlinear dynamics to mathematically cancel the nonlinearity, then applies a simple linear controller to the resulting linear system. Works at any angle — not just near θ=0.

How Dynamic Inversion Works

The idea: The pendulum dynamics have sin(θ) and cos(θ) terms that make them nonlinear. LQR approximates these away (sin θ ≈ θ). Dynamic inversion cancels them exactly.

Step 1 — Compute desired acceleration:
θ̈_desired = −2·ζ·ωn·θ̇ − ωn²·θ − Xgain·x
This is a simple linear "outer loop" — a virtual spring-damper system pulling the pendulum to θ=0. You choose ωn (natural frequency = speed of response) and ζ (damping ratio = overshoot control).

Step 2 — Invert the physics to find the force:
Given the desired θ̈, solve Newton's equations backwards for F. The nonlinear terms (sin θ, cos θ, θ̇² coupling) appear in this inversion and get exactly cancelled.

F = [LL·dn·θ̈_desired + g·sin(θ)·(Mc+Mm) + Mm·LL·θ̇²·sin(θ)] / cos(θ)
where dn = Mc + Mm − Mm·cos²(θ)

ωn (natural frequency) — How fast the linearized system responds. Higher = snappier but more aggressive. Like tightening a spring.
ζ (zeta, damping ratio) — 0.7 = one small overshoot (ideal). 1.0 = no overshoot but slower. 0.3 = ringy and oscillatory.
X gain — Adds cart centering to the desired dynamics.
Fmax — Hard force limit. When the inversion demands more force than available, performance degrades (saturation).

The superpower: Works at ANY angle, not just near θ=0. Push it to 40° and it still computes the exact corrective force. LQR breaks down here because its linearization is wrong.

The weakness: Requires a perfect model. If the real mass or friction differs from what the inversion assumes, the cancellation is imperfect and residual nonlinearity leaks through. Try changing the Friction slider — the inversion uses the displayed friction value, so it stays accurate. In the real world, you'd never know friction this precisely.
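Both steps in code, using the simulator's displayed parameters (Mc = 2 kg, Mm = 1 kg, LL = 1.5 m, g = 9.81 m/s²). This is a direct transcription of the formula above, valid while cos θ ≠ 0:

```python
import math

def inversion_force(theta, theta_dot, x, wn=8.0, zeta=0.70, x_gain=2.0,
                    Mc=2.0, Mm=1.0, LL=1.5, g=9.81):
    """Feedback linearization: choose linear desired dynamics, then invert
    the nonlinear model so the sin/cos terms cancel exactly."""
    # Step 1 — desired acceleration from the virtual spring-damper
    theta_dd = -2.0 * zeta * wn * theta_dot - wn * wn * theta - x_gain * x
    # Step 2 — invert the cart-pendulum physics for F
    dn = Mc + Mm - Mm * math.cos(theta) ** 2
    return (LL * dn * theta_dd
            + g * math.sin(theta) * (Mc + Mm)
            + Mm * LL * theta_dot ** 2 * math.sin(theta)) / math.cos(theta)

# The same call works at 0.7 rad (40°): no small-angle assumption anywhere.
```

The model-sensitivity weakness is visible in the signature: Mc, Mm, and LL are arguments. Pass in values that differ from the true plant and the cancellation becomes approximate.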
MRAC (Model Reference Adaptive Control): The only controller here that changes its own gains during operation. It defines a reference model ("I want the pendulum to behave like this"), then adapts its gains in real time to make reality match, even if system parameters change.

How MRAC Works

The only adaptive controller in this simulator. Every other tab has fixed logic — if the pendulum mass secretly doubled, they'd just perform worse. MRAC notices and compensates.

Reference model: A simple second-order system defined by ωm and ζm. This is how you want the pendulum to respond — the ideal behavior.
θ̈_model = −2·ζm·ωm·θ̇_model − ωm²·θ_model

Tracking error: e = θ − θ_model. The gap between reality and the ideal.

Adaptive gains: Three gains (Kp, Kd, Ki) that update every frame using the MIT rule:
K̇p = γ·e·θ   K̇d = γ·e·θ̇   K̇i = γ·e·∫θ
When tracking error e is positive and θ is positive, Kp increases (push harder). When e shrinks to zero, the gains stop changing. They converge to whatever values make reality match the reference model.

γ (gamma, learning rate): How fast gains adapt. High γ = adapts quickly but can overshoot and oscillate. Low γ = stable but slow to respond to changes. Like a thermostat sensitivity — too sensitive and it hunts, too sluggish and the room drifts.

Watch the gain graph (bottom-left): you can see Kp, Kd, Ki evolving in real time. Push the pendulum and watch the gains spike as MRAC compensates. Change the Friction slider and watch the gains gradually shift to new values.

Used in: Fighter jets (handling battle damage), spacecraft (fuel mass decreasing), wind turbines (changing wind conditions), and any system where parameters drift over time.
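One frame of the adaptation step as a sketch (only the gain update is shown; the plant and reference-model integration are omitted, and γ and dt are illustrative):

```python
def mit_rule_update(kp, kd, ki, e, theta, theta_dot, theta_int,
                    gamma=1.5, dt=0.02):
    """MIT-rule gain update: each gain drifts in the direction that shrinks
    the tracking error e = theta - theta_model, at a rate set by gamma."""
    kp += gamma * e * theta * dt
    kd += gamma * e * theta_dot * dt
    ki += gamma * e * theta_int * dt
    return kp, kd, ki

# Reality lags the reference model (e > 0) while tilted right (theta > 0):
kp, kd, ki = mit_rule_update(60.0, 20.0, 8.0, e=0.05, theta=0.1,
                             theta_dot=0.0, theta_int=0.0)
# kp is nudged up (push harder); kd and ki are unchanged this frame.
```

The γ trade-off is literal in the code: a large gamma multiplies every per-frame nudge, so the gains react fast but can overshoot and hunt.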
You be the controller! Arrow keys or buttons.

How Your Brain Works (as a controller)

Sensor: Your eyes — roughly 30 Hz update rate, good spatial resolution but ~80 ms of visual processing delay.
Controller: Your brain — pattern recognition, prediction, learned reflexes. Total reaction time ~200 ms.
Actuator: Your arm/hand — strong but imprecise. You can't apply exactly 42.7 N.

The open-loop poles above show why this is hard: the pendulum has an unstable pole that doubles disturbances every ~0.3 s. Your ~200 ms delay means the error has nearly doubled before you even start responding. Try it — the pendulum is harder for you to balance than for any of the automated controllers, because you're fighting latency.
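The ~0.3 s doubling time falls out of a back-of-envelope computation (point-mass pendulum approximation; the simulator's exact pole also depends on the cart mass):

```python
import math

g, L = 9.81, 1.5            # the simulator's gravity and rod length
p = math.sqrt(g / L)        # unstable pole of the linearized pendulum, rad/s
t_double = math.log(2) / p  # a disturbance grows like e^(p*t), doubling every ln2/p
# t_double ≈ 0.27 s — against a ~200 ms human reaction time, almost no margin.
```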