Multi-Agent Simulation · Propulsion Validated · Monte Carlo Campaign

T-MINUS: Propulsive Landing Protocol

Real-time multi-agent mission-control simulator orchestrating 8 LLM-driven console operators under a Flight Director command loop — with first-principles propulsive landing physics underneath.

The Problem

Propulsive landing (a reusable rocket coming down on its tail, optionally onto an actively station-keeping droneship) collapses eight classically decoupled engineering problems into one tightly coupled real-time loop: guidance, control, structures, combustion stability, weather / sea-state, thermal margins, recovery hardware, and comms latency to the platform. A modern flight team handles this by partitioning across console operators (Propulsion, GNC, Structures, Weather, Thermal, Recovery, Inspection, ASDS) under a Flight Director. The interesting question is not "can we simulate the rocket"; that's solved physics. It's "can we simulate the operators": distributed agents reasoning under partial information, surfacing anomalies, coordinating timing, and converging on a HOLD / RESUME / SCRUB / GO-NO-GO call.

I built T-MINUS to find out how far an LLM-orchestrated mission-control loop can be pushed against first-principles physics — and whether the conversations the agents produce are useful artifacts, not theatre.

What I Built

Eight LLM-driven console operators, each with its own state, telemetry feed, decision scope, and customer-tuned personality, run concurrently under a Flight Director command loop. The Flight Director can issue HOLD / RESUME / SCRUB / GO-NO-GO at any cadence; the operators speak when their console state changes or when a monitor escalates a condition. Underneath them, a deterministic physics-and-systems substrate runs the actual flight:

  • G-FOLD landing guidance (lossless convexification) for the powered descent: a convex relaxation of the minimum-fuel / minimum-error landing problem, so a solver can reach the optimum deterministically.
  • Combustion-instability modeling with chugging (CHUG), first-longitudinal (L1), and first-tangential (T1) modes — chamber dynamics that drive injector-coking thermal margins and turbine bearing-vibration spectra.
  • Grid-fin aeroelastic flutter on Al-Li 2195 skin under transonic loading, with structural margins that the Structures console actually reports against.
  • JONSWAP irregular sea-state droneship dynamics — the ASDS is not a static target; it heaves and rolls under a realistic sea spectrum, and Recovery / GNC must coordinate against it.
  • Emergent failure cascades across IMU, GPS, RCS, and radar-altimeter parts: failures are not scripted; they arise from the part models under load.
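As an illustration of the sea-state item in the list above, here is a minimal sketch of the standard JONSWAP spectral density in plain Python. The function names, the default `alpha`, and the integration band are assumptions made for this example; they are not taken from the T-MINUS code, which may parameterize the spectrum differently (for instance by significant wave height and peak period).

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def jonswap(omega, omega_p, alpha=0.0081, gamma=3.3):
    """Return the JONSWAP spectral density S(omega) in m^2*s/rad.

    omega    angular wave frequency, rad/s (must be > 0)
    omega_p  peak angular frequency, rad/s
    alpha    scale constant (illustrative default, an assumption here)
    gamma    peak-enhancement factor (3.3 is the commonly quoted mean)
    """
    if omega <= 0.0:
        raise ValueError("omega must be positive")
    # Spectral width differs below and above the peak.
    sigma = 0.07 if omega <= omega_p else 0.09
    # Pierson-Moskowitz-shaped base spectrum.
    base = alpha * G**2 / omega**5 * math.exp(-1.25 * (omega_p / omega) ** 4)
    # Peak enhancement: gamma raised to a Gaussian bump centred on omega_p.
    r = math.exp(-((omega - omega_p) ** 2) / (2.0 * sigma**2 * omega_p**2))
    return base * gamma**r


def significant_wave_height(omega_p, n=2000, **kwargs):
    """Estimate Hs = 4*sqrt(m0) with a midpoint-rule integral over a finite band."""
    lo, hi = 0.3 * omega_p, 6.0 * omega_p  # assumed band; tails are truncated
    d = (hi - lo) / n
    m0 = sum(jonswap(lo + (i + 0.5) * d, omega_p, **kwargs) for i in range(n)) * d
    return 4.0 * math.sqrt(m0)
```

A droneship heave model would typically sample wave components from this spectrum with random phases; that step is omitted here to keep the sketch short.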

Core Principle

"The monitor decides WHEN, the LLM decides WHAT and HOW."

That sentence is the load-bearing architectural decision. The monitor — a small, deterministic state machine watching telemetry — is responsible for when a console agent gets invoked, what subset of state it sees, and what prompt frame it receives. The LLM is responsible only for the what and how of the response: anomaly framing, recommendation phrasing, escalation choice. This separation keeps timing deterministic, keeps LLM cost bounded, and keeps the agents from hallucinating themselves into the wrong moment of the flight.
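The split described above can be sketched as a small deterministic gate in front of a model call. Everything named here (`ConsoleMonitor`, `run_tick`, the channel names, the single upper-limit rule) is hypothetical and chosen for illustration; the real monitor is presumably richer. The point is only that the model is never called unless the gate fires, and it only ever sees the slice the gate hands it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ConsoleMonitor:
    """Deterministic gate: decides WHEN an agent runs and WHAT state it sees."""
    console: str
    channels: List[str]        # telemetry keys this console is allowed to see
    limits: Dict[str, float]   # hypothetical upper limits per channel
    in_violation: bool = field(default=False, init=False)

    def step(self, telemetry: Dict[str, float]) -> Optional[Dict[str, float]]:
        """Return a telemetry slice on a state change, else None (no LLM call)."""
        violated = any(telemetry.get(ch, 0.0) > lim for ch, lim in self.limits.items())
        changed = violated != self.in_violation
        self.in_violation = violated
        if not changed:
            return None
        return {ch: telemetry[ch] for ch in self.channels if ch in telemetry}


def run_tick(monitor: ConsoleMonitor,
             telemetry: Dict[str, float],
             agent: Callable[[str], str]) -> Optional[str]:
    """Invoke the agent (any prompt -> text callable) only when the monitor fires."""
    slice_ = monitor.step(telemetry)
    if slice_ is None:
        return None
    state = "EXCEEDED" if monitor.in_violation else "CLEARED"
    prompt = f"[{monitor.console}] limit {state}; telemetry={slice_}"
    return agent(prompt)
```

Because `agent` is just a callable, a stub can stand in for the LLM in tests, and the number of model calls per flight is bounded by the number of monitor state changes.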

Validation: 8-Flight Monte Carlo Campaign

T-MINUS was validated through a campaign of eight full flights across five distinct mission profiles, with customer-specific operator personalities (e.g., a NASA crew mission's Propulsion console runs more conservatively than a Starlink stack's). The campaign made 448 LLM calls in total (56 per flight on average) and earned an average grade of A− against an internal rubric scoring anomaly-detection accuracy, escalation timeliness, and call quality under HOLD / SCRUB pressure.

  • 8 LLM-driven console operators
  • 8 Monte Carlo flights
  • 448 LLM calls validated
  • A− average campaign grade
  • 5 mission profiles

Mission Profiles

Each profile drives different operator personalities, different go-criteria, and different acceptable risk envelopes:

Profile             Driving Constraint
STARLINK            High-cadence commodity launches; tolerate marginal aborts in favor of throughput.
NASA_CREW           Human-rated; conservative scrub bias; emphasis on margin and traceability.
USSF_CLASSIFIED     Restricted disclosure paths; operator commentary stays inside need-to-know.
COMMERCIAL_GEO      High-energy trajectory; tighter thermal / structural margins on ascent.
UNIVERSITY_CUBESAT  Low-cost; primary launches; operator persona is leaner, more pedagogical.

Selected Technical Detail

Console operator pipeline

Each operator receives (a) a narrow telemetry slice routed by the monitor, (b) a rolling state digest of its own console (open items, pending acknowledgements, flag history), and (c) a customer-specific persona shaping its phrasing and risk tolerance. Its output is a typed call (e.g., NOMINAL, WATCH, HOLD-RECOMMEND, NO-GO) plus a natural-language justification that lands on the Flight Director's panel.
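One way to enforce the "typed call plus justification" contract described above is to validate the model's reply before it reaches the Flight Director's panel. The sketch below assumes the model is asked to reply in JSON with `call` and `justification` keys, and that a malformed reply degrades to WATCH; both the wire format and that fallback policy are assumptions made for illustration, not details from the project.

```python
import json
from dataclasses import dataclass
from enum import Enum


class Call(Enum):
    NOMINAL = "NOMINAL"
    WATCH = "WATCH"
    HOLD_RECOMMEND = "HOLD-RECOMMEND"
    NO_GO = "NO-GO"


@dataclass(frozen=True)
class ConsoleReport:
    console: str
    call: Call
    justification: str


def parse_report(console: str, raw: str) -> ConsoleReport:
    """Validate a model reply; anything malformed degrades to WATCH (assumed policy)."""
    try:
        data = json.loads(raw)           # raises ValueError on invalid JSON
        call = Call(data["call"])        # raises ValueError on an unknown call
        text = str(data["justification"]).strip()
        if not text:
            raise ValueError("empty justification")
        return ConsoleReport(console, call, text)
    except (ValueError, KeyError, TypeError) as exc:
        return ConsoleReport(console, Call.WATCH, f"unparseable reply: {exc}")
```

Example usage: `parse_report("GNC", '{"call": "HOLD-RECOMMEND", "justification": "Radar altimeter dropout."}')` yields a report whose `call` is `Call.HOLD_RECOMMEND`, while free-form text such as `"looks fine to me"` is downgraded to a WATCH entry rather than silently accepted.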

Failure-cascade model

Hardware failures are emergent, not scripted. Each modeled part has a degradation process — IMU bias walk, GPS-receiver thermal drift, RCS thruster duty-cycle wear, radar-altimeter sea-spike clutter — and the failures interact (e.g., RCS wear shows up first as an attitude-rate error that the IMU's noise floor partially masks). This is the part the operator agents actually have to diagnose; the monitor only knows there's a fault somewhere downstream.
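The masking effect in the RCS / IMU example can be illustrated with a toy model: a slowly growing attitude-rate error observed through Gaussian IMU noise, with a windowed-mean threshold detector. The linear wear ramp, the detector, and all parameter names and values are assumptions made for this sketch; they stand in for, and are much simpler than, the part models described above.

```python
import random


def first_detection_step(wear_rate, noise_sigma, threshold_sigmas=3.0,
                         window=20, steps=5000, seed=0):
    """Return the first step at which the windowed mean of the measured rate
    error exceeds the detection threshold, or None if it never does."""
    rng = random.Random(seed)
    # Threshold scaled for the mean of `window` independent Gaussian samples.
    limit = threshold_sigmas * noise_sigma / window ** 0.5
    recent = []
    for k in range(steps):
        true_error = wear_rate * k  # RCS wear: slowly growing rate error (toy ramp)
        measured = true_error + rng.gauss(0.0, noise_sigma)  # seen through IMU noise
        recent.append(measured)
        if len(recent) > window:
            recent.pop(0)
        if len(recent) == window and sum(recent) / window > limit:
            return k
    return None


# Same wear process, two hypothetical IMU noise floors.
quiet = first_detection_step(wear_rate=1e-4, noise_sigma=0.01)
noisy = first_detection_step(wear_rate=1e-4, noise_sigma=0.2)
```

With the same wear rate, the higher noise floor raises the detection threshold, so the fault has to grow further before it becomes visible. That delay is the kind of ambiguity the operator agents are left to reason about.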

Flight Director command loop

The Flight Director is the human (or, in unattended runs, a separate scripted operator) who issues HOLD / RESUME / SCRUB / GO-NO-GO. The command loop is the only mechanism that can override an operator's recommendation. This keeps the agents subordinate (they recommend, the FD decides), which is the right shape of authority for a mission-control architecture.
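The authority structure described above can be sketched as a small state machine in which operators may only append recommendations and only the Flight Director can change the count state. The phase names, the transition table, and the `"FLIGHT_DIRECTOR"` issuer string are hypothetical, and the GO-NO-GO poll is left out for brevity.

```python
from enum import Enum


class Phase(Enum):
    COUNTING = "COUNTING"
    HOLDING = "HOLDING"
    SCRUBBED = "SCRUBBED"


# Allowed (phase, command) -> next phase; anything else is rejected.
TRANSITIONS = {
    (Phase.COUNTING, "HOLD"): Phase.HOLDING,
    (Phase.HOLDING, "RESUME"): Phase.COUNTING,
    (Phase.COUNTING, "SCRUB"): Phase.SCRUBBED,
    (Phase.HOLDING, "SCRUB"): Phase.SCRUBBED,
}


class CommandLoop:
    def __init__(self):
        self.phase = Phase.COUNTING
        self.recommendations = []  # advisory only; never drives the phase

    def recommend(self, console, call):
        """Operators can only append advice; this never changes the phase."""
        self.recommendations.append((console, call))

    def command(self, issuer, cmd):
        """Apply a command only if it comes from the Flight Director and is legal."""
        if issuer != "FLIGHT_DIRECTOR":
            return False
        nxt = TRANSITIONS.get((self.phase, cmd))
        if nxt is None:
            return False
        self.phase = nxt
        return True
```

In this sketch a NO-GO from Propulsion is recorded but leaves the phase untouched until the Flight Director issues HOLD or SCRUB, and SCRUBBED is terminal because no transition leaves it.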

What I Learned

T-MINUS is the project where I most clearly saw where LLMs fit in a high-reliability loop and where they don't. They are excellent at framing — taking a telemetry slice and producing a high-signal, situation-aware sentence. They are bad at owning timing or numerical thresholds. The monitor / LLM split that fell out of the build is the same pattern I'd argue for in any AI-augmented mission-control or production-systems context: deterministic infrastructure decides when and what state; the model decides language and recommendation. Keeping that line clean is how you get an A− campaign instead of a confidently-wrong one.