Multi-agent LLM training platform for police de-escalation. Three-agent pipeline drives scenario simulation, ICAT-aligned grading, and skill-tree progression — with the CJIS-prep security posture baked in from the rules layer up.
Police de-escalation training in the United States is short, infrequent, scenario-poor, and almost never gives an officer measurable, longitudinal feedback on the actual decision-making patterns they exhibit under stress. The Police Executive Research Forum's ICAT curriculum (Integrating Communications, Assessment, and Tactics) exists; the problem is throughput and feedback fidelity. A patrol-level officer cannot run live scenarios on a meaningful cadence.
CopApp.ai is the answer I built for that: a platform where an officer can run ICAT-aligned scenarios any time, get graded against a rubric the agencies actually use, and watch a skill tree fill in (or decay) over time — with the CJIS-prep security posture an agency would need from day one.
Dialog → World → Psyche

Behind every scenario, three LLM agents run in series. Dialog generates the actual civilian utterances and reactions to officer behavior. World maintains the scene's physical state — who is where, what they are holding, what is visible, what just changed. Psyche models the non-player characters' inner state: fear, anger, suspicion, intoxication, intent. The three are decoupled on purpose: Dialog stays in role even when World has changed underneath it, and Psyche evolves on its own internal logic, so a scenario doesn't snap out of character because a single agent over-corrected.
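A minimal sketch of how one turn might sequence the three agents, assuming the ordering described above (World first, then Psyche, then Dialog). All names, state fields, and the stub agent logic here are illustrative, not the production API:

```python
from dataclasses import dataclass, field, replace

@dataclass
class SceneState:
    """World's view: the physical facts of the scene."""
    held_items: dict = field(default_factory=dict)   # npc name -> item in hand
    events: list = field(default_factory=list)       # log of what just changed

@dataclass(frozen=True)
class PsycheState:
    """Psyche's view: an NPC's inner state on 0.0-1.0 scales."""
    fear: float = 0.3
    anger: float = 0.2

# Stub functions standing in for the three LLM calls.
def world_agent(action: str, scene: SceneState) -> SceneState:
    scene.events.append(action)          # record the physical change
    return scene

def psyche_agent(action: str, scene: SceneState, p: PsycheState) -> PsycheState:
    # Toy internal logic: a raised voice raises fear; calm speech lowers it.
    delta = 0.1 if "shout" in action else -0.05
    return replace(p, fear=min(1.0, max(0.0, p.fear + delta)))

def dialog_agent(action: str, scene: SceneState, p: PsycheState) -> str:
    # Dialog is conditioned on the updated scene and psyche, never the reverse.
    return "Okay, okay..." if p.fear > 0.35 else "What do you want?"

def run_turn(action: str, scene: SceneState, p: PsycheState):
    scene = world_agent(action, scene)     # physical state updates first
    p = psyche_agent(action, scene, p)     # inner state evolves independently
    return dialog_agent(action, scene, p), scene, p

utterance, scene, p = run_turn("officer shouts a command",
                               SceneState(), PsycheState())
```

The one-way data flow is the point of the decoupling: Dialog reads World and Psyche but writes to neither, so a bad utterance can't corrupt the scene or the NPC's inner state.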
At the end of every scenario, a grading pass scores the officer on a six-category, 10-point rubric aligned with ICAT principles. A second pass produces a 10-dimension session insight score covering what the rubric grade alone misses (situational awareness, escalation triggers, language choices, and so on); that insight score is what drives the skill tree.
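The first pass can be thought of as a strict rubric aggregation. This sketch assumes six 0-10 categories averaged into a session grade; the category names and equal weighting are illustrative stand-ins, not the actual ICAT rubric:

```python
# Hypothetical category names -- the real rubric is PERF/ICAT-defined.
ICAT_CATEGORIES = [
    "communication", "assessment", "tactics",
    "time_distance_cover", "active_listening", "outcome",
]

def grade(scores: dict) -> float:
    """Average six 0-10 category scores into a single session grade.

    Refuses to grade a partial scorecard: rubric integrity means every
    category must be scored, not silently defaulted.
    """
    missing = set(ICAT_CATEGORIES) - set(scores)
    if missing:
        raise ValueError(f"unscored categories: {sorted(missing)}")
    for c in ICAT_CATEGORIES:
        if not 0 <= scores[c] <= 10:
            raise ValueError(f"score out of range for {c}: {scores[c]}")
    return sum(scores[c] for c in ICAT_CATEGORIES) / len(ICAT_CATEGORIES)
```

Failing loudly on a partial or out-of-range scorecard is deliberate: a grade an agency will trust cannot quietly impute missing categories.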
Officer progress is a graph, not a number. Skills are nodes, sub-skills are leaves, Bronze → Silver → Gold → Diamond tiers gate advanced scenarios. Each skill has a freshness decay on it: a skill you nailed six months ago and haven't practiced is no longer "Diamond." This is the part the standard 8-hours-of-CE-credit model never captures.
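The freshness mechanic can be sketched as exponential decay on each node's score, with tiers recomputed from the decayed value. The 90-day half-life and the tier thresholds below are assumptions for illustration, not the platform's actual tuning:

```python
# Tier thresholds on a 0-100 effective score; values are illustrative.
TIERS = [(90.0, "Diamond"), (75.0, "Gold"), (55.0, "Silver"), (30.0, "Bronze")]
HALF_LIFE_DAYS = 90.0  # assumed half-life: score halves every 90 idle days

def effective_score(raw_score: float, days_since_practice: float) -> float:
    """Decay a 0-100 skill score toward zero with a fixed half-life."""
    return raw_score * 0.5 ** (days_since_practice / HALF_LIFE_DAYS)

def tier(raw_score: float, days_since_practice: float) -> str:
    """Map the *decayed* score to a tier, so tiers demote with disuse."""
    s = effective_score(raw_score, days_since_practice)
    for threshold, name in TIERS:
        if s >= threshold:
            return name
    return "Unranked"
```

With these numbers, a 95-point skill is Diamond the day it's earned, but after six idle months (two half-lives) it has decayed below every tier threshold, which is exactly the "no longer Diamond" behavior described above.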
Bring high-reliability-organization training rigor to a decision-making domain that doesn't have it yet — and make the CJIS-prep posture a non-negotiable from the rules layer up, not a compliance-team retrofit.
CopApp is not currently a CJIS-certified system — that requires agency procurement and external audit — but every architectural decision was made so it could become one without a rewrite.
Around the core training loop sits the rest of what an agency actually needs in order to adopt the platform.
CopApp is where I learned to build LLM products with rubric integrity. An AI grading system is only as legitimate as the rubric it scores against, and the rubric is only as legitimate as the domain framework underneath. Anchoring the platform to PERF / ICAT — which agencies already recognize — was the difference between "interesting AI demo" and "thing an agency would consider procuring." It also pushed me to take CJIS-prep seriously from the rules layer up, which has paid off again in every law-enforcement-adjacent system I've built since (including PawTrek).