Multi-Agent LLM · Law Enforcement Training · CJIS-Prep · Officer Pilot

CopApp.ai

Multi-agent LLM training platform for police de-escalation. Three-agent pipeline drives scenario simulation, ICAT-aligned grading, and skill-tree progression — with the CJIS-prep security posture baked in from the rules layer up.

The Problem

Police de-escalation training in the United States is short, infrequent, scenario-poor, and almost never gives an officer measurable, longitudinal feedback on the actual decision-making patterns they exhibit under stress. The Police Executive Research Forum's ICAT curriculum (Integrating Communications, Assessment, and Tactics) exists; the problem is throughput and feedback fidelity. A patrol-level officer cannot run live scenarios on a meaningful cadence.

CopApp.ai is the answer I built for that: a platform where an officer can run ICAT-aligned scenarios any time, get graded against a rubric the agencies actually use, and watch a skill tree fill in (or decay) over time — with the CJIS-prep security posture an agency would need from day one.

What I Built

Three-agent pipeline: Dialog → World → Psyche

Behind every scenario, three LLM agents run in series. Dialog generates the actual civilian utterances and reactions to officer behavior. World maintains the scene's physical state — who is where, what they are holding, what is visible, what just changed. Psyche models the non-player characters' inner state: fear, anger, suspicion, intoxication, intent. The three are decoupled on purpose: Dialog stays in role even when World has changed underneath it, and Psyche evolves on its own internal logic so a scenario doesn't snap out of character because a single agent over-corrected.
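The serial hand-off between the three agents can be sketched roughly as below. The interfaces, state fields, and `runTurn` orchestrator are illustrative assumptions, not the production code; in the real system each agent is an LLM call rather than a plain function.

```typescript
// Hypothetical sketch of the Dialog → World → Psyche turn loop.
// State shapes are assumed for illustration.

interface WorldState { positions: Record<string, string>; visibleItems: string[]; }
interface PsycheState { fear: number; anger: number; suspicion: number; }
interface TurnContext {
  officerInput: string;
  world: WorldState;
  psyche: PsycheState;
  transcript: string[];
}

type Agent<T> = (ctx: TurnContext) => Promise<T>;

// Each turn runs the three agents in series: Dialog speaks in role,
// World applies physical-state changes, Psyche evolves inner state.
// Because each agent only returns its own slice, one agent
// over-correcting cannot clobber the others' state.
async function runTurn(
  ctx: TurnContext,
  dialog: Agent<string>,
  world: Agent<WorldState>,
  psyche: Agent<PsycheState>,
): Promise<TurnContext> {
  const utterance = await dialog(ctx);   // civilian line, stays in role
  const nextWorld = await world(ctx);    // who is where, what changed
  const nextPsyche = await psyche(ctx);  // fear / anger / suspicion update
  return {
    ...ctx,
    world: nextWorld,
    psyche: nextPsyche,
    transcript: [...ctx.transcript, ctx.officerInput, utterance],
  };
}
```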

Automated ICAT-aligned grading

At the end of every scenario, a grading pass scores the officer on a six-category / 10-point rubric aligned with ICAT principles. A second pass produces a 10-dimension session insight score — what the rubric grade alone misses (situational awareness, escalation triggers, language choices, etc.) — that drives the skill tree.
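A minimal sketch of the rubric-grade shape, assuming illustrative category names aligned with ICAT themes (the production rubric's categories are not reproduced here):

```typescript
// Assumed six-category / 10-point rubric shape; category names are
// illustrative, not CopApp's actual rubric.
const RUBRIC_CATEGORIES = [
  "communication",
  "assessment",
  "tactics",
  "timeDistanceCover",
  "deEscalation",
  "decisionMaking",
] as const;

type RubricCategory = (typeof RUBRIC_CATEGORIES)[number];
type RubricGrade = Record<RubricCategory, number>; // each scored 0–10

// Validate each category score and collapse to a mean overall grade.
function overallGrade(grade: RubricGrade): number {
  const scores = RUBRIC_CATEGORIES.map((c) => {
    const s = grade[c];
    if (s < 0 || s > 10) throw new RangeError(`${c} out of 0–10 range`);
    return s;
  });
  return scores.reduce((a, b) => a + b, 0) / scores.length;
}
```

The separate 10-dimension insight pass would feed a different structure; the point of keeping the two apart is that the rubric grade is the agency-facing score while the insight dimensions drive the skill tree.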

Skill tree + certification tiers with freshness decay

Officer progress is a graph, not a number. Skills are nodes, sub-skills are leaves, Bronze → Silver → Gold → Diamond tiers gate advanced scenarios. Each skill has a freshness decay on it: a skill you nailed six months ago and haven't practiced is no longer "Diamond." This is the part the standard 8-hours-of-CE-credit model never captures.
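Freshness decay can be modeled as an exponential half-life on the skill score, with tier cutoffs applied to the decayed value. The half-life and cutoffs below are illustrative parameters, not CopApp's production values:

```typescript
// Assumed tier cutoffs and half-life for illustration only.
const TIERS = [
  { name: "Diamond", min: 90 },
  { name: "Gold", min: 75 },
  { name: "Silver", min: 60 },
  { name: "Bronze", min: 40 },
] as const;

const HALF_LIFE_DAYS = 90; // assumed: an unpracticed skill halves in 90 days

// Exponential decay of a skill score based on idle time.
function decayedScore(score: number, daysSincePractice: number): number {
  return score * Math.pow(0.5, daysSincePractice / HALF_LIFE_DAYS);
}

// A tier is computed from the *decayed* score, so an old Diamond
// rating falls back down the ladder without new reps.
function tierFor(score: number): string {
  for (const t of TIERS) if (score >= t.min) return t.name;
  return "Unranked";
}
```

Under these assumed parameters, a 95-point skill untouched for six months decays to 23.75 and drops off the tier ladder entirely, which is the behavior the write-up describes.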

Five specialized training UIs

  • Reps — quick repeated scenario drills.
  • Hidden Secret — scenarios where the NPC has information the officer must surface through communication, not interrogation.
  • Action-Detraction — designed to teach when not to escalate; the rubric punishes premature force language.
  • Trainer — for an instructor running a class; live scenario authoring, rubric overrides, group grading.
  • Tactician / Dojo — sandbox modes for self-directed practice.

Core Principle

Bring high-reliability-organization training rigor to a decision-making domain that doesn't have it yet — and make the CJIS-prep posture a non-negotiable from the rules layer up, not a compliance-team retrofit.

CJIS-Prep Posture

CopApp is not currently a CJIS-certified system — that requires agency procurement and external audit — but every architectural decision was made so it could be one without a rewrite. Concretely:

  • 30-minute inactivity timeout on all sessions, enforced server-side, not client-side.
  • Auth-event audit logging — every sign-in, sign-out, MFA challenge, and privilege change is written to an append-only audit log.
  • CJI upload disclaimers on any feature where an officer might attach case-related material; the disclaimer is mandatory acknowledgement, logged with the upload.
  • Hardened Firestore security rules with explicit privilege-escalation checks at every multi-tenant boundary. An officer in one department cannot read another department's data even if a bug puts the wrong document path in front of them.
  • OIDC + PKCE with JWKS signature validation, refresh-token rotation with reuse detection, and secure cookie sessions (HttpOnly, Secure, SameSite).
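The first item above — server-side enforcement of the 30-minute inactivity window — can be sketched as follows. The in-memory session map and function shape are assumptions for illustration; production would back this with a real session store:

```typescript
// Minimal sketch of server-side inactivity enforcement. The session
// store is a stand-in; the point is that expiry is decided on the
// server, so a tampered client clock cannot keep a session alive.

const TIMEOUT_MS = 30 * 60 * 1000; // 30-minute inactivity cap

interface Session { userId: string; lastActivity: number; }
const sessions = new Map<string, Session>();

// Returns the live session or null. An expired session is destroyed
// server-side, forcing re-authentication.
function touchSession(sessionId: string, now: number = Date.now()): Session | null {
  const s = sessions.get(sessionId);
  if (!s) return null;
  if (now - s.lastActivity > TIMEOUT_MS) {
    sessions.delete(sessionId); // expire and force re-auth
    return null;
  }
  s.lastActivity = now; // any authenticated request resets the window
  return s;
}
```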

Production Footprint

  • 60 commits
  • 14k+ lines of TypeScript
  • 54 cloud functions
  • 22 routes
  • 5 training UIs

Beyond the Training Loop

Around the core training loop sits the rest of what an agency actually needs to adopt the platform:

  • Department management with per-seat assignments, supervisor roles, and rubric overrides per agency policy.
  • ZIP-based community discovery (Mapbox) so an officer can browse adjacent departments to compare training cadences and outcomes.
  • Stripe billing: Solo $9 / Department $29-per-seat. Full subscription lifecycle including failed-payment dunning, mid-cycle proration, and a webhook-driven idempotent provisioning loop.
  • Per-tenant token-bucket rate limiting with hot-key sharding; heavy LLM / media jobs queued to Pub/Sub with circuit breakers and jitter retries.
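The per-tenant token bucket in the last item can be sketched like this. Capacity, refill rate, and the in-memory map are illustrative assumptions; the production system shards hot keys and would not keep buckets in a single process:

```typescript
// Illustrative per-tenant token bucket (parameters are assumptions).
interface Bucket { tokens: number; lastRefill: number; }

class TenantRateLimiter {
  private buckets = new Map<string, Bucket>();

  constructor(
    private capacity = 10,     // burst size per tenant
    private refillPerSec = 1,  // sustained requests/sec per tenant
  ) {}

  // Lazily refill on read, then spend one token if available.
  tryAcquire(tenantId: string, now: number = Date.now()): boolean {
    const b = this.buckets.get(tenantId)
      ?? { tokens: this.capacity, lastRefill: now };
    const elapsedSec = (now - b.lastRefill) / 1000;
    b.tokens = Math.min(this.capacity, b.tokens + elapsedSec * this.refillPerSec);
    b.lastRefill = now;
    const allowed = b.tokens >= 1;
    if (allowed) b.tokens -= 1;
    this.buckets.set(tenantId, b);
    return allowed;
  }
}
```

Keeping the bucket per tenant means one department hammering the LLM endpoints degrades only its own throughput, not every agency on the platform.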

What I Learned

CopApp is where I learned to build LLM products with rubric integrity. An AI grading system is only as legitimate as the rubric it scores against, and the rubric is only as legitimate as the domain framework underneath. Anchoring the platform to PERF / ICAT — which agencies already recognize — was the difference between "interesting AI demo" and "thing an agency would consider procuring." It also pushed me to take CJIS-prep seriously from the rules layer up, which has paid off again in every law-enforcement-adjacent system I've built since (including PawTrek).