Chatbot Compliance Risk Gate

Overview
Most financial institutions deploy generative AI conservatively, limiting it to drafting assistance or hard refusals, because compliance and fiduciary risk are opaque in real time.
I built a live prototype that redesigns customer support from scratch as an AI-native system. Instead of asking, “Should the chatbot answer this?”, the system reframes the unit of work:
AI owns response velocity. Humans retain responsibility for financial and regulatory risk.
The result is a deterministic risk-routing layer that sits between generation and delivery, enabling safe automation without delegating liability to a model.
The Problem
Financial customer support workflows were designed before modern LLMs existed. Agents manually:
Interpret customer intent
Draft responses
Self-assess compliance boundaries
Decide when to escalate
When generative AI is introduced without redesigning the workflow, two things happen:
Systems become overly conservative and refuse too much.
Or worse, they generate ambiguous advice without clear accountability.
The core issue is not generation quality; it’s the absence of a structured responsibility boundary.
The Redesign Insight
Instead of embedding safety constraints inside the drafting model, I separated:
Generation (creative, probabilistic)
Adjudication (deterministic, policy-bound)
This creates a control surface.
The AI system does not decide financial outcomes.
It decides whether it is allowed to respond autonomously.
That distinction changes the workflow entirely.
System Architecture
1. Drafting Layer
A base LLM drafts a response to a user query using procedural financial knowledge.
2. Risk Gate (Core Layer)
A secondary evaluator model inspects the draft across three strict vectors:
Regulatory Boundary
Distinguishes procedural guidance (“how to use the app”) from outcome guidance (“what financial decision to make”).Demographic & Assumption Risk
Flags unwarranted assumptions about user capability, risk tolerance, or literacy.Urgency & Harm
Detects fraud, panic selling, severe financial distress, or self-harm signals.
The evaluator outputs structured JSON:
Risk classification (LOW / MEDIUM / HIGH)
Flagged vector
Highlighted text
Business rationale
Routing decision
3. Deterministic Routing
LOW → Auto-send to user
MEDIUM → Queue for human review (with highlighted risk vector)
HIGH → Block and escalate
For MEDIUM cases, the system can attempt a constrained rewrite before escalating.
Demo
See it in action
The One Decision That Must Remain Human
The final approval of responses that may constitute outcome-guiding financial advice must remain human.
Determining whether language crosses into fiduciary territory is not purely a linguistic classification task. It carries legal and regulatory liability.
AI can identify risk patterns.
Humans must own ambiguous financial judgment.
This boundary is explicit and enforced in the routing logic.
Failure Modes & Safety Design
I designed the system assuming failure is inevitable.
1. False Positive Bottleneck
Risk: The evaluator becomes overly conservative.
Mitigation: Defaults to human review rather than blocking the user. Velocity drops, safety holds.
2. Context Blindness
Risk: Model lacks full client financial history.
Mitigation: Explanation vectors surface reasoning transparently for human override.
3. Correlated Model Failure
Risk: Drafting and evaluation models fail similarly.
Mitigation: Functional separation of generation and adjudication to reduce shared blind spots.
4. Prompt Injection
Risk: User attempts to elicit stock recommendations.
Mitigation: Evaluator is isolated and adversarially scoped; it does not accept user instructions.
Why AI Is Necessary
Heuristic rules cannot distinguish between:
“You should move your RRSP into high-risk assets.”
“Here is how to move funds within your RRSP in the app.”
This boundary is semantic and contextual.
A probabilistic model is required to parse nuance, but must be bounded by deterministic routing.

