Architecture Decision Records (ADRs)

1. Define an Architecture Decision Record (ADR) and its primary purpose in an agile development environment.

An Architecture Decision Record (ADR) is a short document that captures an important architectural decision, along with its context and consequences.

Its primary purpose in an agile environment is to:

  • Preserve decision history as systems evolve quickly

  • Explain why decisions were made (not just what was built)

  • Help onboard new team members faster

  • Prevent re-debating the same issues repeatedly

  • Support incremental, evolutionary architecture

ADRs fit agile because they are lightweight, versioned, and written as decisions happen, not as heavy upfront documentation.

2. Structure — Five ADR Sections (Michael Nygard)

1. Title
Short description of the decision.
➡ Example: “Use gRPC for Internal Sensor Communication”

2. Status
Current state of the decision.
➡ Proposed, Accepted, Superseded, Deprecated

3. Context
Background, constraints, and problem statement.
➡ Why this decision was needed.

4. Decision
The actual choice made.
➡ Clear statement of what will be done.

5. Consequences
Results and impacts of the decision.
➡ Positive, negative, risks, trade-offs.

3. Significance — Two-way Door vs One-way Door

This analogy comes from decision theory and was popularized by Amazon; Jeff Bezos's shareholder letters describe reversible "two-way door" and irreversible "one-way door" decisions.

Two-way door decision

  • Easy to reverse

  • Low cost to change

  • Example: Switching a logging library

One-way door decision

  • Hard or very expensive to reverse

  • Long-term system impact

  • Example: Choosing database technology or communication protocol

How it helps decide whether to write an ADR:

  • One-way door → Always write ADR

  • Two-way door → ADR optional unless high impact or repeated debate risk

4. Anti-Patterns

Groundhog Day Anti-Pattern
Same architectural discussions happen repeatedly because decisions aren’t recorded.

Email-Driven Architecture Anti-Pattern
Important decisions are buried in email threads, Slack messages, or meetings.

How ADRs help

  • Provide a single source of truth

  • Make decisions searchable and version controlled

  • Reduce time wasted re-discussing settled topics

  • Preserve knowledge beyond individual team members

5. Rationale — Why “Why” > “How”

The First Law of Software Architecture says every decision involves trade-offs; the Second Law says that why a decision was made is more important than how it is implemented.

The Rationale is often more important because:

  • Implementation details change over time

  • Future teams need to understand constraints and trade-offs

  • Helps evaluate whether the decision is still valid later

  • Prevents repeating rejected approaches

If you know why, you can adapt the how safely.

6. Scenario — Smart Toaster Y-Statement

In the context of real-time communication between distributed internal sensors and the central control hub in a smart toaster,
we decided to use gRPC instead of REST,
to achieve low-latency, strongly typed, efficient binary communication with streaming support,
accepting increased complexity, tighter coupling through contracts, and reduced human readability of traffic.

7. Consequences — NoSQL for Breakfast Preferences

Positive Consequence 1
Flexible schema allows rapid feature evolution and new preference types without migrations.

Positive Consequence 2
Better horizontal scalability for large user preference datasets.

Negative Consequence
Reduced consistency guarantees and more complex querying/reporting compared to relational databases.

(Other valid negatives could include skill gaps, tooling maturity, or data duplication risks.)

8. Lifecycle — Replacing ADR #0005 with ADR #0012

You should:

  1. Create ADR #0012

    • Explain new legal/privacy context

    • State new decision

    • Reference ADR #0005 as superseded

  2. Update ADR #0005 Status

    • Change status to Superseded by ADR #0012

  3. Maintain History

    • Do NOT delete old ADRs

    • Keep traceability for audits and learning

This preserves decision lineage and compliance evidence.

9. Tooling — Ensuring Developer Compliance

Association Principle:
Architectural decisions should be directly connected to the code, tests, or artifacts they influence.

Method + Tool Example

Method:
Require ADR references in pull requests for architecture-impacting changes.

Tool / Technique:
Use GitHub Pull Request templates + CI checks

  • PR template includes: “Related ADR: ____”

  • CI validates ADR exists if certain files change (e.g., infra, data, API contracts); a minimal check is sketched below

Alternative tools:

  • Architecture linting rules

  • ADR repositories with markdown + automation

  • Tools like Backstage or ADR CLI workflows
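A minimal sketch of such a CI check, assuming a hypothetical repo layout (ADRs under docs/adr/, architecture-impacting paths under infra/, data/, api/) and a PR_BODY environment variable exported by the pipeline:

```python
#!/usr/bin/env python3
"""Fail the build when architecture-impacting files change without an ADR reference."""
import os
import re
import subprocess
import sys

ADR_DIR = "docs/adr"                     # assumed ADR location
SENSITIVE = ("infra/", "data/", "api/")  # assumed architecture-impacting paths

def changed_files(base: str = "origin/main") -> list[str]:
    # List files changed relative to the target branch.
    out = subprocess.check_output(["git", "diff", "--name-only", base, "HEAD"], text=True)
    return out.splitlines()

def main() -> int:
    if not any(f.startswith(SENSITIVE) for f in changed_files()):
        return 0  # no architecture-impacting change, nothing to enforce
    body = os.environ.get("PR_BODY", "")  # filled in from the PR template
    match = re.search(r"Related ADR:\s*(ADR-\d+)", body)
    if not match:
        print("Architecture-impacting change without a 'Related ADR:' reference.")
        return 1
    adr_path = os.path.join(ADR_DIR, f"{match.group(1)}.md")
    if not os.path.exists(adr_path):
        print(f"Referenced ADR not found: {adr_path}")
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```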


Hospital Cancer-Patient Chatbot — Example ADR Set

ADR-0001 — Use Retrieval-Augmented Generation (RAG) Instead of Pure LLM Responses

Status: Accepted
Date: 2026-02-06

Context

The chatbot will answer cancer patient questions about:

  • Treatment pathways

  • Medication information

  • Appointment preparation

  • Hospital-specific procedures

Pure LLM responses risk hallucinating medical information and may not reflect hospital-approved guidance or Danish healthcare regulations.

Decision

We will implement a Retrieval-Augmented Generation (RAG) architecture (sketched after this list):

  • Hospital-approved knowledge base (clinical guidelines, patient leaflets, oncology protocols)

  • Vector search retrieval

  • LLM generates answers using retrieved context only
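A minimal sketch of the retrieve-then-generate flow; vector_store and llm are hypothetical clients, not a specific product API:

```python
def answer_patient_question(question: str, vector_store, llm) -> str:
    # Retrieve only hospital-approved passages from the curated knowledge base.
    passages = vector_store.search(question, top_k=5)
    context = "\n\n".join(p.text for p in passages)
    prompt = (
        "Answer using ONLY the hospital-approved context below. If the context "
        "does not cover the question, say so and refer the patient to their "
        "care team.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm.generate(prompt)  # the model sees retrieved context, nothing else
```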

Consequences

Positive

  • Reduced hallucination risk

  • Answers aligned with hospital-approved content

  • Easier clinical review and auditing

Negative

  • Higher system complexity

  • Requires knowledge base curation and updates

  • Slight latency increase vs direct LLM calls

ADR-0002 — Host Patient Data Processing Inside EU / Denmark-Compliant Cloud

Status: Accepted

Context

The system processes:

  • Potentially identifiable patient conversations

  • Health-related data (GDPR special category data)

Danish and EU privacy regulations require strict control over health data processing and residency.

Decision

We will:

  • Host services in EU-based healthcare-compliant cloud infrastructure

  • Store patient data only in GDPR-compliant systems

  • Use data minimization and automatic redaction pipelines

Consequences

Positive

  • Regulatory compliance

  • Reduced legal risk

  • Increased patient trust

Negative

  • Fewer vendor choices

  • Potentially higher infrastructure cost

  • Some cutting-edge AI services may be unavailable

ADR-0003 — Use Human-in-the-Loop Escalation for Medical Advice Risk

Status: Accepted

Context

Cancer patients may ask:

  • Prognosis questions

  • Medication safety questions

  • Emergency symptom questions

The chatbot must not provide unsafe or unverified clinical guidance.

Decision

Implement a risk classification + escalation workflow (sketched after this list):

  • Low risk → automated chatbot response

  • Medium risk → chatbot response + disclaimer + recommend contacting clinic

  • High risk → block answer + escalate to nurse/oncology team queue
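A minimal routing sketch for the three tiers; the classifier, chatbot, and queue clients are hypothetical:

```python
def handle_question(question: str, classifier, chatbot, nurse_queue) -> str:
    risk = classifier.classify(question)  # assumed to return "low" | "medium" | "high"
    if risk == "high":
        nurse_queue.enqueue(question)     # block the automated answer, escalate
        return "A nurse from your oncology team will contact you shortly."
    answer = chatbot.answer(question)
    if risk == "medium":
        answer += "\n\nThis is general information; please contact your clinic to confirm."
    return answer
```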

Consequences

Positive

  • Improved patient safety

  • Aligns with medical governance

  • Reduces liability exposure

Negative

  • Requires clinical staffing integration

  • Operational overhead

  • Slower response for high-risk queries

ADR-0004 — Use Multilingual Support (Danish First, English Secondary)

Status: Accepted

Context

Primary users are Danish cancer patients, but:

  • Some patients speak English

  • Some staff will test or support in English

Decision

We will:

  • Optimize prompts and knowledge base primarily for Danish

  • Provide English fallback responses

  • Store knowledge base in Danish + translated versions where required

Consequences

Positive

  • Better patient accessibility

  • Higher answer quality for primary population

Negative

  • Translation maintenance overhead

  • Need bilingual clinical review

ADR-0005 — Log Conversations for Clinical Quality Review (With De-Identification)

Status: Proposed

Context

Hospital wants to:

  • Improve chatbot quality

  • Detect unsafe outputs

  • Train future models

But must protect patient privacy.

Decision

We will:

  • Store conversation logs after automatic de-identification (a redaction sketch follows this list)

  • Remove CPR numbers, names, addresses, phone numbers

  • Allow opt-out logging for patients
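A minimal redaction sketch. The patterns below are illustrative, not exhaustive; production de-identification would also need named-entity recognition for names and addresses, plus clinical review:

```python
import re

PATTERNS = {
    "CPR": re.compile(r"\b\d{6}-?\d{4}\b"),        # Danish CPR number: DDMMYY-XXXX
    "PHONE": re.compile(r"\b(?:\+45 ?)?\d{8}\b"),  # Danish 8-digit phone numbers
}

def deidentify(text: str) -> str:
    # Replace each match with a typed placeholder, e.g. "[CPR]".
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```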

Consequences

Positive

  • Enables continuous improvement

  • Supports incident investigation

  • Helps clinical audit

Negative

  • Requires strong anonymization pipeline

  • Residual re-identification risk

  • Additional compliance review overhead

ADR-0006 — Restrict Chatbot to Informational Support, Not Diagnosis

Status: Accepted

Context

LLMs can appear authoritative. Cancer diagnosis or treatment planning must remain clinician-led.

Decision

The chatbot will:

  • Provide educational and navigation support only

  • Never generate diagnosis statements

  • Include safety guardrails and prompt constraints

Consequences

Positive

  • Reduces clinical and legal risk

  • Clear scope for product

Negative

  • Some patients may expect deeper clinical answers

  • Requires careful UX messaging

ADR-0007 — Use Prompt Versioning + Model Version Tracking

Status: Accepted

Context

LLM behaviour changes over time due to:

  • Prompt updates

  • Model upgrades

  • Knowledge base updates

Healthcare requires traceability.

Decision

Every response will log (see the sketch after this list):

  • Model version

  • Prompt template version

  • Knowledge snapshot ID
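A minimal sketch of the per-response audit record; the field names are illustrative assumptions:

```python
import json
import time

def log_response(response_id: str, model_version: str,
                 prompt_template_version: str, knowledge_snapshot_id: str) -> str:
    record = {
        "response_id": response_id,
        "timestamp": time.time(),
        "model_version": model_version,                      # e.g. vendor model tag
        "prompt_template_version": prompt_template_version,  # versioned alongside code
        "knowledge_snapshot_id": knowledge_snapshot_id,      # immutable KB snapshot
    }
    return json.dumps(record)  # ship to the audit log sink
```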

Consequences

Positive

  • Enables audit trails

  • Supports incident root-cause analysis

  • Helps regulatory reporting

Negative

  • More logging storage cost

  • Additional engineering complexity

Mars / Arla Rufus Optimisation — Example ADR Set

ADR-0001 — Use LLM + Rules Hybrid for Rufus Content Optimisation

Status: Accepted
Date: 2026-02-06

Context

The tool must generate and optimise:

  • Product titles

  • Bullets

  • A+ content suggestions

  • Conversational answer snippets (Rufus Q&A style)

Pure LLM generation risks:

  • Brand compliance violations

  • Regulatory claim risks (especially dairy & nutrition claims in EU)

  • Inconsistent tone across brands (Mars vs Arla brand voices differ)

Decision

Use a hybrid system:

  • LLM generates candidate optimisations

  • Rules engine validates:

    • Brand voice rules

    • Nutrition / health claim compliance

    • Amazon category content policies

Consequences

Positive

  • Safer automated content generation

  • Maintains brand consistency

  • Reduces legal/compliance review load

Negative

  • Higher engineering complexity

  • Requires ongoing rule maintenance

  • Slightly slower generation pipeline

ADR-0002 — Centralised Brand Knowledge Graph for Product + Claim Relationships

Status: Accepted

Context

Mars and Arla manage:

  • Hundreds of SKUs

  • Complex claim relationships (e.g., protein content, sustainability claims, ingredient sourcing)

  • Frequent product refresh cycles

Simple flat product databases make optimisation logic brittle.

Decision

Implement a Brand Knowledge Graph storing:

  • SKU → Ingredients → Claims → Certifications → Regions → Campaigns

  • Queryable by optimisation engine and analytics

Consequences

Positive

  • Enables richer Rufus conversational answers

  • Supports cross-product optimisation insights

  • Improves data reuse across brands

Negative

  • Higher data modelling effort

  • Requires governance and stewardship

  • Harder initial setup vs relational tables

ADR-0003 — Use Retrieval-Augmented Generation (RAG) for Rufus Q&A Optimisation

Status: Accepted

Context

Rufus answers customer questions using Amazon catalog data plus contextual signals.
Optimisation requires aligning product content with likely customer questions.

Decision

Use RAG combining:

  • Amazon product catalog exports

  • Customer review themes

  • Search query datasets

  • Brand product documentation

LLM generates Rufus-aligned answer suggestions using retrieved evidence.

Consequences

Positive

  • Improves factual grounding

  • Aligns content to real customer language

  • Reduces hallucinated product claims

Negative

  • Requires vector index maintenance

  • Data freshness dependency

  • More infrastructure cost

ADR-0004 — Multi-Tenant Architecture With Brand Isolation

Status: Accepted

Context

The platform will serve multiple FMCG clients:

  • Mars

  • Arla

  • Potential future EU FMCG brands

Brand data and strategy must remain isolated.

Decision

Implement logical tenant isolation:

  • Separate brand vector indexes

  • Separate optimisation rule sets

  • Separate analytics dashboards

  • Shared core infrastructure

Consequences

Positive

  • Scales to additional clients

  • Protects commercial data

  • Enables brand-specific optimisation strategies

Negative

  • Increased deployment complexity

  • Higher operational monitoring overhead

ADR-0005 — Use EU Cloud Hosting With Data Export Controls

Status: Accepted

Context

Although product data is not personal data, tool usage may include:

  • Customer review text

  • Search behaviour datasets

  • Potential marketplace partner data

EU companies prefer EU-hosted processing.

Decision

Host analytics + optimisation services in EU cloud regions with:

  • Data export logging

  • Role-based access controls

  • Audit logging for client data usage

Consequences

Positive

  • Supports EU enterprise procurement requirements

  • Improves client trust

  • Easier compliance reviews

Negative

  • Some AI services may have delayed EU availability

  • Higher infra cost vs global-only hosting

ADR-0006 — Optimisation Recommendations Must Be Explainable

Status: Accepted

Context

Brand and category managers must trust optimisation outputs.
Black-box recommendations reduce adoption.

Decision

Every optimisation suggestion must include (see the sketch after this list):

  • Source signal (reviews, search queries, competitor analysis)

  • Confidence score

  • Expected impact metric (CTR, CVR, visibility probability)
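A minimal sketch of the explainability payload attached to each suggestion; the field names and metrics are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class OptimisationSuggestion:
    suggested_text: str
    source_signals: list[str]     # e.g. ["reviews", "search queries"]
    confidence: float             # 0.0 - 1.0, from the generation pipeline
    impact_metric: str            # e.g. "CTR", "CVR", "visibility probability"
    expected_impact_delta: float  # predicted relative change in the metric
```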

Consequences

Positive

  • Higher adoption by brand teams

  • Easier A/B test planning

  • Supports internal justification

Negative

  • Requires extra data processing

  • More complex UI and data pipelines

ADR-0007 — Continuous Learning via Closed-Loop Performance Feedback

Status: Proposed

Context

Optimisation value depends on learning from:

  • Content changes

  • Conversion performance

  • Rufus interaction trends

Decision

Implement feedback loop using:

  • Amazon performance metrics (where available)

  • Internal experimentation tracking

  • Content version → performance correlation

Consequences

Positive

  • Improves optimisation quality over time

  • Enables ROI measurement

Negative

  • Attribution complexity

  • Requires careful experiment design

Below is an example ADR set for adopting and operating Fin (Fin.ai) by Intercom as an enterprise AI customer service agent, structured as ADRs a team might realistically store in a repo for architecture governance, security review, or procurement sign-off.

Fin.ai / Intercom — Example ADR Set

ADR-0001 — Adopt Fin (Intercom) as Primary AI Customer Service Agent

Status: Accepted
Date: 2026-02-06

Context

The organisation requires an AI agent capable of:

  • Resolving complex customer queries

  • Operating across multiple channels

  • Integrating with existing helpdesk infrastructure

  • Operating with minimal engineering overhead

Internal LLM build options were evaluated but present:

  • Higher time-to-market

  • Ongoing MLOps overhead

  • Increased governance and safety burden

Decision

We will adopt Fin by Intercom as the primary AI Customer Service Agent, due to:

  • Average ~65% resolution rate

  • No-code configuration model

  • Rapid deployment (under an hour of setup)

  • Native helpdesk integration capability

  • Enterprise safety controls and certifications

Consequences

Positive

  • Fast deployment and time to value

  • Reduced internal AI engineering cost

  • Proven production reliability

  • Continuous vendor model improvement

Negative

  • Vendor lock-in risk

  • Limited low-level model customisation

  • Ongoing per-resolution operational cost

ADR-0002 — Multi-Channel Customer Interaction Strategy

Status: Accepted

Context

Customers contact support through:

  • Live chat

  • Email

  • SMS

  • Phone (voice assist workflows)

  • Social media messaging

Fragmented automation causes inconsistent customer experience and duplicated support workflows.

Decision

Fin will be deployed as a single AI resolution layer across:

  • Web chat

  • Email

  • Messaging channels

  • Voice-assist workflows (via integration layer)

Consequences

Positive

  • Unified customer experience

  • Shared training data across channels

  • Higher automation coverage

Negative

  • Channel-specific nuance may require custom training

  • Increased monitoring complexity across channels

ADR-0003 — Helpdesk Integration (No Migration Strategy)

Status: Accepted

Context

Current support stack includes:

  • Existing helpdesk ticketing workflows

  • Agent assignment rules

  • Reporting pipelines

  • CRM integrations

Full migration to a single vendor platform is not feasible short-term.

Decision

Fin will integrate with existing helpdesks (e.g., Intercom, Zendesk, Salesforce Service Cloud) using:

  • Native integration connectors

  • Existing routing and escalation rules

  • Existing reporting dashboards

Consequences

Positive

  • Minimal operational disruption

  • Faster rollout

  • Lower training requirements for support teams

Negative

  • Potential limitations vs full native platform features

  • Requires integration monitoring

ADR-0004 — Outcome-Based Pricing Model Adoption

Status: Accepted

Context

Traditional SaaS pricing models charge per seat or per conversation.
Fin pricing is charged per successful resolution (~$0.99 each).

Decision

We will adopt outcome-based pricing and align Fin usage with:

  • Resolution quality metrics

  • Automation rate targets

  • Cost-per-resolution thresholds

Consequences

Positive

  • Direct ROI measurement

  • Aligns cost to business value

  • Encourages optimisation of automation quality

Negative

  • Requires strong resolution definition governance

  • Requires cost forecasting models

ADR-0005 — Security and Identity Verification Using JWT

Status: Mandatory

Context

Customer identity verification is required to prevent:

  • Account takeover

  • Cross-user data leakage

  • Session spoofing

Legacy identity verification approaches are deprecated.

Decision

We will use JWT-based identity verification for Intercom Messenger (sketched after this list):

  • Signed user ID and email

  • Short token expiry

  • Token rotation via backend auth service
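A minimal token-issuance sketch using the PyJWT library; the claim names and TTL are illustrative assumptions to check against Intercom's current documentation:

```python
import time
import jwt  # PyJWT

SECRET = "load-from-secrets-manager"  # the Messenger identity verification secret

def issue_messenger_token(user_id: str, email: str, ttl_seconds: int = 900) -> str:
    payload = {
        "user_id": user_id,                     # signed identity claims
        "email": email,
        "exp": int(time.time()) + ttl_seconds,  # short expiry per this ADR
    }
    return jwt.encode(payload, SECRET, algorithm="HS256")
```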

Consequences

Positive

  • Strong identity validation

  • Reduced impersonation risk

  • Industry standard implementation

Negative

  • Requires secure token issuance service

  • Requires key rotation governance

ADR-0006 — AI Governance and Responsible AI Compliance

Status: Accepted

Context

Customer-facing AI introduces risks:

  • Hallucinations

  • Brand reputation risk

  • Unsafe or misleading responses

  • Prompt injection or adversarial queries

Decision

We will align operations with:

  • Intercom AI security protections

  • AIUC-1 certification framework controls

  • Internal AI governance review process

  • Continuous output monitoring

Consequences

Positive

  • Demonstrable responsible AI posture

  • Strong enterprise procurement alignment

  • Reduced brand risk

Negative

  • Requires ongoing audit reviews

  • Requires internal AI risk reporting process

ADR-0007 — Data Residency and Privacy Compliance Strategy

Status: Accepted

Context

Operations span:

  • EEA

  • UK

  • USA

Must comply with:

  • GDPR

  • UK GDPR

  • CCPA (where applicable)

Decision

We will:

  • Use Intercom approved data hosting frameworks

  • Use legal transfer mechanisms (e.g., DPF or equivalent where applicable)

  • Provide transparency via privacy policy updates

  • Implement data minimisation in customer payloads

Consequences

Positive

  • Regulatory compliance support

  • Lower legal exposure

  • Easier enterprise sales approvals

Negative

  • Requires ongoing vendor compliance review

  • May limit certain data uses

ADR-0008 — Prohibited Sensitive Data Handling

Status: Mandatory

Context

Standard support tooling is not certified for:

  • PCI DSS payment storage

  • Certain regulated health data processing

  • Special category sensitive data

Decision

Fin will be configured to:

  • Avoid requesting or storing sensitive personal data

  • Redirect sensitive workflows to secure systems

  • Use detection rules to block restricted data categories

Consequences

Positive

  • Reduces compliance exposure

  • Protects customers and organisation

Negative

  • Requires fallback flows for sensitive workflows

  • May increase handoffs to human agents

ADR-0009 — Continuous Optimisation Using Fin Analytics and Testing

Status: Accepted

Context

AI performance degrades without:

  • Content updates

  • Training refresh

  • Behaviour monitoring

Decision

We will use Fin tooling to:

  • Analyse support trends

  • Identify content gaps

  • Run answer testing experiments

  • Optimise tone and localisation (45+ languages capability)

Consequences

Positive

  • Continuous automation improvement

  • Higher long-term resolution rates

Negative

  • Requires content operations ownership

  • Requires performance review cadence

Strategic ADR — Fin vs Build-Your-Own LLM Customer Support Platform

ADR-STRAT-0001 — Strategic Direction for AI Customer Service Platform

Status: Accepted (Example)
Owner: CTO / Chief Digital Officer
Decision Type: Strategic / Board-Level
Date: 2026-02-06

Context

The organisation is investing in AI-driven customer support automation to achieve:

  • Higher resolution rates

  • Lower cost per contact

  • 24/7 global support coverage

  • Scalable multilingual support

  • Improved customer experience

Two strategic options were evaluated:

Option A — Adopt Fin (Intercom Managed AI Agent)

A vendor-managed AI support agent providing:

  • No-code configuration

  • Multi-channel deployment

  • Outcome-based pricing

  • Enterprise-grade safety controls

  • Pre-built integrations

Option B — Build Proprietary LLM Support Platform

Internally built AI support platform including:

  • Custom LLM orchestration

  • Internal RAG pipelines

  • Internal safety guardrails

  • Custom evaluation frameworks

  • Full data and model control

Decision

The organisation will adopt Fin (Intercom) as the primary AI customer support platform for the next 24–36 months, while maintaining a strategic capability assessment for potential future internal AI platform development.

Strategic Drivers Behind Decision

1. Time to Value

Fin deployment: weeks
Internal build: 12–24 months to mature production capability

2. Risk Reduction

Vendor solution provides:

  • Mature production usage

  • Pre-built safety controls

  • Certified responsible AI controls

  • Enterprise-grade reliability

3. Talent and Capability Reality

Building requires:

  • LLM platform engineering

  • AI safety engineering

  • ML infrastructure (MLOps / LLMOps)

  • Continuous evaluation teams

This capability is currently limited internally.

4. Cost Predictability

Fin:

  • Variable cost aligned to outcomes

  • Lower upfront capital investment

Build:

  • High upfront platform cost

  • Ongoing model ops + infra + team cost

Alternatives Considered

Full Internal LLM Platform

Rejected for now due to:

  • High upfront cost

  • Talent acquisition risk

  • Longer time to business value

  • Higher regulatory and safety ownership burden

Hybrid (Internal LLM + Vendor Tools)

Deferred until internal AI maturity increases.

Consequences

Positive Consequences

1. Faster Business Impact

Immediate automation improvements and measurable ROI.

2. Lower Execution Risk

Vendor absorbs:

  • Model maintenance

  • Safety tuning

  • Threat mitigation

  • Scaling infrastructure

3. Reduced Regulatory Burden

Vendor certification and governance frameworks reduce internal compliance workload.

4. Predictable Cost Model

Outcome pricing aligns cost with customer value delivered.

Negative Consequences

1. Vendor Dependency

Risk of:

  • Pricing changes

  • Feature roadmap control outside organisation

  • Contract lock-in

Mitigation:

  • Data portability strategy

  • Periodic market re-evaluation

  • Exit architecture maintained

2. Reduced Differentiation

Competitors could use the same vendor platform.

Mitigation:

  • Differentiate through:

    • Proprietary knowledge

    • Customer workflows

    • Integration depth

    • Customer experience design

3. Long-Term Strategic Capability Gap

Risk of falling behind in internal AI capability.

Mitigation:

  • Maintain internal AI architecture review group

  • Run small internal experimentation programme

  • Reassess build vs buy every 12 months

Financial Trade-Off Summary

| Dimension | Fin | Build Internal |
|---|---|---|
| Upfront Cost | Low | Very High |
| Ongoing Cost | Variable | High Fixed + Variable |
| Time to Market | Fast | Slow |
| Talent Requirement | Low | Very High |
| Control | Medium | Full |
| Risk | Lower | Higher (initially) |

Risk Register (Strategic)

| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Vendor pricing increase | Medium | Medium | Multi-year contract + exit plan |
| Vendor roadmap misalignment | Medium | Medium | Quarterly roadmap governance |
| Vendor outage risk | High | Low | Human fallback workflows |
| AI regulatory changes | High | Medium | Vendor compliance + internal review |

Exit Strategy

The architecture will ensure:

  • Conversation data export capability

  • Knowledge base portability

  • Integration abstraction layer

  • Internal evaluation capability retained

This enables future migration if:

  • Costs become unfavourable

  • Strategic differentiation requires internal AI

  • Regulatory requirements change

Strategic Review Cadence

Re-evaluate Build vs Buy when:

  • Internal AI capability reaches maturity

  • Support automation becomes core competitive differentiator

  • Vendor costs exceed internal TCO

  • Regulatory environment changes

Formal review interval: 12 months

Board-Level Y-Statement

In the context of scaling global customer support automation while managing cost, regulatory exposure, and delivery risk,
we decided to adopt Fin (Intercom) as the primary AI support agent,
to achieve rapid time-to-value, predictable operational cost, and enterprise-grade safety and reliability,
accepting vendor dependency and reduced low-level AI customisation in the short to medium term.


Below is a security ADR, aimed at a Board / Risk Committee audience, focused specifically on LLM threat modelling.
It sits alongside the strategic ADRs and would typically be owned jointly by Security, Platform, and AI Governance.

Security ADR — LLM Threat Model and Mitigation Strategy

ADR-SEC-0004 — LLM Threat Model (Prompt Injection, Data Exfiltration, Jailbreak Resistance)

Status: Accepted (Example)
Owner: CISO / Head of AI Security
Decision Type: Security Architecture / Enterprise Risk
Date: 2026-02-06

Context

The organisation is deploying LLM-powered customer-facing and internal AI systems (e.g., AI support agents, copilots, automation assistants).

LLMs introduce new attack classes not fully covered by traditional AppSec models:

  • Prompt Injection

  • Data Exfiltration via Model Context

  • Jailbreaks / Policy Evasion

  • Tool Abuse (if agentic capabilities exist)

  • Training Data Leakage

  • Cross-tenant Context Leakage

These risks can impact:

  • Customer data confidentiality

  • Regulatory compliance (GDPR, PCI, etc.)

  • Brand reputation

  • Operational trust in AI systems

Decision

We will implement a Layered LLM Security Model covering:

  1. Input Security Controls

  2. Context Isolation and Data Minimisation

  3. Output Validation and Policy Enforcement

  4. Runtime Behaviour Monitoring

  5. Vendor Security Assurance (where applicable)

  6. Continuous Red Teaming and Testing

Threat Model Scope

In Scope

  • External user prompts

  • Retrieved knowledge base content

  • Tool invocation payloads

  • LLM responses

  • System prompts and hidden instructions

  • Integration data (CRM, ticketing, internal APIs)

Out of Scope

  • Non-LLM traditional application attacks (handled by existing AppSec framework)

Primary Threat Classes and Controls

Threat Class 1 — Prompt Injection

Threat Description

Attackers attempt to override system instructions or force the model to:

  • Reveal hidden prompts

  • Ignore policy controls

  • Perform unintended actions

  • Access restricted data

Example attack patterns:

  • “Ignore previous instructions…”

  • “You are now in debug mode…”

  • Embedded instructions in uploaded documents or URLs

Controls

Technical Controls

  • Input classification and risk scoring

  • Context boundary enforcement

  • Instruction hierarchy (system > policy > user)

  • Prompt template hardening

  • Retrieval content sanitisation

Operational Controls

  • Red team prompt injection testing

  • Prompt change approval workflow

Residual Risk

Medium — cannot be fully eliminated, only reduced.

Threat Class 2 — Data Exfiltration

Threat Description

Attackers attempt to extract:

  • PII

  • Customer records

  • Internal knowledge base content

  • System prompts

  • Cross-user session data

Controls

Technical Controls

  • Strict tenant isolation

  • Data minimisation in prompt context

  • Retrieval filtering (need-to-know basis)

  • Output PII scanning

  • Token-level response filtering

Architecture Controls

  • No direct raw database access from LLM

  • Mediated tool access via policy layer

  • Zero-trust service access

Residual Risk

Low–Medium depending on data exposure surface.

Threat Class 3 — Jailbreaks

Threat Description

Attempts to bypass safety or policy constraints through:

  • Multi-step reasoning attacks

  • Role-play attacks

  • Encoding / obfuscation attacks

  • Language switching attacks

Controls

Technical Controls

  • Safety classifier layer before response release

  • Adversarial prompt detection

  • Multi-model verification for high-risk outputs

  • Refusal and safe completion patterns

Process Controls

  • Continuous jailbreak dataset testing

  • Vendor model update review

Residual Risk

Medium — evolves with new attack patterns.

Threat Class 4 — Tool / Agent Abuse (If Enabled)

Threat Description

If LLM can:

  • Call APIs

  • Trigger workflows

  • Access data systems

Attackers may attempt:

  • Privilege escalation

  • Fraudulent transactions

  • Data scraping

Controls

  • Tool permission scoping

  • Transaction confirmation gates

  • Policy engine approval layer

  • Rate limiting and anomaly detection

Residual Risk

Low with strict policy enforcement.

Security Architecture Pattern

Required Control Layers

  1. Input Security Layer

    • Injection detection

    • Content classification

  2. Context Security Layer

    • Data filtering

    • Tenant isolation

    • Least privilege retrieval

  3. Model Safety Layer

    • Guardrails

    • Safety tuning

    • Prompt templates

  4. Output Security Layer

    • PII detection

    • Policy validation

    • Response risk scoring

  5. Monitoring Layer

    • Behaviour anomaly detection

    • Security event logging

Governance Requirements

Mandatory

  • LLM threat modelling review per new deployment

  • Quarterly red team exercises

  • Security logging for all LLM requests

  • Incident response playbooks for AI incidents

Vendor Requirements (If Using External AI Providers)

Vendors must provide:

  • Security architecture transparency

  • Model safety documentation

  • Data handling guarantees

  • Incident notification SLAs

  • Independent certification where available

Consequences

Positive

  • Reduces probability of high-impact AI incidents

  • Supports regulatory and audit requirements

  • Builds customer trust in AI deployments

Negative

  • Increases development complexity

  • Adds latency via safety layers

  • Requires dedicated AI security capability

Strategic Risk Acceptance Statement

The organisation acknowledges that:

  • LLM systems cannot be made perfectly secure

  • Risk must be continuously managed and monitored

  • Security posture must evolve with threat landscape

Board-Level Y-Statement

In the context of deploying LLM-powered systems that process customer and enterprise data,
we decided to implement a layered LLM threat defence architecture,
to achieve protection against prompt injection, data exfiltration, jailbreak attacks, and tool abuse,
accepting increased engineering complexity, additional latency, and ongoing operational security investment.

Operations + Risk ADR — Human Escalation Policy for AI Agents

ADR-OPS-0010 — Mandatory Human Escalation Framework for AI Customer Interactions

Status: Accepted (Example)
Owner: Head of Customer Operations + AI Governance + Risk
Decision Type: Operational Safety / Customer Risk / AI Governance
Date: 2026-02-06

Context

The organisation is deploying AI agents (e.g., LLM-powered support assistants) to automate customer interactions across digital channels.

While AI systems can resolve a large percentage of queries, risks remain:

  • Incorrect or hallucinated answers

  • Sensitive or high-risk customer scenarios

  • Legal or regulatory exposure

  • Customer frustration during complex interactions

  • Situations requiring empathy or judgement

Regulators and enterprise customers increasingly expect human override capability for AI systems affecting customers.

Decision

We will implement a Mandatory Human Escalation Framework ensuring that AI interactions are transferred to human agents when defined risk, complexity, or confidence thresholds are met.

This framework will be enforced via:

  • Automated risk classification

  • Confidence-based escalation

  • User-requested escalation at any time

  • Mandatory escalation categories

Escalation Triggers

1. Customer-Initiated Escalation (Always Allowed)

Customers can request a human agent at any time using:

  • Natural language (“talk to a human”, “agent please”)

  • UI escalation buttons

  • Repeated failed resolution attempts

Policy:
Must never block or delay customer-requested escalation.

2. Risk-Based Automatic Escalation

Escalate when content includes:

Regulatory / Legal Risk

  • Complaints about compliance breaches

  • Legal threats or disputes

  • Regulatory reporting scenarios

Financial Risk

  • Billing disputes

  • Refund escalation requests above threshold

  • Fraud suspicion

Safety / Wellbeing Signals (If Relevant Domain)

  • Emotional distress

  • Medical or safety advice requests

  • Crisis language

3. AI Confidence / Quality Escalation

Escalate when (see the combined trigger sketch below):

  • Model confidence below defined threshold

  • Multiple failed resolution attempts

  • Conflicting retrieved information

  • Out-of-distribution queries

4. Sensitive Data or Account Risk

Escalate when:

  • Identity verification uncertainty

  • Account access anomalies

  • Sensitive data detected in conversation
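A minimal sketch combining the triggers above into one decision point; the thresholds, category labels, and classifier interface are illustrative assumptions:

```python
CONFIDENCE_THRESHOLD = 0.7
HIGH_RISK = {"legal", "financial", "safety", "sensitive_data", "identity"}

def should_escalate(message: str, confidence: float, failed_attempts: int,
                    risk_classifier) -> tuple[bool, str]:
    # 1. Customer-initiated: never block this path (keyword check is naive;
    #    production systems would use intent classification).
    if any(phrase in message.lower() for phrase in ("human", "agent please")):
        return True, "customer_requested"
    # 2 and 4. Risk-based and sensitive-data / account triggers.
    category = risk_classifier.classify(message)  # assumed to return a label
    if category in HIGH_RISK:
        return True, f"risk:{category}"
    # 3. Confidence / quality triggers.
    if confidence < CONFIDENCE_THRESHOLD or failed_attempts >= 2:
        return True, "low_confidence"
    return False, ""
```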

Escalation Levels

Level 1 — Assisted AI → Human

AI gathers context → Transfers to human agent
Goal: Reduce handling time

Level 2 — Immediate Human Handoff

AI stops responding → Transfers immediately
Used for high-risk categories

Level 3 — Specialist Escalation

Routing to:

  • Tier 2 support

  • Compliance team

  • Security team

  • Legal team

Operational Requirements

Conversation Context Transfer

When escalating, system must pass:

  • Conversation history

  • Customer metadata

  • AI confidence signals

  • Risk classification reason

Response Time Targets

Define SLA for:

  • Standard escalation

  • High-risk escalation

  • Safety-critical escalation

Customer Transparency

Customers must know:

  • They are interacting with AI

  • They can request human help anytime

Governance Requirements

Mandatory Monitoring

Track:

  • Escalation rate

  • False positive escalations

  • Missed escalation incidents

  • Customer satisfaction post-escalation

Audit Requirements

Log:

  • Escalation trigger reason

  • AI confidence score

  • Risk classification output

Review Cadence

Monthly review of:

  • Escalation thresholds

  • Missed risk incidents

  • Customer sentiment trends

Technology Requirements

Systems must support:

  • Real-time risk scoring

  • Confidence scoring

  • Intent classification

  • Escalation routing APIs

  • Agent queue prioritisation

Failure Mode Policy

If AI safety systems fail or degrade:

  • Default to increased human routing

  • Disable high-risk automation flows

  • Trigger incident response process

Consequences

Positive

  • Protects customers from unsafe AI decisions

  • Supports regulatory expectations

  • Maintains brand trust

  • Enables safe automation scaling

Negative

  • Increased human support workload

  • Requires monitoring and tuning

  • Potential increased cost during early rollout

Strategic Risk Acceptance Statement

The organisation acknowledges:

  • AI cannot safely handle 100% of customer scenarios

  • Human judgement remains critical for edge cases

  • Escalation is a safety feature, not a failure

Board-Level Y-Statement

In the context of deploying AI agents to automate customer interactions at scale,
we decided to implement a mandatory human escalation framework,
to achieve safe, trustworthy customer support with regulatory alignment,
accepting increased operational cost and system complexity.


Finance / Platform ADR — AI Cost Governance (Resolution Cost Guardrails)

ADR-FIN-0012 — Resolution Cost Guardrails for AI Customer Support Automation

Status: Accepted (Example)
Owner: CFO + CTO + Head of Customer Operations
Decision Type: Financial Governance / Platform Economics
Date: 2026-02-06

Context

The organisation is deploying AI customer support automation using outcome-based pricing (e.g., cost per successful resolution).

Key risks:

  • Uncontrolled cost growth due to increased automation usage

  • Low-quality AI resolutions driving hidden downstream cost (reopens, complaints, churn)

  • Lack of cost attribution by channel, product, or geography

  • Vendor pricing model dependency

  • Difficulty forecasting cost under usage spikes

Unlike seat-based SaaS pricing, resolution-based pricing is variable OPEX tied to behaviour and volume.

Decision

We will implement Resolution Cost Guardrails across four control layers:

  1. Cost Per Resolution Targets

  2. Volume and Spend Caps

  3. Quality-Adjusted Cost Monitoring

  4. Vendor and Model Usage Optimisation

These guardrails will be enforced through monitoring, alerting, and automated throttling where required.

Cost Governance Framework

Layer 1 — Cost Per Resolution Targets

Define acceptable cost bands:

| Tier | Cost Per Resolution |
|---|---|
| Target | ≤ Baseline Support Cost Equivalent |
| Warning | 10–20% above baseline |
| Critical | > 20% above baseline |

Baseline = historical human support cost per ticket.

Layer 2 — Volume and Spend Guardrails

Implement:

  • Monthly AI support budget allocation

  • Channel-level cost caps

  • Product-line cost attribution

  • Automatic alert thresholds at 70%, 85%, 95% spend

Optional:

  • Soft throttling during non-critical spikes

  • Human fallback routing if cost risk triggered

Layer 3 — Quality-Adjusted Cost Control

True cost must include:

  • Reopen rate cost

  • Escalation to human cost

  • Refund / compensation cost triggered by AI errors

  • Customer churn signals

Metric example:

True Resolution Cost =
  AI Resolution Cost
  + (Reopen Rate × Human Cost)
  + (Escalation Rate × Human Cost)
  + Error Remediation Cost
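The same computation in code, with a worked example; the figures are illustrative assumptions:

```python
def true_resolution_cost(ai_cost: float, reopen_rate: float, escalation_rate: float,
                         human_cost: float, remediation_cost: float) -> float:
    return (ai_cost
            + reopen_rate * human_cost      # reopened tickets land on humans
            + escalation_rate * human_cost  # escalations consume human time
            + remediation_cost)             # refunds/compensation from AI errors

# Example: $0.99 AI cost, 5% reopens, 10% escalations, $8 human cost,
# $0.20 average remediation -> 0.99 + 0.40 + 0.80 + 0.20 = $2.39
```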

Layer 4 — Vendor and Model Optimisation

Implement:

  • Channel-specific automation strategy

  • High-confidence automation only in high-volume low-risk flows

  • Continuous vendor pricing review

  • Multi-vendor optional architecture (long-term)

Mandatory Reporting

Executive Dashboard Metrics

  • Cost per AI resolution

  • True cost per resolution (quality adjusted)

  • Automation rate

  • Cost vs human baseline

  • Cost per channel

  • Cost per customer segment

Finance Reporting Cadence

  • Weekly operational cost review

  • Monthly Finance + Operations review

  • Quarterly Board-level cost trend review

Guardrail Enforcement Actions

Warning Level

Actions:

  • Increase monitoring

  • Review automation scope

  • Review resolution definition

Critical Level

Actions:

  • Pause expansion of automation coverage

  • Increase human routing

  • Trigger vendor cost review

  • Trigger architecture review

Technology Requirements

Systems must support:

  • Real-time cost telemetry

  • Resolution-level cost tagging

  • Channel attribution

  • Experiment tracking

  • Automated alerting

Forecasting Requirements

Finance models must include:

  • Seasonal volume spikes

  • Product launch impact

  • Customer growth scenarios

  • Vendor pricing change scenarios

Failure Mode Policy

If cost monitoring systems fail:

  • Default to conservative automation coverage

  • Freeze automation expansion

  • Notify Finance and Platform leadership

Consequences

Positive

  • Predictable AI operating cost

  • Prevents silent cost creep

  • Supports CFO confidence in AI scaling

  • Enables ROI-based automation expansion

Negative

  • Requires strong telemetry infrastructure

  • May temporarily slow automation rollout

  • Requires cross-functional cost ownership

Strategic Risk Acceptance Statement

The organisation acknowledges:

  • AI automation cost is variable by design

  • Perfect cost predictability is not possible

  • Cost governance must be continuous, not one-time

Board-Level Y-Statement

In the context of scaling AI-driven customer support using resolution-based pricing models,
we decided to implement resolution cost guardrails and quality-adjusted cost monitoring,
to achieve predictable AI operating cost and sustainable automation ROI,
accepting increased monitoring complexity and potential automation throttling during cost spikes.


Data + Governance ADR — AI Resolution Evaluation Framework

ADR-DATA-0015 — Standardised AI Resolution Measurement Framework

Status: Accepted (Example)
Owner: Head of Data + Customer Operations + AI Governance
Decision Type: Measurement Governance / Vendor Validation / Cost Accuracy
Date: 2026-02-06

Context

AI customer support vendors often report resolution metrics (e.g., “65% automated resolution rate”).
However, resolution definitions vary widely and can be misleading.

Common vendor metric risks:

  • Counting short conversations as resolved

  • Ignoring reopen events

  • Ignoring downstream human escalation

  • Ignoring customer dissatisfaction

  • Measuring only short-term interaction outcomes

Resolution metrics directly affect:

  • Cost calculations

  • Vendor performance evaluation

  • Automation expansion decisions

  • Executive reporting

Decision

We will implement a Multi-Dimensional Resolution Evaluation Framework consisting of:

  1. Strict Resolution Definition

  2. Time-Based Validation Windows

  3. Quality-Adjusted Resolution Metrics

  4. Customer Outcome Validation

  5. Vendor Metric Normalisation

Vendor-reported metrics will not be used directly without internal validation.

Core Measurement Definitions

Primary Metric — True AI Resolution Rate (TARR)

A case counts as resolved only if:

  • Customer issue is closed

  • No reopen within defined validation window

  • No forced human intervention required

  • Customer satisfaction meets threshold

Formula

True AI Resolution Rate =
Valid AI Resolutions
--------------------------------
Total Eligible AI Conversations

Resolution Validation Windows

Short-Term Window

24–48 hours after resolution
Detects immediate reopens or follow-up contact

Medium-Term Window

7 days after resolution
Detects unresolved issue recurrence

Long-Term (Optional)

30 days for high-risk or high-value flows

Quality-Adjusted Resolution Metrics

Adjusted Resolution Score (ARS)

Adjusts raw resolution rate using:

  • Reopen penalty

  • Escalation penalty

  • Refund / compensation penalty

  • Complaint penalty

Example

ARS =
Raw Resolution Rate
– (Reopen Rate × Weight)
– (Escalation Rate × Weight)
– (Complaint Rate × Weight)
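A minimal sketch of both metrics; the penalty weights are illustrative assumptions to be agreed between Data, Finance, and Operations:

```python
def tarr(valid_ai_resolutions: int, eligible_conversations: int) -> float:
    # True AI Resolution Rate: strict definition, validated cases only.
    return valid_ai_resolutions / eligible_conversations

def ars(raw_rate: float, reopen_rate: float, escalation_rate: float,
        complaint_rate: float, w_reopen: float = 1.0,
        w_escalation: float = 0.5, w_complaint: float = 2.0) -> float:
    # Adjusted Resolution Score: raw rate minus weighted quality penalties.
    return (raw_rate
            - reopen_rate * w_reopen
            - escalation_rate * w_escalation
            - complaint_rate * w_complaint)

# Example: a vendor-reported 65% rate with 6% reopens, 8% escalations, and
# 1% complaints adjusts to 0.65 - 0.06 - 0.04 - 0.02 = 0.53
```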

Customer Outcome Validation

Customer Satisfaction Threshold

Resolution only valid if:

  • CSAT ≥ defined threshold
    OR

  • No negative sentiment detected
    OR

  • No complaint generated

Customer Effort Validation

Optional metric:

  • Repeat contact required?

  • Multiple channels used for same issue?

Vendor Metric Normalisation

Vendor claims must be adjusted to internal definitions using:

  • Matching conversation eligibility criteria

  • Matching validation window

  • Matching escalation definitions

  • Matching customer outcome thresholds

Segmentation Requirements

All resolution metrics must be segmented by:

  • Channel (chat, email, voice, social)

  • Customer tier

  • Geography

  • Product or service line

  • Risk category

Resolution Exclusion Rules

Exclude from denominator:

  • Spam / bot interactions

  • Duplicate tickets

  • Non-support interactions

  • Misrouted conversations

Audit Requirements

Every AI resolution must log:

  • Confidence score

  • Risk classification

  • Knowledge sources used (if RAG)

  • Escalation eligibility status

  • Customer sentiment score

Governance Cadence

Weekly

Operational monitoring of resolution quality trends

Monthly

Vendor performance validation

Quarterly

Executive-level automation performance review

Failure Mode Policy

If measurement system fails:

  • Freeze automation expansion decisions

  • Fall back to conservative vendor metrics (flagged as unverified)

  • Trigger data pipeline incident review

Consequences

Positive

  • Prevents inflated automation performance reporting

  • Enables accurate cost modelling

  • Enables fair vendor comparison

  • Supports regulatory and audit review

Negative

  • Requires advanced data instrumentation

  • May reduce headline automation numbers initially

  • Requires cross-team metric alignment

Strategic Risk Acceptance Statement

The organisation acknowledges:

  • Resolution is a business outcome, not a conversation event

  • Vendor metrics will always differ from internal metrics

  • Measurement must evolve as AI capabilities evolve

Board-Level Y-Statement

In the context of evaluating AI customer support performance and vendor claims,
we decided to implement a strict multi-dimensional resolution evaluation framework,
to achieve accurate automation ROI measurement and customer outcome protection,
accepting increased data engineering and analytics complexity.

RedCloud — Extended Strategic + Platform ADR Set

ADR-007 — Open Commerce Network Architecture (Multi-Sided Trust Graph)

Status: Accepted
Owner: CTO + Chief Product Officer

Context

RedCloud is not just a marketplace — it is a trust network across:

  • Brands

  • Distributors

  • Micro-retailers

  • Financial partners

  • Logistics providers

Traditional marketplaces optimise transactions.
RedCloud must optimise trust-weighted trade flows.

Decision

The platform will model trade as a Trust Graph (sketched below):

Nodes:

  • Trading entities

  • Warehouses

  • SKUs

  • Payment identities

Edges:

  • Trade relationships

  • Payment reliability

  • Fulfilment reliability

  • Data quality contribution
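A minimal sketch of the trust graph primitives; the edge attributes mirror the lists above, and the scoring weights are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class TradeEdge:
    src: str                       # e.g. a distributor entity ID
    dst: str                       # e.g. a micro-retailer entity ID
    payment_reliability: float     # 0.0 - 1.0
    fulfilment_reliability: float  # 0.0 - 1.0
    data_quality: float            # 0.0 - 1.0

    def trust_score(self) -> float:
        return (0.5 * self.payment_reliability
                + 0.3 * self.fulfilment_reliability
                + 0.2 * self.data_quality)

@dataclass
class TrustGraph:
    edges: list[TradeEdge] = field(default_factory=list)

    def partner_scores(self, entity_id: str) -> dict[str, float]:
        # Trust-weighted view of everyone an entity trades with.
        return {e.dst: e.trust_score() for e in self.edges if e.src == entity_id}
```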

Rationale

In emerging markets:
Trust replaces formal infrastructure.

Trust must be:

  • Computable

  • Observable

  • Monetisable

Consequences

Positive:

  • Enables risk-aware recommendations

  • Enables financial scoring products

  • Enables fraud and anomaly detection

Negative:

  • Requires complex graph infrastructure

  • Requires strong entity resolution capability

ADR-008 — Offline-First Architecture for Emerging Market Connectivity

Status: Accepted

Context

Users operate in:

  • Intermittent connectivity environments

  • Low-cost Android devices

  • Variable power reliability

Decision

The Red101 mobile platform will be Offline-First (sketched below):

  • Local caching of product catalog

  • Local order queuing

  • Deferred sync

  • Conflict resolution engine
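A minimal sketch of local order queuing with deferred sync; the file-based queue and transport interface are illustrative assumptions, and server-side writes are assumed idempotent:

```python
import json
import time

class OfflineOrderQueue:
    def __init__(self, path: str = "orders.queue"):
        self.path = path

    def enqueue(self, order: dict) -> None:
        order["queued_at"] = time.time()  # timestamp feeds conflict resolution
        with open(self.path, "a") as f:
            f.write(json.dumps(order) + "\n")  # append survives app restarts

    def sync(self, transport) -> int:
        """Flush queued orders when connectivity returns; returns count sent."""
        try:
            with open(self.path) as f:
                lines = f.readlines()
        except FileNotFoundError:
            return 0
        for line in lines:
            transport.send(json.loads(line))  # safe to retry if idempotent
        open(self.path, "w").close()          # clear only after a full flush
        return len(lines)
```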

Rationale

If the system requires constant connectivity → it fails in core markets.

Consequences

Positive:

  • Higher adoption

  • More reliable order capture

  • Higher data completeness

Negative:

  • Sync conflict complexity

  • Requires distributed consistency model

ADR-009 — Payments Embedded Into Trade Flow (RedPay as Native Rail)

Status: Accepted

Context

Cash-heavy trade causes:

  • Theft risk

  • Settlement delay

  • Credit opacity

Decision

RedPay will be embedded into:

  • Order confirmation

  • Delivery verification

  • Trade credit scoring

Rationale

Payments data is the strongest trust signal in informal economies.

Consequences

Positive:

  • Enables credit scoring

  • Enables working capital products

  • Improves trade velocity

Negative:

  • Regulatory licensing complexity

  • Higher compliance overhead

ADR-010 — Market Intelligence Data Monetisation Boundary

Status: Accepted

Context

Trade data can be monetised via:

  • Demand indices

  • Price indices

  • Supply chain risk scoring

But must protect:

  • Individual retailer identity

  • Competitive distributor data

Decision

Market intelligence products will only use:

  • Aggregated

  • Anonymised

  • Statistically thresholded datasets

Rationale

Trust is the core product.
Data monetisation cannot erode ecosystem trust.

Consequences

Positive:

  • Enables financial partnerships

  • Enables data revenue streams

Negative:

  • Limits raw data monetisation speed

ADR-011 — Trust Score as First-Class Platform Primitive

Status: Accepted

Context

Trust cannot be an afterthought metric.
It must be embedded in:

  • Data

  • Recommendations

  • Payments

  • Identity

Decision

Every core entity will carry Trust attributes:

For Data:

  • Freshness

  • Coverage

  • Source reliability

For Entities:

  • Payment behaviour

  • Fulfilment reliability

  • Trade consistency

Rationale

In volatile markets, confidence > prediction accuracy.

Consequences

Positive:

  • Enables explainable AI

  • Enables safer automation

Negative:

  • Requires heavy metadata infrastructure

ADR-012 — RAG + Signal Verification Chain (Anti-Hallucination AI Architecture)

Status: Accepted

Context

Emerging market trade has high noise:

  • Missing transactions

  • Partial stock reporting

  • Informal substitutions

Decision

All AI assistants must use the following chain (sketched below):

Step 1 — Retrieve canonical data
Step 2 — Verify signal completeness
Step 3 — Generate answer
Step 4 — Attach Trust explanation
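A minimal sketch of the four-step chain; store, verifier, and llm are hypothetical clients, and the completeness threshold is an illustrative assumption:

```python
def answer_with_verification(query: str, store, verifier, llm) -> dict:
    signals = store.retrieve(query)                # Step 1: canonical data
    completeness = verifier.completeness(signals)  # Step 2: verify signal coverage
    if completeness < 0.6:                         # refuse rather than guess
        return {"answer": None,
                "trust": {"completeness": completeness,
                          "note": "insufficient trusted signals"}}
    answer = llm.generate(query, context=signals)  # Step 3: generate from signals
    return {"answer": answer,                      # Step 4: attach Trust explanation
            "trust": {"sources": [s.id for s in signals],
                      "completeness": completeness}}
```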

Rationale

LLM = Translator of trade signals, not predictor of truth.

Consequences

Positive:

  • Fully auditable AI

  • High regulatory readiness

Negative:

  • Higher latency vs raw generation

ADR-013 — Decision Acceleration vs Decision Automation Boundary

Status: Accepted

Context

Full automation is dangerous in:

  • Cash-constrained retailers

  • Volatile supply chains

Decision

System defaults to:
Decision Suggestion → Human Confirmation → Optional Auto Mode (High Trust Only)

Rationale

RedCloud builds confidence amplification, not blind automation.

Consequences

Positive:

  • Higher adoption

  • Lower catastrophic decision risk

Negative:

  • Slower automation scaling

ADR-014 — Trade Identity and Entity Resolution Strategy

Status: Accepted

Context

Emerging market trade includes:

  • Multiple names per shop

  • Shared phone numbers

  • Informal distributor networks

Decision

Implement probabilistic entity resolution using:

  • Transaction patterns

  • Device fingerprint

  • Payment behaviour

  • Geo clustering

Rationale

Identity = foundation of trust scoring.

Consequences

Positive:

  • Stronger fraud detection

  • Better credit scoring

Negative:

  • Requires continuous model retraining

ADR-015 — Heuristic → ML → Autonomous Progression Model

Status: Accepted

Context

ML without trustable signals reduces adoption.

Decision

AI product maturity ladder:

Phase 1 — Heuristic Insight
Phase 2 — ML Augmented Insight
Phase 3 — ML Recommendation
Phase 4 — Assisted Automation
Phase 5 — Conditional Automation

Rationale

Adoption follows trust curve, not model sophistication.

Consequences

Positive:

  • Faster market penetration

  • Easier explainability

Negative:

  • Slower theoretical AI capability growth

ADR-016 — Regional Model Localisation Strategy

Status: Accepted

Context

Trade behaviours differ drastically by region:

  • Credit norms

  • Inventory risk tolerance

  • Pricing volatility

Decision

Models trained and calibrated per:

  • Country

  • Region

  • Trade cluster

Rationale

Global models erase local signal nuance.

Consequences

Positive:

  • Higher prediction quality

  • Higher adoption

Negative:

  • Higher MLOps overhead

Board-Level Meta Y-Statement

In the context of digitising fragmented global FMCG trade in emerging markets,
we decided to build a trust-first Open Commerce platform combining canonical data, explainable AI, and embedded payments,
to achieve reliable, scalable, and monetisable digital trade infrastructure,
accepting higher platform complexity and slower short-term AI automation velocity.