AI Engineering Patterns

Use this guide to identify which patterns are most relevant to your current situation. Start with the statement that best describes your primary concern, then pick the answer that matches your situation.

I need to reduce costs

How are you currently calling LLMs?

Direct API calls with no intermediary → Start with LLM Gateway Pattern
Single provider, single model for everything → Start with Model Router Pattern
Seeing many repeated or similar queries → Start with Semantic Caching
Long context windows eating your budget → Start with Token Budget Pattern
Most queries are simple but you pay for maximum context on every request → Start with Cascading Context Assembly

I need to improve reliability

What kind of failures are you seeing?

Provider outages causing downtime → Start with Circuit Breaker for LLMs
Quality degradation after prompt changes → Start with Prompt Canary Deployment and LLM-as-Judge
Silent quality drops nobody notices → Start with Embedding Drift Detector and Span-Level Tracing
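
As a rough illustration of the provider-outage case, the sketch below shows the general circuit breaker idea: stop calling a failing provider after repeated errors, route to a fallback, and retry the primary after a cooldown. This is a minimal sketch, not the pattern's reference implementation. The names CircuitBreaker, call_with_fallback, failure_threshold, and cooldown_seconds are assumptions made for this example, and primary and fallback stand in for whatever provider clients you use.

```python
import time


class CircuitBreaker:
    """Minimal circuit breaker: opens after repeated failures, retries after a cooldown."""

    def __init__(self, failure_threshold=3, cooldown_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # hypothetical default
        self.cooldown_seconds = cooldown_seconds    # hypothetical default
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow_request(self):
        if self.opened_at is None:
            return True
        # After the cooldown, let a trial request through (half-open state).
        return self.clock() - self.opened_at >= self.cooldown_seconds

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            # Open (or re-open) the circuit and restart the cooldown.
            self.opened_at = self.clock()


def call_with_fallback(prompt, primary, fallback, breaker):
    """Try the primary provider unless its circuit is open; otherwise use the fallback."""
    if breaker.allow_request():
        try:
            result = primary(prompt)
            breaker.record_success()
            return result
        except Exception:
            breaker.record_failure()
    return fallback(prompt)
```

In this sketch a failed trial request after the cooldown re-opens the circuit immediately, so a provider that is still down costs one request per cooldown window rather than one per user query.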

I need better retrieval / RAG

What does your knowledge base look like?

Complex relational data, multi-hop questions → Start with GraphRAG Pattern
Need both keyword and semantic matching → Start with Hybrid Search Pattern
Retrieval returns stale or outdated content → Start with Retrieval Freshness Watermark
Retrieval returns redundant/duplicate chunks → Start with Semantic Deduplication
Upstream data keeps breaking your pipeline → Start with Data Contract Pattern

I need to handle security and compliance

What is your primary concern?

Prompt injection and jailbreak attempts → Start with Input Sanitization Pattern
Agents calling external tools that return untrusted data → Start with Tool Output Firewall
Need standardized model documentation for compliance → Start with Model Card Pattern

I need observability

What can you not see today?

What is happening inside multi-step chains → Start with Span-Level Tracing Pattern
Whether retrieval quality is silently degrading → Start with Embedding Drift Detector

I am building agent systems

Agents combine multiple patterns. A typical production agent stack includes:

LLM Gateway Pattern for routing and observability
Input Sanitization for the front door
Tool Output Firewall for the side door (tool results re-entering context)
Circuit Breaker for LLMs for provider resilience
Token Budget Pattern for runaway prevention
Span-Level Tracing for debugging multi-step flows
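
To make the layering above concrete, here is a minimal sketch of one agent step that applies the layers in order: budget check, input sanitization, tool call with its output passed through a firewall, then the model call, with a span recorded at each stage. Everything here is an assumption for illustration: AgentContext, run_step, the flat step_cost, and the placeholder sanitization and firewall logic are hypothetical stand-ins, and call_model represents a call that would go through your gateway and circuit breaker.

```python
from dataclasses import dataclass, field


@dataclass
class AgentContext:
    token_budget: int                          # remaining tokens for this run
    spans: list = field(default_factory=list)  # trace of steps, in order


def sanitize_input(text):
    # Placeholder for Input Sanitization: drops non-printable characters only.
    return "".join(ch for ch in text if ch.isprintable())


def firewall_tool_output(text, max_chars=2000):
    # Placeholder for Tool Output Firewall: truncate and label as untrusted data.
    return "[untrusted tool output]\n" + text[:max_chars]


def run_step(user_input, call_model, call_tool, ctx, step_cost=100):
    """One agent step with each layer applied in order."""
    # Token Budget: refuse to continue once the run has spent its allowance.
    if ctx.token_budget < step_cost:
        ctx.spans.append("budget_exceeded")
        raise RuntimeError("token budget exhausted")
    ctx.token_budget -= step_cost

    # Front door: clean the user input before anything else sees it.
    clean = sanitize_input(user_input)
    ctx.spans.append("sanitize_input")

    # Side door: tool results are filtered before re-entering the context.
    tool_result = firewall_tool_output(call_tool(clean))
    ctx.spans.append("tool_call")

    # Model call (assumed to go through the gateway and circuit breaker).
    answer = call_model(clean + "\n" + tool_result)
    ctx.spans.append("model_call")
    return answer
```

The point of the sketch is the ordering, not the placeholder logic: the budget check runs before any work is done, and tool output never reaches the model without passing through the firewall.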

I need graph-based intelligence

What are you trying to achieve with graphs?

Multi-hop questions that connect information across documents → Start with GraphRAG
Non-linear reasoning with branching and merging approaches → Start with Graph of Thoughts
Deduplicating entities across multiple data sources → Start with Entity Resolution Graph
Building a knowledge graph for your domain → Start with GraphRAG for the retrieval layer and Entity Resolution Graph for clean entity data

I need to evaluate and test AI quality

What is your primary evaluation challenge?

Need automated quality scoring at scale → Start with LLM-as-Judge
Quality is degrading after prompt or model changes → Combine LLM-as-Judge with Prompt Canary Deployment
Need a quality gate before deploying prompt changes → Use Prompt Canary Deployment with LLM-as-Judge as the scoring mechanism
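
One way to picture the quality gate described above: score a sample of responses from the current prompt and from the canary prompt with your judge, then promote the canary only if its average score has not dropped by more than a tolerance. This is a minimal sketch under stated assumptions. The function name canary_gate, the max_drop and min_samples defaults, and the assumption that judge scores are floats between 0 and 1 are all hypothetical, not taken from the pattern descriptions.

```python
from statistics import mean


def canary_gate(baseline_scores, canary_scores, max_drop=0.05, min_samples=20):
    """Return True if the canary prompt may be promoted.

    Scores are assumed to be LLM-as-Judge ratings in [0, 1], one per response.
    """
    if len(baseline_scores) < min_samples or len(canary_scores) < min_samples:
        return False  # not enough evidence; keep the current prompt
    # Allow promotion only if the canary's mean score is within max_drop
    # of the baseline's mean score.
    return mean(canary_scores) >= mean(baseline_scores) - max_drop
```

A comparison of means is the simplest possible gate; a production setup would likely also look at variance or per-category scores before promoting a change.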
