AWS Bedrock Architecture Diagram: The Complete Visual Guide (2026)
How to draw an AWS Bedrock architecture diagram. Covers Bedrock Runtime, Knowledge Bases, Agents, Guardrails, and the most common enterprise AI deployment patterns — with prompt templates to generate diagrams in seconds.
An AWS Bedrock architecture diagram shows how an enterprise GenAI application is built on top of Amazon Bedrock — the managed service that provides access to foundation models from Anthropic, Amazon, Meta, Mistral, Cohere, and others through a single unified API. Bedrock has become the default AI infrastructure layer for teams already operating on AWS: it eliminates the need to manage model infrastructure, provides native integrations with S3, Lambda, and IAM, and handles enterprise requirements like VPC endpoints, PrivateLink, CloudTrail logging, and data residency.
Diagramming your Bedrock setup is essential for architecture reviews, cost governance (different models have dramatically different per-token pricing), security audits (data never leaves your AWS account with Bedrock's private API), and for onboarding engineers who need to understand which model handles which workload and how the retrieval and agent layers fit together.
The core components of an AWS Bedrock architecture
Bedrock Runtime API
The Bedrock Runtime is the core inference layer. It exposes two primary operations: InvokeModel for synchronous, single-turn completions and InvokeModelWithResponseStream for streaming output. Every model on Bedrock shares the same API surface — your application code doesn't change when you swap Claude for Llama or Titan. In your diagram, show the Bedrock Runtime as the central API layer that your application services call, with the specific model IDs annotated (e.g., anthropic.claude-opus-4-8-20260801-v1:0 for the highest-capability tasks, anthropic.claude-haiku-4-5-20251001-v1:0 for high-volume classification).
Amazon Bedrock Knowledge Bases
Knowledge Bases for Amazon Bedrock is the managed RAG layer. It handles document ingestion, chunking, embedding (using Amazon Titan Embeddings or Cohere Embed), vector storage (OpenSearch Serverless, Aurora pgvector, or Pinecone), and retrieval — all without you managing the pipeline. When a user query arrives, Knowledge Bases retrieves the top-K relevant chunks from your vector store, injects them into the prompt as context, and invokes the foundation model. In your diagram, show Knowledge Bases as a component that receives user queries from your application, retrieves from the connected vector store, and passes augmented prompts to the Bedrock Runtime. Connect it to your S3 data sources and show the sync schedule for keeping the index current.
Agents for Amazon Bedrock
Agents for Amazon Bedrock implements the ReAct (Reason + Act) loop for agentic workflows. An agent receives a user goal, reasons about what actions to take, calls Lambda functions or Bedrock Knowledge Bases as tools, observes the results, and iterates until it produces a final response. You define the agent's capabilities through an action group — an OpenAPI schema that describes available tools mapped to Lambda function ARNs. Agents handle conversation memory natively through session state. Your diagram should show: the agent receiving user input from your app, the ReAct loop with the foundation model, the action group with each Lambda tool, and any Knowledge Bases the agent can query for context.
Guardrails for Amazon Bedrock
Guardrails applies content filtering, PII detection and redaction, grounding checks (hallucination detection), topic blocking, and word filtering across both input and output. Guardrails sits between your application and the Bedrock Runtime — apply it to every model invocation by passing a guardrailIdentifier in your API call. In your diagram, represent Guardrails as a bidirectional policy layer on the path between your application and the Bedrock Runtime, with annotations for which policies are active (content filter threshold, denied topics, PII types to redact).
Model evaluation and customization
For teams that need custom behavior, Bedrock supports fine-tuning (continued pre-training and instruction fine-tuning on Amazon Nova and Titan models) and model evaluation (automated metrics and human review across a test dataset). The fine-tuned model is stored as a custom model in your account and invoked through the same Runtime API. Your diagram should show custom models as a separate node in the model layer, connected to the training data in S3 and the evaluation job outputs in CloudWatch.
Networking and security
By default, Bedrock API calls traverse the public internet using HTTPS. For compliance-sensitive workloads, use a VPC endpoint (AWS PrivateLink) to route all Bedrock traffic through your VPC without touching the public internet. All Bedrock API calls are logged to CloudTrail, and model invocation logging (optional) writes input/output to CloudWatch Logs and S3. IAM policies control which principals can invoke which models — you can restrict teams to specific model families or deny invocations above a token threshold. Your architecture diagram must show: the VPC boundary if using PrivateLink, the IAM role used by your application services, and the CloudTrail/CloudWatch observability path.
Common AWS Bedrock architecture patterns
Pattern 1: Direct API integration for application teams
The simplest pattern: an application service (Lambda, ECS container, or EC2) calls the Bedrock Runtime directly with an IAM role. No Knowledge Bases, no agents — just raw model inference wrapped in your own prompt engineering logic. Use this for: summarization, classification, translation, code generation, and any task where your application already provides all necessary context in the prompt. Guardrails should always be applied even in this simple pattern.
Pattern 2: Managed RAG with Knowledge Bases
The most common enterprise pattern: a document corpus lives in S3 (PDFs, Word docs, HTML pages, Confluence exports), Knowledge Bases indexes it using Titan Embeddings into an OpenSearch Serverless collection, and your application calls the RetrieveAndGenerate API. No custom vector pipeline to manage, no chunking logic to tune. Best for internal knowledge bases, customer support copilots, and documentation assistants where the corpus updates weekly or less.
Pattern 3: Agentic workflows with Agents for Bedrock
For tasks that require multi-step reasoning and taking actions in external systems: an Agents for Bedrock agent receives a user request, reasons about a plan using Claude or Nova, calls Lambda tools (CRM lookup, inventory check, order creation), and synthesizes a final answer. The agent session handles conversation context automatically. Best for: customer service automation, IT helpdesk workflows, data retrieval from heterogeneous systems, and approval routing.
Pattern 4: Multi-model routing for cost and quality optimization
Large-scale deployments route different request types to different models based on complexity, latency, and cost requirements. A routing Lambda classifies incoming requests — simple queries go to Haiku or Nova Micro (cents per million tokens), complex analysis goes to Opus or Nova Pro (dollars per million tokens). The routing layer logs the classification decision and the final token cost to CloudWatch for FinOps visibility. Model fallback handles quota throttling.
Prompt templates for AWS Bedrock diagrams
Basic Bedrock RAG application
Bedrock Agent for customer service automation
Multi-model cost-optimized architecture
Private Bedrock deployment with VPC endpoint
AWS Bedrock component reference
| Component | What it does | Key diagram annotation |
|---|---|---|
| Bedrock Runtime | Model inference API (InvokeModel / streaming) | Model ID + region |
| Knowledge Bases | Managed RAG: ingest → embed → retrieve → generate | Data source (S3), vector store type, sync schedule |
| Agents for Bedrock | Multi-step ReAct agent with Lambda tool calling | Action group name, Lambda ARNs, session state |
| Guardrails | Content filter, PII redaction, grounding checks | Applied to: input / output / both |
| Model Evaluation | Automated + human evaluation of model outputs | Eval dataset in S3, metric type |
| Custom Models | Fine-tuned Nova/Titan models stored in your account | Base model ID, training data S3 path |
| VPC Endpoint | PrivateLink for Bedrock API — no public internet | Endpoint policy (model allow-list) |
| Invocation Logging | Input/output log to CloudWatch + S3 | Retention policy, destination bucket |
What a good Bedrock architecture diagram must show
- Model selection: Label every model invocation with the specific model ID. Different models have 10–100× cost and capability differences — the model choice is a first-class architectural decision, not an implementation detail.
- IAM trust boundaries: Show which IAM roles have
bedrock:InvokeModelpermission and which model IDs they're restricted to. Overly broad IAM on Bedrock can allow any service to invoke any model, creating unexpected cost exposure. - Guardrails placement: Make it explicit whether Guardrails is applied on the input, the output, or both — and which policies are active. This is the primary safety audit surface.
- Data residency: Show the AWS region of each Bedrock component. Not all models are available in all regions, and data residency requirements determine which region you can use.
- Observability path: Show where CloudTrail logs, model invocation logs, and CloudWatch metrics flow. This is required for cost attribution and compliance audits.
- Knowledge Base sync: Show the data source (S3 bucket), the sync trigger (EventBridge schedule or manual), and the vector store. Stale indexes are a common source of incorrect AI responses.
Frequently asked questions about AWS Bedrock architecture
What is an AWS Bedrock architecture diagram?
An AWS Bedrock architecture diagram is an architecture diagram that shows how an application uses Amazon Bedrock to access foundation models and managed AI capabilities. It depicts the Bedrock Runtime API, Knowledge Bases, Agents, Guardrails, and the AWS services they integrate with (Lambda, S3, OpenSearch, IAM, CloudTrail, VPC endpoints). It is the standard documentation artifact for enterprise teams building GenAI applications on AWS.
What is the difference between Amazon Bedrock and SageMaker for AI architecture?
Amazon Bedrock provides access to pre-trained foundation models through a managed API — no infrastructure to provision or models to deploy. SageMaker is a full MLOps platform for training, deploying, and serving custom models you own. In practice: use Bedrock when you want to build applications on top of existing foundation models (Claude, Nova, Llama, Mistral); use SageMaker when you need to fine-tune models on your own data at scale, run custom training jobs, or serve open-source models that aren't available on Bedrock. Many architectures use both: Bedrock for general-purpose LLM calls, SageMaker for domain-specific models.
How do I diagram AWS Bedrock Agents?
For a Bedrock Agents diagram, show the agent as a reasoning loop: the user sends a goal → the agent invokes the foundation model with the goal and available action descriptions → the model returns a tool call → the agent executes the Lambda tool → the result is fed back to the model → the loop repeats until a final answer is produced. Connect action groups (Lambda functions) and any Knowledge Bases the agent can query as separate nodes. Annotate the IAM role the agent assumes when invoking Lambda. Show session state storage if you're persisting conversation history beyond a single turn.
Related guides: RAG architecture diagrams, AWS architecture diagrams, LLM architecture diagrams, and AI agent architecture diagrams.
Ready to try it yourself?
Start Creating - Free