AI Coding Agent Architecture: Claude Code, Cursor, and GitHub Copilot Explained (2026)

How AI coding agents work under the hood — the tool loop, context management, and system architecture behind Claude Code, Cursor, and GitHub Copilot Agent. With AI prompt templates for diagramming each.

Ryan·Senior AI Engineer

·Last updated June 12, 2026

AI coding agents have moved from demo to default workflow in 2026. Claude Code, Cursor Agent, and GitHub Copilot Agent are now used daily by millions of engineers — yet most users treat them as black boxes. Understanding the architecture underneath helps you write better prompts, debug unexpected behavior, and design your own coding agents when you need something custom.

All three tools share a core loop: an LLM that reads context, decides which tool to call, receives the tool's output, and iterates until the task is complete or a stop condition fires. The differences lie in how context is assembled, which tools are exposed, where state is stored, and how the human stays in the loop.

The universal AI coding agent loop

Every coding agent — regardless of vendor — runs a variant of the same agentic loop:

Context assembly: The agent reads the task description, relevant files, recent conversation history, and tool definitions into the LLM context window.
LLM reasoning: The model decides the next action — write a file, run a shell command, search the codebase, read a file, ask the user a clarifying question, or declare the task complete.
Tool execution: The chosen tool runs and returns its output (file contents, command stdout, search results, etc.).
State update: Tool output is appended to the context. The model re-evaluates with the new information.
Loop or stop: Steps 2–4 repeat until the model emits a final response or a stop signal — task complete, user approval needed, or error limit reached.

Claude Code architecture

Claude Code is a terminal-native agentic coding tool built by Anthropic. It runs directly in your shell and uses the Claude API (claude-opus-4-8 or claude-sonnet-4-6) as its reasoning engine.

Core components

Shell process: Claude Code runs as a Node.js process in your terminal. It does not require an IDE — it works in any shell environment, including CI containers and remote servers.
Tool layer: Claude Code exposes a filesystem tool (read, write, edit files), a bash tool (arbitrary shell commands), a web search tool, and a browser tool for reading URLs. Tools are implemented as native Node.js functions that call the Claude API with tool_use blocks.
MCP client: Claude Code is a first-class MCP (Model Context Protocol) client. It can connect to any MCP server — GitHub MCP, filesystem MCP, database MCP, custom internal servers — and expose those servers' tools to Claude automatically. This makes it extensible without code changes.
Permission model: Each tool call can require explicit user approval. Destructive operations (rm, overwriting files, git push) trigger an interactive approval prompt by default. Permissions can be pre-approved in .claude/settings.json.
CLAUDE.md context injection: When Claude Code starts, it reads CLAUDE.md files from the repo root and parent directories, injecting them as system context. This is how teams encode project conventions, codebase structure, and deployment procedures into the agent.
Sub-agent spawning: For large or parallelizable tasks, Claude Code can spawn sub-agents — each a separate Claude API call with its own context and tool access — and coordinate their results. This is the Task tool exposed to Claude.

Diagram prompt: Claude Code architecture

"Claude Code agentic coding architecture. A developer runs Claude Code in a terminal shell. The Claude Code Node.js process sends requests to the Anthropic API (claude-opus-4-8 or claude-sonnet-4-6). Claude responds with tool_use blocks. The tool layer executes: (1) Filesystem tool — reads and writes files in the local repo; (2) Bash tool — executes shell commands, runs tests, runs builds; (3) Web fetch tool — reads URLs and searches the web; (4) Task tool — spawns sub-agents with their own Claude API calls. Claude Code also acts as an MCP client, connecting to MCP servers (GitHub MCP, Supabase MCP, custom internal servers) and exposing those tools to Claude. CLAUDE.md files are read at startup and injected as system context. Tool call results flow back into the context window. A permission gate prompts the developer for approval before destructive operations. Show the developer, the Node.js process, the Anthropic API, the four core tools, the MCP servers, and the CLAUDE.md context injection path."

Cursor Agent architecture

Cursor is an IDE fork of VS Code with deep AI integration. Its Agent mode turns the AI from an autocomplete assistant into a loop that can read, write, and run code across an entire codebase.

Core components

Codebase indexing: Cursor indexes the entire codebase into a vector store on disk. When the agent needs to find relevant code, it embeds the query and does approximate nearest-neighbor search — returning the most relevant snippets without reading every file. This is how Cursor handles large codebases that exceed a single context window.
Context assembly: Cursor assembles context from: the user's prompt, explicitly @-mentioned files, the vector-retrieved relevant snippets, recent conversation turns, and the current file. The context budget is managed automatically.
Model routing: Cursor supports multiple model backends — Claude claude-opus-4-8, GPT-4o, Gemini Pro, and Cursor's own cursor-small model for fast edits. The user or workspace config selects which model handles which task.
Tool set: Cursor Agent's tools include: file read/write/create/delete, terminal command execution, web search (via a built-in browser), and linter/type-checker invocation. In 2026, Cursor supports MCP server connections, allowing custom tool extensions.
Diff application: Cursor does not replace files atomically — it generates diffs and applies them via a built-in merge engine that handles conflicts and shows changes inline before accepting.

Diagram prompt: Cursor Agent architecture

"Cursor Agent IDE architecture. A developer uses the Cursor IDE (VS Code fork). In Agent mode, the developer's request flows into a context assembler that combines: (1) the user prompt; (2) @-mentioned files; (3) vector search results from the local codebase index (embedding model + ANN search); (4) conversation history. The assembled context goes to a model router that selects Claude claude-opus-4-8, GPT-4o, or Gemini Pro based on workspace settings. The LLM returns a plan with tool calls. The tool executor runs: file read/write/create/delete operations on the local repo; terminal commands via a built-in shell; web search; linter invocations. Tool results are appended to context and the loop repeats. File changes are applied as diffs through a merge engine that shows inline diffs to the developer before acceptance. Show the codebase index as a persistent vector store, the model router as a decision node, and the diff application as the final step before code lands in the repo."

GitHub Copilot Agent architecture

GitHub Copilot Agent (launched in 2025) runs as a GitHub Actions workflow triggered by issue assignments or pull request comments. Unlike Claude Code and Cursor, it operates autonomously in a cloud CI environment rather than on a developer's local machine.

Core components

Trigger mechanism: Copilot Agent is invoked when a GitHub issue is assigned to "Copilot" or when a comment in a PR or issue includes @copilot with a task description.
GitHub Actions runner: The agent runs inside a GitHub Actions workflow on a fresh Ubuntu runner. It has full access to the repository, can install dependencies, run tests, and interact with the GitHub API.
Model: Copilot Agent uses GPT-4o (and optionally Claude claude-opus-4-8 via Copilot Extensions) as its reasoning engine via the GitHub Models API.
GitHub API tools: Copilot Agent has native access to GitHub API tools — read/write issues, comments, PRs, branches, files in the repo, and CI check statuses. This is its primary integration surface and is more tightly integrated than Claude Code or Cursor.
Output: pull request: When the task is complete, Copilot Agent opens a pull request with its changes, a description of what it did, and a test plan. The developer reviews and merges — the human remains in control of what lands in main.

Diagram prompt: GitHub Copilot Agent architecture

"GitHub Copilot Agent cloud architecture. A developer assigns a GitHub issue to Copilot. This triggers a GitHub Actions workflow on a cloud Ubuntu runner. The Copilot Agent process runs on the runner. It sends the issue description plus codebase context to the GitHub Models API (GPT-4o). The LLM responds with a plan and tool calls. Tools include: (1) GitHub API — read/write issue comments, PR creation, file commits, branch management; (2) Shell executor — runs tests, linters, build commands; (3) File read/write — edits source files in the cloned repo. The agent iterates: read repo → plan → implement → run tests → fix failures → repeat until tests pass. On success, the agent opens a Pull Request against the target branch with a description of the changes. The developer reviews the PR, requests changes, or merges. Show the trigger path from issue assignment through Actions workflow to the agent, the GitHub Models API, the tool loop, and the final PR output."

Architectural comparison

Dimension	Claude Code	Cursor Agent	GitHub Copilot Agent
Execution environment	Local terminal	Local IDE	GitHub Actions (cloud)
Primary model	Claude claude-opus-4-8 / claude-sonnet-4-6	Claude / GPT-4o / Gemini (selectable)	GPT-4o (via GitHub Models)
Codebase indexing	On-the-fly file reads	Local vector index (ANN search)	Full repo clone on runner
MCP support	Native (first-class MCP client)	Yes (2026)	Via Copilot Extensions
Extensibility	Any MCP server, custom hooks	MCP servers, VS Code extensions	Copilot Extensions (custom agents)
Human-in-the-loop	Per-tool-call approval prompts	Inline diff review before applying	PR review before merge
Best for	Full-codebase tasks, custom toolchains	IDE-centric multi-file edits	Async issue-driven tasks, CI integration

Context window management: the key architectural challenge

The hardest problem in coding agent design is context management. A large production codebase can have millions of tokens of source code — far more than any LLM context window. Each tool handles this differently:

Claude Code reads files lazily — the agent reads only the files it decides are relevant, using search and grep tools to navigate rather than loading the full codebase. The model determines relevance through reasoning, not pre-indexing.
Cursor pre-indexes the codebase into a local vector store, so relevant snippets are retrieved semantically before context is assembled. This shifts the relevance problem from the LLM's reasoning to the embedding model and ANN search quality.
GitHub Copilot Agent works on a full git clone in a clean CI environment. It has access to the entire repo but must still manage what it puts in the LLM context window. It uses the repo structure and git history to guide what to read.

Each approach has tradeoffs: lazy reads are accurate but may miss relevant context; vector indexing is fast but can miss semantically unusual code; full-repo access needs good navigation heuristics to avoid context bloat.

Frequently asked questions

What is the difference between an AI coding assistant and an AI coding agent?

An AI coding assistant (like early Copilot or Codeium) responds to a single prompt and produces a single output — it autocompletes a function, answers a question, or generates a snippet. An AI coding agent runs a multi-step loop: it reads files, writes files, runs commands, observes the results, and iterates until a task is complete. The key distinction is that agents take actions and observe feedback; assistants respond to a single turn.

Can I build my own AI coding agent?

Yes. The simplest approach is the Anthropic API with tool use enabled: define a set of tools (file read, file write, bash exec), pass them to Claude with the tools parameter, and run a loop that calls the API, executes whichever tool Claude requests, and feeds the result back. The Claude Code SDK provides a higher-level abstraction for building agents that integrate with the Claude Code environment. For complex multi-agent systems, LangGraph or the Anthropic Agent SDK provide graph-based orchestration.

What is MCP and why does it matter for coding agents?

MCP (Model Context Protocol) is an open protocol from Anthropic that standardizes how AI agents connect to external tools and data sources. Instead of writing custom integration code for every tool, you run an MCP server that exposes the tool in a standard schema — and any MCP client (Claude Code, Cursor, any MCP-compatible agent) can discover and use it automatically. This is the same idea as OpenAPI for REST APIs, applied to AI tools. Claude Code is a native MCP client; connecting it to a GitHub MCP server, for example, gives the agent the full GitHub API as tools without any code changes.

How do I diagram an AI coding agent architecture?

Describe the agent's components in plain English — the LLM, the tool set, the context assembly pipeline, the execution environment, and the human approval checkpoints — and paste the description into ArchitectureDiagram.ai. Use the prompt templates above as starting points. For MCP-heavy setups, explicitly label each MCP server and the tools it exposes.

Ready to try it yourself?

Start Creating - Free