Durable Execution Architecture: Temporal, Inngest, and Restate Diagrams (2026)

How durable execution engines work — the architecture behind Temporal, Inngest, and Restate. Learn how to diagram workflows, activities, retries, signals, and the event loop that makes code fault-tolerant.

Ryan·Senior AI Engineer

Distributed systems fail. Networks partition, processes crash, and third-party APIs return errors at the worst moments. For decades, engineers solved this by hand — writing retry loops, dead-letter queues, idempotency keys, and state machines in application code. Durable execution engines take a different approach: they persist the entire execution state of a workflow automatically, so a process can resume exactly where it left off after any failure.

Temporal, Inngest, and Restate are the three leading durable execution platforms in 2026. Each takes a different architectural approach to the same core problem. This guide explains how each works, how to diagram their architectures, and when to use each.

What is durable execution?

Durable execution is a programming model where the runtime automatically persists workflow state after every step. If the process crashes, the runtime replays the workflow history from the persisted log to reconstruct the execution state, then continues from the point of failure — without any re-execution of successfully completed steps.

From the developer's perspective, you write normal sequential code. The durable execution engine handles fault tolerance transparently: retries, timeouts, deduplication, and state persistence are built into the platform, not the application. This is why developers describe it as "code that doesn't forget."
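The replay model can be sketched with a small in-memory simulation. This is not the API of Temporal, Inngest, or Restate: the `DurableContext` class, the `Journal` type, and the step names are all hypothetical, and a real engine persists the journal to a database rather than a `Map`.

```typescript
// Hypothetical in-memory journal: step name -> recorded result.
type Journal = Map<string, unknown>;

class DurableContext {
  private journal: Journal;

  constructor(journal: Journal) {
    this.journal = journal;
  }

  // Run a named step once. On replay, return the journaled result instead.
  run<T>(stepName: string, fn: () => T): T {
    if (this.journal.has(stepName)) {
      return this.journal.get(stepName) as T;
    }
    const result = fn();
    this.journal.set(stepName, result); // "persist" before moving on
    return result;
  }
}

const sideEffects: string[] = []; // records every real execution of a step
let crashOnce = true;

// Plain sequential code; the context decides what actually runs.
function orderWorkflow(ctx: DurableContext): string {
  const stock = ctx.run("reserve-stock", () => {
    sideEffects.push("reserve-stock");
    return "reserved";
  });
  const charge = ctx.run("charge-payment", () => {
    sideEffects.push("charge-payment");
    return "charged";
  });
  if (crashOnce) {
    crashOnce = false;
    throw new Error("simulated process crash");
  }
  const label = ctx.run("create-shipment", () => {
    sideEffects.push("create-shipment");
    return "label-1";
  });
  return `${stock}/${charge}/${label}`;
}

// Stand-in for the runtime: re-invoke with the same journal until done.
function execute(workflow: (ctx: DurableContext) => string, maxAttempts = 5): string {
  const journal: Journal = new Map();
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return workflow(new DurableContext(journal));
    } catch {
      // A real runtime would reschedule on a healthy worker; here we loop.
    }
  }
  throw new Error("workflow did not complete");
}

const outcome = execute(orderWorkflow);
```

After the simulated crash, the second invocation reads the first two results from the journal and only executes `create-shipment`, so `sideEffects` ends up with one entry per step.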

Temporal architecture

Temporal is the most widely deployed durable execution platform in production in 2026. It originated at Uber (as Cadence) and is now used by Stripe, Netflix, Coinbase, and thousands of other companies.

Core components

  • Temporal Server: The central coordination service. Persists workflow state in a database (Cassandra, MySQL, or PostgreSQL), routes tasks to workers, manages timers, and handles signals and queries. Can be self-hosted or used as Temporal Cloud (fully managed).
  • Task queues: Workflows and activities communicate through named task queues. The server puts tasks on the queue; workers poll the queue and execute tasks. Task queues are the unit of scaling — you scale workers per queue.
  • Workflow workers: Long-running processes that poll workflow task queues and execute workflow code. Workflow code must be deterministic — the same inputs must always produce the same sequence of decisions, because the runtime replays history to recover state.
  • Activity workers: Processes that execute activities — non-deterministic side effects like API calls, database writes, or file operations. Activities have their own task queues, independent retry policies, and heartbeat mechanisms for long-running operations.
  • Event history: Every workflow execution maintains an immutable event history log. Each step (workflow started, activity scheduled, activity completed, timer fired, signal received) is appended as an event. This history is the source of truth for replay — Temporal replays it to reconstruct workflow state without re-executing side effects.
  • Signals and queries: Signals are asynchronous messages sent into a running workflow (e.g., "payment approved"). Queries are synchronous reads of workflow state. Both are first-class primitives that enable external systems to interact with in-flight workflows.
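One way to picture the event history, signals, and queries together is as an append-only log plus a pure function that folds the log into state. The sketch below is a simplified model, not the Temporal SDK: the event names, `WorkflowExecution` class, and `replay` function are illustrative assumptions.

```typescript
// Illustrative event shapes, loosely based on the steps listed above.
type HistoryEvent =
  | { type: "WorkflowStarted"; orderId: string }
  | { type: "ActivityCompleted"; activity: string; result: string }
  | { type: "SignalReceived"; signal: string };

interface OrderState {
  orderId: string;
  completed: string[];
  cancelled: boolean;
}

// Pure, deterministic fold: the same history always yields the same state.
function replay(events: readonly HistoryEvent[]): OrderState {
  const state: OrderState = { orderId: "", completed: [], cancelled: false };
  for (const event of events) {
    switch (event.type) {
      case "WorkflowStarted":
        state.orderId = event.orderId;
        break;
      case "ActivityCompleted":
        state.completed.push(event.activity);
        break;
      case "SignalReceived":
        if (event.signal === "CancelOrder") state.cancelled = true;
        break;
    }
  }
  return state;
}

class WorkflowExecution {
  private events: HistoryEvent[] = [];

  append(event: HistoryEvent): void {
    this.events.push(event);
  }

  // Signal: an asynchronous message, recorded as a new history event.
  signal(signalName: string): void {
    this.append({ type: "SignalReceived", signal: signalName });
  }

  // Query: a read derived from the history; in this sketch it appends nothing.
  query(): OrderState {
    return replay(this.events);
  }

  get eventCount(): number {
    return this.events.length;
  }
}

const exec = new WorkflowExecution();
exec.append({ type: "WorkflowStarted", orderId: "order-42" });
exec.append({ type: "ActivityCompleted", activity: "checkStock", result: "in-stock" });
exec.signal("CancelOrder");

const countBeforeQuery = exec.eventCount;
const currentState = exec.query();
const countAfterQuery = exec.eventCount;
```

Because `replay` has no side effects and no hidden inputs, running it twice over the same events gives the same answer, which is the property the determinism requirement on workflow code protects.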

Diagram prompt: Temporal architecture

"Temporal durable execution architecture for an order fulfillment workflow. External trigger: an e-commerce API publishes a new order event to a Temporal workflow via the Temporal SDK's StartWorkflow call. Temporal Server receives the start request and persists it to PostgreSQL as the first event in the workflow's event history. The server puts a workflow task on the 'fulfillment-workflow' task queue. A Workflow Worker (Node.js, 3 replicas on Kubernetes) polls the queue, executes the workflow code (checks inventory → reserves stock → charges payment → creates shipment → sends confirmation), and schedules activities as needed. Each activity is placed on an 'activity' task queue. Activity Workers (Node.js, 5 replicas) execute activities: (1) InventoryService.checkStock — queries Postgres; (2) PaymentService.charge — calls Stripe API; (3) ShippingService.createLabel — calls FedEx API; (4) EmailService.sendConfirmation — calls SendGrid. Activity results are returned to the Temporal Server, which appends them to the event history and puts a new workflow task on the queue. Show the event history as a persistent log, the two task queues, the worker poll loops, and the external services. Add a signal path: 'CancelOrder' signal from the frontend arrives at Temporal Server and is delivered to the workflow via signal handler."

Inngest architecture

Inngest takes a serverless-first approach to durable execution. Instead of running persistent worker processes, Inngest functions run as serverless functions (Next.js API routes, AWS Lambda, Cloudflare Workers) and communicate with the Inngest platform via HTTP. This makes it dramatically simpler to adopt in a serverless stack — no worker infrastructure to operate.

Core components

  • Inngest Cloud: The managed coordination service that receives events, stores execution state, manages retries and backoff, and orchestrates step execution. The open-source Inngest Dev Server runs the same flow locally for development, and Inngest can also be self-hosted.
  • Serverless functions: Each Inngest function is a serverless function that runs in your own infrastructure. Functions register themselves with Inngest at startup via an HTTP handler (/api/inngest). Inngest invokes them via HTTP POST when their trigger fires.
  • Step functions: Inside an Inngest function, individual steps are defined with step.run(), step.sleep(), and step.waitForEvent(). After each step completes, Inngest stores the result. On retry, the function replays completed steps from cache rather than re-executing them — the same replay model as Temporal, without the worker infrastructure.
  • Event-driven triggers: Inngest functions are triggered by events (sent via the Inngest SDK), cron schedules, or step.waitForEvent() calls that pause a workflow until a matching event arrives.
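The "function exits between steps" behavior can be modeled in a few lines. The following is a rough simulation under simplifying assumptions, not the Inngest SDK: `StepTools`, `StepCompleted`, and `orchestrate` are hypothetical names, and the HTTP round trip is replaced by a plain function call.

```typescript
type StepState = Map<string, unknown>;

// Thrown to end the current invocation once one new step has run.
class StepCompleted {
  id: string;
  result: unknown;

  constructor(id: string, result: unknown) {
    this.id = id;
    this.result = result;
  }
}

interface StepTools {
  run<T>(id: string, fn: () => T): T;
}

function makeStepTools(state: StepState): StepTools {
  return {
    run<T>(id: string, fn: () => T): T {
      if (state.has(id)) return state.get(id) as T; // memoized from an earlier invocation
      throw new StepCompleted(id, fn()); // execute one new step, then exit
    },
  };
}

const emailsSent: string[] = [];

function onboardingWorkflow(step: StepTools): string {
  const welcome = step.run("send-welcome-email", () => {
    emailsSent.push("welcome");
    return "welcome-sent";
  });
  const tips = step.run("send-tips-email", () => {
    emailsSent.push("tips");
    return "tips-sent";
  });
  return `${welcome},${tips}`;
}

// Stand-in for the orchestrator: store each step result, then re-invoke.
function orchestrate(
  fn: (step: StepTools) => string,
  maxInvocations = 10,
): { output: string; invocations: number } {
  const state: StepState = new Map();
  for (let invocations = 1; invocations <= maxInvocations; invocations++) {
    try {
      return { output: fn(makeStepTools(state)), invocations };
    } catch (interrupt) {
      if (!(interrupt instanceof StepCompleted)) throw interrupt;
      state.set(interrupt.id, interrupt.result);
    }
  }
  throw new Error("function did not finish");
}

const runResult = orchestrate(onboardingWorkflow);
```

In this model a two-step function takes three invocations: one per new step, plus a final pass in which every step is served from stored results and the function returns.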

Diagram prompt: Inngest serverless durable execution

"Inngest serverless durable execution architecture. A Next.js application on Vercel has an /api/inngest route that serves as the Inngest SDK handler — it registers two functions: onboarding-workflow and payment-failure-handler. Event triggers: (1) user.signed_up event sent from the auth callback route via inngest.send() triggers onboarding-workflow; (2) payment.failed event from a Stripe webhook triggers payment-failure-handler. Inngest Cloud receives events and orchestrates execution: it sends HTTP POST requests to /api/inngest to run each step. The onboarding-workflow has four steps: step.run 'send-welcome-email' (calls Resend API); step.sleep '7 days'; step.waitForEvent 'user.completed_profile' (timeout 30 days); step.run 'send-tips-email'. Between steps, Inngest stores the step result and the function exits (no persistent process). On re-invocation for the next step, completed steps are replayed from cached results. Show Inngest Cloud as the orchestrator, the Vercel function as the executor, the event ingestion path, the step state storage, and the external service calls (Resend, Stripe)."

Restate architecture

Restate is a newer durable execution runtime (open-source, launched in 2024) that focuses on consistency and developer experience. It runs as a lightweight sidecar or managed service and supports RPC-style service-to-service calls with automatic durability — not just async workflows, but also synchronous request handlers that survive crashes.

Core components

  • Restate Server: Acts as a durable proxy in front of your services. All calls to your services pass through Restate, which logs them in an append-only journal before forwarding. This journal is what enables replay and fault tolerance.
  • Virtual objects: Restate's unit of state is a virtual object, a keyed entity (such as a user ID or order ID) with durable handlers. Calls to the same key are serialized by Restate, which prevents concurrent mutations of that key's state. This avoids race conditions on a single entity without distributed locks.
  • SDK-based handlers: Business logic lives in your own services (TypeScript, Java/Kotlin, Go, Python, Rust), written using the Restate SDK. The SDK instruments the code so that every ctx.run() call is checkpointed. If the service crashes mid-handler, Restate re-invokes the handler and replays already-completed steps from the journal.
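Per-key serialization is the least familiar of these ideas, so here is a minimal sketch of it using promise chaining. It is an in-process illustration only: `KeyedExecutor`, `debit`, and the balance map are hypothetical, and Restate enforces this ordering in the server rather than in application code.

```typescript
class KeyedExecutor {
  private tails = new Map<string, Promise<void>>();

  // Queue the handler behind any earlier call for the same key.
  call<T>(key: string, handler: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    const result = previous.then(handler);
    // Keep the chain usable even if this handler rejects.
    this.tails.set(key, result.then(() => undefined, () => undefined));
    return result;
  }
}

const balances = new Map<string, number>([["user-1", 100]]);
const tick = (): Promise<void> => Promise.resolve();

// Read-modify-write with an await in the middle: unsafe if interleaved.
async function debit(userId: string, amount: number): Promise<void> {
  const current = balances.get(userId) ?? 0;
  await tick(); // stands in for a call to a payment provider
  balances.set(userId, current - amount);
}

async function demo(): Promise<{ unsafe: number; serialized: number }> {
  // Two concurrent debits both read 100, so one update is lost.
  await Promise.all([debit("user-1", 30), debit("user-1", 30)]);
  const unsafe = balances.get("user-1") ?? 0;

  // Same calls routed through the keyed executor run one at a time.
  balances.set("user-1", 100);
  const executor = new KeyedExecutor();
  await Promise.all([
    executor.call("user-1", () => debit("user-1", 30)),
    executor.call("user-1", () => debit("user-1", 30)),
  ]);
  const serialized = balances.get("user-1") ?? 0;

  return { unsafe, serialized };
}
```

Calls for different keys get independent chains, so serialization costs throughput only where two requests actually touch the same entity.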

Diagram prompt: Restate durable execution

"Restate durable execution architecture for a payment processing service. Client sends a POST /payments/charge/{userId} to the Restate Server (deployed as a Kubernetes sidecar alongside the payment service). Restate writes the invocation to its append-only journal (RocksDB on the same pod). It forwards the request to the PaymentService (Node.js, using Restate SDK). The PaymentService handler has three ctx.run steps: (1) idempotency check against Postgres; (2) Stripe charge API call; (3) write confirmation to Postgres. If the service crashes between steps, Restate re-invokes the handler; the Restate SDK replays completed ctx.run calls from the journal without re-executing the Stripe call. Virtual object semantics: concurrent requests for the same userId are serialized by Restate — only one handler runs at a time per user. Show the Restate Server as a durable proxy, the journal as a persistent log, the PaymentService with the three ctx.run steps, the Stripe API call, and the serialization guarantee for concurrent requests on the same key."

Choosing between Temporal, Inngest, and Restate

| Criteria | Temporal | Inngest | Restate |
| --- | --- | --- | --- |
| Infrastructure model | Persistent worker processes | Serverless functions | Sidecar / proxy |
| Best for | Complex long-running workflows | Event-driven serverless apps | Transactional services with concurrency |
| Managed option | Temporal Cloud | Inngest Cloud (default) | Restate Cloud (2025) |
| State model | Event history replay | Step result caching | Journal-backed handlers |
| Language support | Go, Java, Python, TypeScript, .NET, PHP | TypeScript, Python | TypeScript, Java/Kotlin, Go, Python, Rust |
| Operational complexity | High (self-hosted) / Low (Temporal Cloud) | Very low (fully managed) | Low (sidecar pattern) |

Durable execution for AI agents

AI agent workflows are a natural fit for durable execution. Multi-step agents — a planner that creates a task list, dispatches subtasks to worker agents, waits for results, and synthesizes a final response — can run for minutes or hours and make dozens of external API calls. Without durable execution, a crash at step 9 of 12 means starting over and re-paying for all the LLM calls. With durable execution, the workflow resumes from step 9.

In 2026, Temporal and Inngest are both commonly used as the orchestration layer beneath AI agent frameworks like LangGraph and custom Anthropic SDK agents. The pattern: each agent step is a Temporal activity or Inngest step; the overall agent loop is a Temporal workflow or Inngest function. This makes long-running agent jobs practical in production, because a failed step is retried automatically without repeating completed work.
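The cost argument is easy to check with a toy model of the 12-step agent that crashes at step 9. Everything here is hypothetical: `callModel` is a counter standing in for a paid LLM call, and the journal is a plain array rather than engine-managed storage.

```typescript
// Count paid model calls for a 12-step agent that crashes once at step 9.
function simulate(durable: boolean): number {
  let llmCalls = 0;

  // Stand-in for a paid LLM request.
  const callModel = (prompt: string): string => {
    llmCalls++;
    return `result for ${prompt}`;
  };

  // Resume from wherever the journal ends; checkpoint after every step.
  const runAgent = (journal: string[], crashAtStep: number | null): void => {
    for (let step = journal.length + 1; step <= 12; step++) {
      if (step === crashAtStep) throw new Error(`crash at step ${step}`);
      journal.push(callModel(`subtask ${step}`));
    }
  };

  let journal: string[] = [];
  try {
    runAgent(journal, 9);
  } catch {
    // Without durable execution, the in-memory progress is gone.
    if (!durable) journal = [];
  }
  runAgent(journal, null); // second attempt, no crash this time
  return llmCalls;
}

const durableCalls = simulate(true);
const naiveCalls = simulate(false);
```

With the journal kept, the second attempt pays only for steps 9 through 12; without it, the eight calls made before the crash are paid for twice.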

Frequently asked questions

What is the difference between durable execution and a message queue?

A message queue (SQS, RabbitMQ, Kafka) delivers messages between services but does not track the state of a multi-step workflow. If a consumer crashes after processing step 3 of a 10-step workflow, the queue has no concept of where it left off — you either re-process from the beginning or implement state tracking yourself. Durable execution engines persist the full workflow state after every step, so they can resume from exactly the point of failure without any application-level state management code.
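To make the contrast concrete, here is a small sketch of what redelivery looks like with a plain at-least-once queue and no step tracking. The `AtLeastOnceQueue` class and `handleOrder` consumer are hypothetical and far simpler than SQS, RabbitMQ, or Kafka.

```typescript
interface QueueMessage {
  id: string;
}

class AtLeastOnceQueue {
  private pending: QueueMessage[] = [];

  publish(message: QueueMessage): void {
    this.pending.push(message);
  }

  // Deliver the head message. It is removed only if the consumer returns
  // normally (an "ack"); a thrown error leaves it in place for redelivery.
  deliverNext(consumer: (message: QueueMessage) => void): boolean {
    const message = this.pending[0];
    if (message === undefined) return false;
    try {
      consumer(message);
      this.pending.shift();
    } catch {
      // No ack: the whole message will be delivered again.
    }
    return true;
  }

  get size(): number {
    return this.pending.length;
  }
}

const effects: string[] = [];
let crashed = false;

// Three-step consumer that crashes once, after step 2.
function handleOrder(message: QueueMessage): void {
  effects.push(`charge:${message.id}`);
  effects.push(`email:${message.id}`);
  if (!crashed) {
    crashed = true;
    throw new Error("consumer crashed before step 3");
  }
  effects.push(`ship:${message.id}`);
}

const queue = new AtLeastOnceQueue();
queue.publish({ id: "order-1" });
while (queue.deliverNext(handleOrder)) {
  // keep delivering until the queue is empty
}

const chargeCount = effects.filter((e) => e.startsWith("charge:")).length;
```

The queue does its job correctly, yet the customer is charged twice, because the queue only knows about the message and not about which steps already ran. Avoiding that requires idempotency keys or your own step tracking, which is the work a durable execution engine takes over.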

When should I use durable execution instead of a state machine?

Explicit state-modeling approaches (XState, AWS Step Functions, event sourcing with CQRS) are a good fit when your workflow has a small, well-defined set of states and transitions. Durable execution shines when your workflow contains sequential code with many steps, external API calls, sleep periods, or human approval checkpoints, where modeling the flow as an explicit state machine would produce dozens of states and become unmaintainable. As a rough rule of thumb, if you find yourself writing a state machine with more than 15 states, durable execution is likely the better abstraction.
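The difference in shape shows up even on a four-step flow. Below, the same order fulfillment logic is written once as an explicit transition table and once as sequential code. Both are illustrative sketches with hypothetical names, and neither includes the persistence that a real state machine service or durable execution engine would add.

```typescript
type FulfillmentState =
  | "CheckInventory"
  | "ReserveStock"
  | "ChargePayment"
  | "CreateShipment"
  | "Done"
  | "Failed";
type Outcome = "ok" | "error";

// Explicit state machine: every state and transition is spelled out.
const transitions: Record<FulfillmentState, Record<Outcome, FulfillmentState>> = {
  CheckInventory: { ok: "ReserveStock", error: "Failed" },
  ReserveStock: { ok: "ChargePayment", error: "Failed" },
  ChargePayment: { ok: "CreateShipment", error: "Failed" },
  CreateShipment: { ok: "Done", error: "Failed" },
  Done: { ok: "Done", error: "Done" },
  Failed: { ok: "Failed", error: "Failed" },
};

function runMachine(outcomes: Outcome[]): FulfillmentState {
  let state: FulfillmentState = "CheckInventory";
  for (const outcome of outcomes) {
    state = transitions[state][outcome];
  }
  return state;
}

interface FulfillmentActions {
  checkInventory(): void;
  reserveStock(): void;
  chargePayment(): void;
  createShipment(): void;
}

// Sequential equivalent: the "current state" is simply the line being executed.
function fulfillOrder(actions: FulfillmentActions): "Done" | "Failed" {
  try {
    actions.checkInventory();
    actions.reserveStock();
    actions.chargePayment();
    actions.createShipment();
    return "Done";
  } catch {
    return "Failed";
  }
}

const noop = (): void => {};
const happyPath: FulfillmentActions = {
  checkInventory: noop,
  reserveStock: noop,
  chargePayment: noop,
  createShipment: noop,
};
const paymentFails: FulfillmentActions = {
  ...happyPath,
  chargePayment: () => {
    throw new Error("card declined");
  },
};
```

Adding a sleep, a retry branch, or an approval wait to the table means new states and new rows for every state that can reach them, while the sequential version grows by roughly one line per step.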

Related guides: Temporal workflow architecture, event sourcing and CQRS architecture, multi-agent orchestration patterns, and serverless architecture diagrams.