Product Series • Post 1

Building a Minimal Agent Runtime: ReAct Loops, State Graphs, and Symbolic Chaining

Haranadh Gavara (AWS SA Pro TOGAF® PMP®)

AI Systems Leader | Enterprise Architecture, Governed AI Platforms, Retrieval Systems, and Data Products

May 20, 2026 • 8 min read • Agentic Runtime

When I began developing a minimal agent runtime, my intention was not to replace established frameworks. Solutions such as LangChain, LangGraph, and Semantic Kernel already address many practical challenges.

These frameworks are valuable for teams requiring integrations, production patterns, observability, and ecosystem support. However, my focus was to understand the core architectural components underlying these systems.

What is the minimal structure required to run an agent safely? What essential elements must an agent runtime include before evolving into a full framework?

This article reflects that implementation process. The initial version emphasized runtime state, a non-blocking ReAct loop, state graph orchestration, and symbolic chaining. The runtime codebase is intentionally minimal to ensure the core remains easy to understand, test, and extend.

This is the first article in a five-part implementation series. The series aims to go beyond conceptual discussions by providing practical, code-focused explanations of each aspect of the runtime design. In the fifth article, I will provide a production-ready Git repository that enables readers to review the complete implementation, run it locally, and adapt it to their needs.

This first article focuses on three core patterns:

A non-blocking ReAct loop
A lightweight state graph for orchestration
Symbolic chaining to make pipelines readable

Together, these patterns form a concise yet effective foundation for agentic systems.

Core Concept: Deterministic Shell, Probabilistic Core

An agent runtime consists of two components. The probabilistic side, represented by the LLM, handles reasoning, decision-making, drafting responses, and requesting tools. The deterministic side is the runtime itself, which manages state, tool execution, approvals, retries, limits, and termination.

This separation is essential. The LLM determines the next action, while the runtime enforces what is permitted, executed, stored, and when the loop should terminate.

At its core, the runtime follows the ReAct pattern:

Receive Input
Build Current State
Invoke Model with Available Tools
Parse Response & Tool Calls
Execute Tools or Pause for Approval
Update State
Repeat until completed or stopped

While this loop appears simple, it underpins many agentic systems. The key design question extends beyond whether the model can call a tool. A more important consideration is whether the system can control tool calls, maintain state, support safe recovery, and provide clear explanations of actions. This is where runtime design becomes critical.

Runtime State as the Centre of Control

The initial design decision is to make the state explicit. Rather than distributing runtime information across multiple objects, callbacks, or temporary variables, the implementation consolidates the agent run state into a single data structure.

from dataclasses import dataclass, field
from typing import Any

@dataclass
class RuntimeState:
    run_id: str
    messages: list[Message]
    status: RunStatus = "running"
    tool_results: list[ToolResult] = field(default_factory=list)
    pending_approvals: list[ToolCall] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)

Although compact, this structure provides a clear control point for the runtime. The messages field records conversation and tool interaction history, while the status field indicates whether the run is active, completed, failed, or waiting.

The tool_results field stores outputs from executed tools. The pending_approvals field enables the runtime to pause before sensitive actions. The metadata field offers flexibility for tracing, policy enforcement, routing, and future extensions.

This structure also simplifies testing. Tests can instantiate a RuntimeState, pass it to the runtime, and verify state changes directly, which is more straightforward than testing systems with deeply nested state. It also facilitates recovery. If the state can be serialised, stored, and restored, the runtime is not limited to in-memory execution. This is valuable when agents are paused, resumed, audited, or migrated across environments.

The state object serves as the single source of truth for each run, which is a valuable design discipline.
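
As a rough illustration of the recovery point above, the sketch below round-trips a cut-down state object through JSON. The reduced `RuntimeState`, the `Message` fields, and the `dump_state` and `load_state` helpers are assumptions made for this example, not the runtime's actual persistence code.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class Message:
    role: str
    content: str


@dataclass
class RuntimeState:
    # Reduced version of the state shown above, kept small for the example
    run_id: str
    messages: list[Message]
    status: str = "running"
    metadata: dict[str, Any] = field(default_factory=dict)


def dump_state(state: RuntimeState) -> str:
    """Serialise the state to JSON; asdict() recurses into nested dataclasses."""
    return json.dumps(asdict(state))


def load_state(payload: str) -> RuntimeState:
    """Rebuild the state, re-wrapping each message dict as a Message."""
    raw = json.loads(payload)
    raw["messages"] = [Message(**m) for m in raw["messages"]]
    return RuntimeState(**raw)


original = RuntimeState(run_id="run-1", messages=[Message("user", "hello")])
restored = load_state(dump_state(original))
```

Because the whole run lives in one object, persistence reduces to one dump and one load. Anything placed in metadata has to stay JSON-serialisable for this to hold.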

Pattern 1: The Non-Blocking ReAct Loop

The core runtime operates as an asynchronous loop with a defined step limit. Each iteration invokes the model, checks for tool calls, executes tools, updates state, and determines whether to continue or complete.

A simplified version looks like this:

# Method on AgentRuntime; the module also needs `import asyncio` for gather() below
async def run(self, state: RuntimeState) -> RuntimeState:
    self.tracer.record(state.run_id, "run_started", {"status": state.status})
    for step in range(self.max_steps):
        # Bounded memory: trim context to prevent context window growth
        if len(state.messages) > self.max_history:
            self._trim_context(state)
        # Dynamic tool fetch: expose only tools valid for the current state
        tools = self.tool_registry.available_tools(state)
        # Model invocation with caching layer
        response = await self.model.invoke(
            messages=state.messages,
            tools=tools,
            state=state,
        )
        if response.has_tool_calls:
            # Parallel tool execution
            tool_tasks = [
                self.tool_registry.execute(tc)
                for tc in response.tool_calls
            ]
            results = await asyncio.gather(*tool_tasks)
            for result in results:
                state.tool_results.append(result)
                state.messages.append(
                    Message(
                        role="tool",
                        name=result.tool_name,
                        content=str(result.data),
                    )
                )
            continue
        # Terminal condition: model responded with final text
        state.messages.append(
            Message(role="assistant", content=response.content or "")
        )
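        state.status = "completed"
        return state
    # Step limit reached without a final answer: fail explicitly instead of
    # falling out of the loop and implicitly returning None
    state.status = "failed"
    self.tracer.record(state.run_id, "max_steps_exceeded", {"max_steps": self.max_steps})
    return state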

Although concise, this code addresses key runtime responsibilities. The loop is constrained by max_steps to prevent indefinite execution, and context is trimmed when message history exceeds set limits. The tool registry determines tool availability at each step. The model is invoked with current messages, tools, and state. If the model requests tools, they are executed and their results are incorporated into the state. If the model does not request tools, its response is treated as the final answer. This control loop ensures the runtime manages execution around the model, rather than simply relaying prompts.
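
The loop calls a trimming helper without showing it. One possible shape for that step is sketched below, assuming a simple policy: keep every system message and the most recent remaining messages. The `trim_context` function and this policy are illustrative assumptions, not the runtime's actual `_trim_context`.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    content: str


def trim_context(messages: list[Message], max_history: int) -> list[Message]:
    """Keep system messages plus the most recent non-system messages."""
    if len(messages) <= max_history:
        return messages
    system = [m for m in messages if m.role == "system"]
    rest = [m for m in messages if m.role != "system"]
    keep = max(max_history - len(system), 0)
    # Guard against rest[-0:], which would return the whole list
    return system + (rest[-keep:] if keep else [])


history = [Message("system", "You are a helpful agent.")]
history += [Message("user", f"m{i}") for i in range(10)]
trimmed = trim_context(history, max_history=4)
```

A real implementation would also need to avoid separating a tool request from its tool result, since some model APIs reject a history where one appears without the other.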

The Importance of Asynchronous Tool Execution

Tool calls are frequently I/O-bound, often involving databases, external APIs, search services, file systems, workflow systems, or internal services. When an agent requests multiple independent tools, executing them sequentially introduces unnecessary latency.

For example:

fetch customer profile
fetch repayment history
fetch risk policy

If each tool requires two seconds and they run sequentially, the runtime waits approximately six seconds. Running them in parallel reduces the wait time to that of the slowest call.

For this reason, the implementation uses:

results = await asyncio.gather(*tool_tasks)

While a minor design choice, this approach is significant. An agent runtime should avoid unnecessary blocking during independent I/O operations. Parallel tool execution maintains responsiveness without adding complexity. However, parallel execution should be applied judiciously. Some tools may depend on prior outputs, modify external systems, or require approval before execution. When tool calls are independent and safe to execute concurrently, asynchronous execution provides a practical performance benefit.
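
The latency argument can be checked with a small experiment. The sketch below uses `asyncio.sleep` as a stand-in for I/O-bound tools; the tool names and the 0.2 second delays are made up for the example.

```python
import asyncio
import time


async def fake_tool(name: str, delay: float) -> str:
    """Stand-in for an I/O-bound tool call."""
    await asyncio.sleep(delay)
    return f"{name}: done"


async def run_parallel() -> tuple[list[str], float]:
    start = time.perf_counter()
    # gather() runs the three coroutines concurrently and returns
    # their results in the order the awaitables were passed in
    results = await asyncio.gather(
        fake_tool("customer_profile", 0.2),
        fake_tool("repayment_history", 0.2),
        fake_tool("risk_policy", 0.2),
    )
    return list(results), time.perf_counter() - start


results, elapsed = asyncio.run(run_parallel())
```

Run sequentially, the three calls would take about 0.6 seconds; run concurrently, the total stays close to the slowest single call. By default, gather() propagates the first exception to the caller, so a runtime that wants per-tool error results can either pass return_exceptions=True or catch errors inside each tool wrapper.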

Pattern 2: State Graphs for Agent Handoffs

A single agent loop is effective for many tasks, but real-world workflows often require multiple steps or roles.

research agent
analysis agent
writer agent
review agent

Or in an enterprise workflow:

retrieve policy → read customer profile → read repayment history → calculate risk indicators → draft recommendation → approval step → update internal record → notify downstream system → log outcome

A lightweight state graph addresses this need. The objective is not to build a complex orchestration engine, but to provide a straightforward method for passing the same RuntimeState through various nodes. Each node is an asynchronous function that receives the state and returns an updated state. The graph determines the subsequent node to execute.

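from typing import Any, Callable, Coroutine, Dict

# Each node is an async function: RuntimeState in, RuntimeState out
NodeFn = Callable[[RuntimeState], Coroutine[Any, Any, RuntimeState]]

class StateGraph:
    def __init__(self):
        self.nodes: Dict[str, NodeFn] = {}
        self.edges: Dict[str, str] = {}
        self.conditional_edges: Dict[str, Callable[[RuntimeState], str]] = {}
        self.entry_point: str | None = None

    # Builder methods; minimal bodies inferred from how they are called
    # when a graph is wired up
    def add_node(self, name: str, action: NodeFn) -> None:
        self.nodes[name] = action

    def add_edge(self, source: str, target: str) -> None:
        self.edges[source] = target

    def set_entry_point(self, name: str) -> None:
        self.entry_point = name
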
    async def run(self, state: RuntimeState) -> RuntimeState:
        if not self.entry_point:
            raise ValueError("Entry point not set")
        current_node_name = self.entry_point
        while current_node_name:
            node_action = self.nodes[current_node_name]
            # Execute node logic
            state = await node_action(state)
            # Conditional edges take priority over static edges
            if current_node_name in self.conditional_edges:
                router = self.conditional_edges[current_node_name]
                current_node_name = router(state)
            else:
                current_node_name = self.edges.get(current_node_name)
            # Stop early if execution failed or was rejected
            if state.status in ["failed", "rejected"] or not current_node_name:
                break
        return state

This graph is built on three concepts: nodes perform work, static edges define fixed transitions, and conditional edges evaluate state to select the next node dynamically. This approach is sufficient for many practical workflows.

The Value of the Router

The router enables powerful state-aware orchestration. Each node focuses solely on its task and updating the state, without needing awareness of subsequent steps. The graph then inspects the state to determine the next step.

For example, after calculating a credit risk score:

low risk → continue to automated approval
medium risk → send to manual review
high risk → reject or escalate

This approach keeps node logic clean. The agent or function responsible for risk calculation only updates the state, while the graph manages flow control. This separation simplifies future changes. If routing logic evolves, only the graph router requires modification, not the individual nodes. For this reason, I prefer to keep orchestration outside the agent node. The agent should focus on its responsibilities, while the graph manages workflow progression.
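
A router for the credit risk example might look like the sketch below. The `risk_score` metadata key, the thresholds, and the node names are hypothetical; the point is only that the router is a plain function from state to the next node name.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class RuntimeState:
    # Reduced state: only the fields the router needs
    run_id: str
    metadata: dict[str, Any] = field(default_factory=dict)


def route_by_risk(state: RuntimeState) -> str:
    """Return the name of the next node based on the score stored in state."""
    score = state.metadata.get("risk_score")
    if score is None:
        return "manual_review"  # fail safe when the score is missing
    if score < 0.3:
        return "auto_approve"
    if score < 0.7:
        return "manual_review"
    return "escalate"


low = route_by_risk(RuntimeState("r1", {"risk_score": 0.1}))
medium = route_by_risk(RuntimeState("r2", {"risk_score": 0.5}))
high = route_by_risk(RuntimeState("r3", {"risk_score": 0.9}))
missing = route_by_risk(RuntimeState("r4"))
```

Since the router has no side effects, it can be unit-tested without running any agent, and changing a threshold does not touch the node that calculates the score.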

Pattern 3: Symbolic Chaining for Readability

Runtime code should remain understandable. Excessive wiring code in workflows increases maintenance complexity. To improve readability of simple pipelines, the implementation uses symbolic chaining with the >> operator, achieved in Python by overloading __rshift__.

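def __rshift__(self, other):
    """Allows syntax like: agent1 >> agent2"""
    from solidstate.graph import StateGraph
    if not isinstance(other, AgentRuntime):
        # Let Python raise TypeError (or try other.__rrshift__) instead of
        # silently returning a graph that ignores the right-hand operand
        return NotImplemented
    graph = StateGraph()
    graph.add_node(self.name, self.run)
    graph.set_entry_point(self.name)
    graph.add_node(other.name, other.run)
    graph.add_edge(self.name, other.name)
    # The result is a StateGraph, so chaining a third agent (a >> b >> c)
    # also requires StateGraph to define __rshift__
    return graph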

This allows a workflow to be written like this:

researcher = AgentRuntime(name="Researcher", ...)
writer = AgentRuntime(name="Writer", ...)
publisher = AgentRuntime(name="Publisher", ...)

publishing_pipeline = researcher >> writer >> publisher
final_state = await publishing_pipeline.run(initial_state)

While this feature does not add runtime capabilities, it enhances readability. The pipeline is visible in a single line, which aids in reviewing, discussing, and modifying workflows. Symbolic chaining should be applied judiciously. It is most effective for simple, linear flows; complex workflows are better served by explicit graph definitions. The goal is not to introduce unnecessary complexity, but to make common patterns easier to express.
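
Chaining three agents relies on the graph itself supporting `>>`, because `researcher >> writer` already evaluates to a graph before the third agent is applied. The sketch below shows one way this could work. The `Agent` stand-in, the `append` method, and the `tail` attribute are assumptions for illustration, not the actual SolidState classes.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional

State = List[str]  # stand-in for RuntimeState: the names of the nodes visited
Node = Callable[[State], Awaitable[State]]


class StateGraph:
    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.edges: Dict[str, str] = {}
        self.entry_point: Optional[str] = None
        self.tail: Optional[str] = None  # last node of the linear chain

    def append(self, name: str, action: Node) -> "StateGraph":
        """Add a node and link it after the current tail."""
        self.nodes[name] = action
        if self.tail is None:
            self.entry_point = name
        else:
            self.edges[self.tail] = name
        self.tail = name
        return self

    def __rshift__(self, other: "Agent") -> "StateGraph":
        # graph >> agent: extend the existing chain instead of starting a new one
        return self.append(other.name, other.run)

    async def run(self, state: State) -> State:
        current = self.entry_point
        while current:
            state = await self.nodes[current](state)
            current = self.edges.get(current)
        return state


class Agent:
    """Stand-in for AgentRuntime; run() only records that the agent ran."""

    def __init__(self, name: str) -> None:
        self.name = name

    async def run(self, state: State) -> State:
        return state + [self.name]

    def __rshift__(self, other: "Agent") -> StateGraph:
        # agent >> agent: start a graph with self, then extend it with other
        return StateGraph().append(self.name, self.run) >> other


pipeline = Agent("Researcher") >> Agent("Writer") >> Agent("Publisher")
visited = asyncio.run(pipeline.run([]))
```

Python evaluates `>>` left to right, so the first operator call produces the graph and every later call extends it. Node names must be unique in this sketch; reusing a name would overwrite the earlier node and could create a cycle.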

A Minimal Runtime Still Requires Boundaries

Even a minimal runtime requires clear boundaries. An unrestricted agent loop can be risky, as the model may continue reasoning indefinitely, repeatedly call tools, generate unexpected requests, or exceed token limits. For this reason, the runtime incorporates several fundamental controls from the outset:

maximum steps
context trimming
tool availability checks
runtime status
approval state
trace recording

These are not advanced enterprise features, but essential runtime responsibilities. Without them, the agent is merely a model call within a loop. With these controls, the runtime functions as a controlled execution environment. While the model provides intelligence, reliability is ensured by the surrounding runtime.

Scope Limitations of This Runtime

This runtime is intentionally minimal and does not attempt to address every production concern from the outset. A complete production-grade agent platform may need persistence, distributed execution, observability dashboards, advanced policy enforcement, human approval queues, retries, rate limits, cost controls, tenant isolation, and deployment tooling. These features are important, but can be integrated around a well-defined runtime core. The initial priority is to ensure the execution model is clear and understandable. Once the core loop, state model, tool execution, and graph routing are well-defined, the broader system gains a stronger foundation.

For this reason, the series remains implementation-focused. Rather than beginning with high-level architecture diagrams, it starts with the code path and builds outward.

Significance of This Approach

Many agent projects become difficult to manage due to unclear control structures. The model makes decisions, tools are called, and state changes occur without transparency. Callbacks update other objects, retries occur, and additional agents may take over. After several layers, it becomes challenging to answer basic questions:

What happened?
Why did it happen?
Which tool was called?
What state changed?
Can we replay this?
Can we pause before this action?
Can we recover from here?

A runtime should simplify answering these questions. This is why the design emphasizes explicit state, bounded loops, controlled tool execution, and graph-based routing. The model may remain flexible, but the runtime must maintain discipline.

Conclusion

Developing an agent runtime does not require a large framework. It can begin with a concise set of clear patterns:

A RuntimeState that captures the run
A bounded ReAct loop that invokes the model and executes tools
Async tool execution for better latency
A state graph for multi-step orchestration
Symbolic chaining for readable pipelines

This approach provides a practical foundation for agentic execution while maintaining transparency in control flow.

This is only the first part of the five-part implementation series. In the next article, I will go deeper into the runtime's external interface: dynamic tool registries and automatic schema synthesis. At that stage, plain Python functions evolve into structured agent tools, and the runtime connects with external systems in a controlled manner. By the fifth article, I will integrate all components and share a production-ready Git repository containing the complete implementation.

Product Series • Post 2

The Command Center: Dynamic Tool Registries & Automatic Schema Synthesis

May 30, 2026 • 8 min read • Tool Registries

If an LLM has no way to interact with the external world, it is just a calculator that predicts words.

To make it useful, we must give it tools: APIs, databases, filesystems, and custom scripts.

But there is a major engineering hurdle at the boundary between a probabilistic language model and a deterministic operating system: type safety and schema compliance.

LLMs do not call Python functions directly. They produce JSON payloads containing a tool name and its arguments. It is the runtime’s job to advertise the available tools to the model in a strict format, usually JSON Schema or OpenAPI.

The runtime must then receive the model’s call parameters, validate them, parse them, execute the actual Python function safely, catch errors, and package the result. Doing this manually for dozens of tools quickly becomes an engineering nightmare of boilerplate.

In this article, we will explore how SolidState, the runtime built in this series, solves this cleanly using three classic software design patterns:

The Registry Pattern
Metaprogramming and Reflection
The Command Pattern

The Registry Pattern: Decentralizing Capabilities

Instead of hardcoding tool execution in a massive nested if/else block, we use the Registry Pattern. The registry acts as a centralized catalog of tools. It decouples tool registration from tool execution.

A registry maps a unique string identifier, the tool name, to a metadata object containing the executable function, parameter definitions, safety policies, and risk levels.

# A simple representation of a registered tool's metadata

@dataclass
class RegisteredTool:
    name: str
    function: Callable
    risk_level: RiskLevel = "low"
    requires_approval: bool = False
    schema: dict | None = None

The registry exposes a clean, single-method interface for developers to register new capabilities.

# Adding a new tool is a single, clean declaration

registry.register(
    fn=transfer_funds,
    risk_level="high",
    requires_approval=True
)

By decoupling registration, we can dynamically load, disable, or audit tools at runtime without altering the core state machine logic.
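
To make the registry idea concrete, here is a minimal sketch under stated assumptions. The `ToolRegistry` class, its `disable` method, and the `enabled` flag are hypothetical names chosen for this example; only `register`, `get`, and `available_tools` echo calls that appear elsewhere in the series.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RegisteredTool:
    name: str
    function: Callable
    risk_level: str = "low"
    requires_approval: bool = False
    enabled: bool = True  # hypothetical flag used for runtime disabling


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, RegisteredTool] = {}

    def register(self, fn: Callable, risk_level: str = "low",
                 requires_approval: bool = False) -> RegisteredTool:
        # The function name doubles as the unique tool identifier
        if fn.__name__ in self._tools:
            raise ValueError(f"Tool already registered: {fn.__name__}")
        tool = RegisteredTool(fn.__name__, fn, risk_level, requires_approval)
        self._tools[tool.name] = tool
        return tool

    def get(self, name: str) -> RegisteredTool:
        try:
            return self._tools[name]
        except KeyError:
            raise KeyError(f"Unknown tool: {name}") from None

    def disable(self, name: str) -> None:
        self.get(name).enabled = False

    def available_tools(self) -> list[RegisteredTool]:
        return [t for t in self._tools.values() if t.enabled]


def transfer_funds(account: str, amount: float) -> str:
    return f"transferred {amount} to {account}"


registry = ToolRegistry()
registry.register(fn=transfer_funds, risk_level="high", requires_approval=True)
names_before = [t.name for t in registry.available_tools()]
registry.disable("transfer_funds")
names_after = [t.name for t in registry.available_tools()]
```

Disabling a tool changes what the model is offered on the next step without touching the loop, which is the decoupling the registry is meant to provide.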

The Metaprogramming Pattern: Automatic Schema Synthesis

When registration occurs, how do we get the JSON Schema that the LLM needs? Writing schemas by hand is prone to errors. More importantly, the schema can drift away from the actual Python parameters.

To solve this, we use metaprogramming and reflection. Reflection is the ability of a program to inspect its own structure at runtime. Python makes this straightforward through the inspect module and type hints.

Instead of writing a JSON schema manually, the runtime inspects the Python function:

The function name is inferred directly from fn.__name__.
The description is extracted from the function’s docstring using fn.__doc__.
The parameters are parsed using inspect.signature, which allows the runtime to determine parameter types such as str, int, and bool, and also identify which arguments are required based on whether they have default values.
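
The three reflection steps above can be sketched in a few lines. The `build_schema` function, the `JSON_TYPES` mapping, and the output layout are assumptions for illustration; a real generator would also handle optional types, containers, and nested models.

```python
import inspect
from typing import Any, Callable

# Hypothetical mapping from Python annotations to JSON Schema type names
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def build_schema(fn: Callable) -> dict[str, Any]:
    """Derive a tool schema from a function's name, docstring, and signature."""
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        # Unannotated or unmapped types fall back to "string" in this sketch
        properties[name] = {"type": JSON_TYPES.get(param.annotation, "string")}
        # A parameter without a default value is treated as required
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def get_balance(account_id: str, include_pending: bool = False) -> float:
    """Return the balance for an account."""
    return 0.0


schema = build_schema(get_balance)
```

One caveat: in modules that use `from __future__ import annotations`, `param.annotation` is a string rather than a type, so the lookup would need `typing.get_type_hints` instead.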

The Magic of Pydantic Integration

Sometimes, a tool requires complex nested configurations. For example, creating or evaluating a user profile may involve several related fields. Rather than exposing many individual arguments, we can accept a single Pydantic model parameter.

Our schema generator can reflect Pydantic classes instantly.

class CreditProfile(BaseModel):
    score: int
    annual_income: float

async def evaluate_risk(profile: CreditProfile) -> str:
    """Evaluates the risk factor of a credit applicant."""
    ...

# Simply registering this function automatically generates the nested JSON schema
registry.register(evaluate_risk)

Because the runtime handles the reflection, the code becomes the schema. The parameter schema is regenerated from the function signature at registration, so it cannot silently drift from the actual implementation.

The Command Pattern: Decoupling Execution from Definition

When the model decides to invoke a tool, it outputs a payload representing a command. For example: “Run tool evaluate_risk with arguments { 'profile': { 'score': 680, ... } }.”

In SolidState, we model this as a ToolCall and execute it through the Command Pattern. The executor is responsible for resolving the call, unpacking variables, parsing JSON, casting Pydantic classes, and handling failures cleanly.

High-Level Execution Workflow

Instead of calling tool functions directly, the executor runs them through a controlled wrapper.

First, there is autoboxing. If a parameter expects a Pydantic model and the LLM sends a raw JSON dictionary, the executor automatically casts that dictionary into the rich Pydantic object: CreditProfile(**raw_dict).

Second, there is async integration. The runtime detects whether the function is synchronous or asynchronous and routes it to the correct execution path: coroutine functions are awaited directly, while blocking synchronous functions can be offloaded to a worker thread with asyncio.to_thread.

Third, there is error isolation. Any exception thrown inside the tool, such as a database timeout or API failure, is caught, wrapped in a ToolResult marked as "error", and returned to the message stream.

# Conceptual view of isolated execution

try:
    data = await fn(**args)
    return ToolResult(status="success", data=data)
except Exception as e:
    return ToolResult(status="error", error=str(e))
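
A slightly fuller sketch of the same idea combines the sync/async routing with error isolation. The `execute_tool` function and the reduced `ToolResult` are assumptions for this example, not the actual SolidState executor.

```python
import asyncio
import inspect
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ToolResult:
    status: str
    data: Any = None
    error: Optional[str] = None


async def execute_tool(fn: Callable, args: dict) -> ToolResult:
    """Run a tool and convert any failure into an error result."""
    try:
        if inspect.iscoroutinefunction(fn):
            data = await fn(**args)  # async tools are awaited directly
        else:
            # Sync tools run in a worker thread so they do not block the loop
            data = await asyncio.to_thread(fn, **args)
        return ToolResult(status="success", data=data)
    except Exception as exc:
        return ToolResult(status="error", error=f"{type(exc).__name__}: {exc}")


def add(a: int, b: int) -> int:
    return a + b


async def flaky_lookup(key: str) -> str:
    raise TimeoutError("database timed out")


ok = asyncio.run(execute_tool(add, {"a": 2, "b": 3}))
failed = asyncio.run(execute_tool(flaky_lookup, {"key": "x"}))
```

Catching `Exception` rather than `BaseException` is deliberate here: task cancellation (`asyncio.CancelledError`) still propagates, so the runtime can shut a run down cleanly.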

Self-Healing and Error Isolation

Notice how errors are trapped within the executor. If a database connection timeout occurs inside a tool, it does not crash the agent server or the runtime thread. Instead, it returns a descriptive error result. That error is appended to the message history.

This allows the LLM to read the trace, reason about the failure, and try an alternative approach. In some cases, it can self-heal by executing a retry.

Summary: Designing for Developers

By combining the Registry Pattern for cataloging capabilities, Metaprogramming and Reflection for zero-boilerplate schema generation, and the Command Pattern for isolated execution, we create a tool system that is easy to extend, robust, and type-safe.

It is easy to extend because developers only need to write a Python function with docstrings and type hints.
It is robust because errors are caught, logged, and isolated.
It is type-safe because raw JSON data can be automatically converted into typed Pydantic models.

In the next chapter, "The Guardian at the Gate: Policy Evaluation, Intercepts, and Human-in-the-Loop Orchestration", we will explore what happens before a tool is executed: how to intercept calls, enforce safety constraints, and pause execution for human approval.

Product Series • Post 3

The Guardian at the Gate: Policy Evaluation, Intercepts, and Human-in-the-Loop Orchestration

June 5, 2026 • 8 min read • AI Safety

Giving an AI agent access to tools is powerful, but it also creates a new class of enterprise risk. An agent that can read documents is useful. An agent that can call APIs, update systems, or trigger workflows can create real business value.

But once an agent can act, mistakes become more serious. A wrong answer is one problem. A wrong action is another.

A similar situation came up recently in a workplace context. The exact details were different, but the failure mode was the same. An automated process was trying to “fix” or “clean up” a system condition, and its action path created a risk that was much larger than the original issue.

That is the real concern with agentic systems. If an agent is asked to “fix database performance issues,” the model may decide to call something like drop_table("users") because it reasons that empty tables are faster to search. The example is deliberately extreme, but the architectural point is practical.

That is no longer just a hallucination. That is an operational incident.

This is the key shift in agentic AI. Traditional AI systems generally carried less operational risk because they mostly produced outputs for humans or downstream systems to review. Agentic AI is different because it can reason, select tools, and execute actions.

So the answer is not to avoid agentic AI. The answer is to control it through an enterprise architecture framework. The model can propose actions, but the execution environment must enforce boundaries, approvals, audit, and recovery.

Before we jump into the implementation, it is useful to step back and look at the enterprise architecture problem.

Agentic AI is not only a model capability shift. It is an execution boundary shift. Traditional AI systems were usually advisory. They predicted, classified, summarized, recommended, or generated content.

Even when the output was wrong, the damage was often limited because a human or downstream process still had a chance to review it. Agentic AI changes that assumption.

An agent can reason, select a tool, call an API, update a record, trigger a workflow, or change the state of a system. Once that happens, the risk is no longer only about response quality. It becomes operational risk.

This is why enterprise AI cannot be governed only through prompts. A prompt can describe expected behaviour, but it cannot be the final control boundary. In an enterprise environment, control must sit in the architecture.

This is the same argument we discussed earlier in the context of bounded agentic AI. Autonomy without boundaries is not suitable for enterprise adoption. Agents need defined execution limits, approval gates, audit trails, recovery paths, and clear accountability.

In other words, agentic AI needs a control plane.

But the control plane should not remain only at the level of strategy, principles, or diagrams. It has to appear in the runtime. It has to sit between model reasoning and system execution.

The model can propose an action. The runtime must decide whether that action is allowed, rejected, or paused for human approval.

That is the bridge from enterprise AI architecture to implementation.

In this article, we’ll examine how SolidState structures tool safety using the Interceptor Pattern, Chain of Responsibility Pattern, and Stateful Asynchronous Interrupts to build human-in-the-loop mechanics.

These are not random implementation details. They are runtime expressions of the same enterprise control-plane idea.

The Interceptor Pattern is commonly used in middleware and distributed systems to add services around execution without changing the core component directly. In SolidState, the same idea is applied to tool calls. Before a tool call reaches the executor, it is intercepted by the policy layer.

The Chain of Responsibility Pattern comes from classic object-oriented design. It allows a request to pass through a sequence of handlers, where each handler can either process the request or pass it onward. In SolidState, this maps naturally to modular security and governance rules.

Stateful Asynchronous Interrupts are closer to enterprise workflow and orchestration patterns. When execution needs human approval, the runtime should not block a thread. It should persist state, pause execution, release resources, and resume later through a controlled handshake.

Together, these patterns turn agent safety into an architectural control plane. They move governance away from prompt wording and into runtime enforcement.

Pattern 1: The Interceptor Pattern

In web development, middleware inspects, modifies, or rejects HTTP requests before they reach the route handler. Agent runtimes can use the same idea for tool calls.

Before the runtime executes any ToolCall, it passes the call through a PolicyEvaluator.

The important point is simple. The model does not directly execute tools. It proposes tool calls. The runtime intercepts them, and the policy layer decides what happens next.

A policy decision can produce three outcomes:

allow: the tool call is safe enough to execute immediately.
reject: the tool call violates a hard boundary. The runtime cancels the action, records the refusal, and stops or raises an exception.
interrupt: the tool call may be valid, but it carries risk. The agent pauses, the state is saved, and a human must approve before execution continues.

This gives the runtime a clear safety model. Not every risky action has to be blocked. Not every action needs manual review. Not every rule belongs in the prompt.

The runtime can apply a policy based on risk.

class PolicyEvaluator:
    def evaluate(self, tool_call: ToolCall, state: RuntimeState, tool_registry) -> PolicyDecision:
        tool = tool_registry.get(tool_call.name)

        # 1. Hard Rules: Block critical operations outright
        if tool.risk_level == "critical":
            return PolicyDecision("reject", "Critical tool is blocked by default policy")

        # 2. Configured Controls: Flag high-risk tools for manual review
        if tool.requires_approval:
            return PolicyDecision("interrupt", "Tool requires human approval")

        # 3. Dynamic Thresholds: Inspect the actual parameters
        amount = tool_call.args.get("amount") or tool_call.args.get("new_limit") or 0
        if isinstance(amount, (int, float)) and amount > 10000:
            return PolicyDecision("interrupt", f"Transaction amount (${amount}) exceeds auto-execution limit")

        return PolicyDecision("allow", "Tool call allowed")

This is a small piece of code, but architecturally it does something important. It separates intention from execution.

The LLM may intend to call a tool. The runtime decides whether that tool call can proceed. That is the boundary enterprise AI systems need.
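
To see the three outcomes side by side, the sketch below mirrors the evaluator's checks in a standalone function and runs it against a few example calls. The `Tool` and `ToolCall` stand-ins, the `decide` function, and the sample tool names are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Tool:
    name: str
    risk_level: str = "low"
    requires_approval: bool = False


@dataclass
class ToolCall:
    name: str
    args: dict[str, Any] = field(default_factory=dict)


def decide(tool: Tool, call: ToolCall, limit: float = 10_000) -> str:
    """Mirror the three checks above, returning only the action string."""
    if tool.risk_level == "critical":
        return "reject"
    if tool.requires_approval:
        return "interrupt"
    amount = call.args.get("amount", 0)
    # bool is a subclass of int in Python, so exclude it explicitly
    is_number = isinstance(amount, (int, float)) and not isinstance(amount, bool)
    if is_number and amount > limit:
        return "interrupt"
    return "allow"


outcomes = [
    decide(Tool("drop_table", risk_level="critical"), ToolCall("drop_table")),
    decide(Tool("transfer_funds", requires_approval=True),
           ToolCall("transfer_funds", {"amount": 50})),
    decide(Tool("pay_invoice"), ToolCall("pay_invoice", {"amount": 25_000})),
    decide(Tool("pay_invoice"), ToolCall("pay_invoice", {"amount": 120})),
]
```

The order of the checks matters: a critical tool is rejected even if it is also flagged for approval, so a human is never asked to approve something the policy forbids outright.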

Pattern 2: Integrating Interceptors into the Runtime Loop

The interceptor only works if the runtime loop treats policy evaluation as mandatory. Policy evaluation happens after the model outputs tool calls and before those calls are sent to the ToolExecutor.

# Inside AgentRuntime.run
if response.has_tool_calls:
    for tool_call in response.tool_calls:
        # Intercept tool call
        decision = self.policy.evaluate(tool_call, state, self.tool_registry)

        if decision.action == "reject":
            # Append rejection notice to messages so the LLM knows why it failed
            state.messages.append(
                Message(role="assistant", content=f"Tool call rejected: {decision.reason}")
            )
            state.status = "rejected"
            await self.checkpointer.save(state)
            return state # Exit early

        if decision.action == "interrupt":
            state.status = "paused"
            state.pending_approvals.append(tool_call)

    if state.status == "paused":
        # Stateful Pause: Save state immediately and return to release the thread
        await self.checkpointer.save(state)
        return state

This control loop captures a major production principle. The runtime must behave like a gatekeeper, not like a blind executor.

If a tool call is rejected, the rejection is written back into the agent state. That matters because the reason is preserved with the run: an audit, a retry, or a later model turn on the same state can see why the action failed.

If a tool call requires approval, the runtime does not block the process. It marks the state as paused, stores the pending approval, checkpoints the state, and returns.

That last step is critical.

Why Stateful Interrupts Matter

A common anti-pattern in early agent code is this:

input("Approve tool call? Y/N")

That may work in a demo. It does not work in production.

In a SaaS backend, Slack bot, workflow engine, or distributed application, blocking a thread while waiting for a human is dangerous. The human may respond in five seconds, five hours, or never.

The client connection may disconnect. The server may restart. The process may be rescheduled.

A production AI runtime cannot depend on an in-memory blocking loop. Human approval must be treated as an asynchronous workflow.

When a tool call requires approval, the runtime saves the current state with a status of paused. Then it returns immediately. The agent effectively dies in memory, but its execution context is preserved in durable storage.

This is similar to enterprise workflow systems. A loan approval does not keep a server thread alive while waiting for a manager. A procurement workflow does not block the application until finance responds.

The workflow persists state, waits externally, and resumes when a decision arrives. Agent runtimes need the same pattern.
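The pause-and-return behaviour can be sketched in a few lines. This is a minimal illustration rather than the runtime's actual code: PausedRun, InMemoryCheckpointer, and pause_for_approval are hypothetical names, and the dictionary-backed store only stands in for a durable backend.

```python
import asyncio
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PausedRun:
    """Hypothetical, trimmed-down stand-in for the article's RuntimeState."""
    run_id: str
    status: str = "running"
    pending_approvals: list = field(default_factory=list)


class InMemoryCheckpointer:
    """Illustrative stand-in for a durable store; it keeps JSON strings only."""

    def __init__(self) -> None:
        self._store = {}

    async def save(self, run: PausedRun) -> None:
        # Serialising forces the state to be plain data, as a real backend would.
        self._store[run.run_id] = json.dumps(asdict(run))

    async def load(self, run_id: str) -> PausedRun:
        return PausedRun(**json.loads(self._store[run_id]))


async def pause_for_approval(cp: InMemoryCheckpointer, run: PausedRun, tool_call: dict) -> PausedRun:
    """Mark the run as paused, checkpoint it, and return without waiting."""
    run.status = "paused"
    run.pending_approvals.append(tool_call)
    await cp.save(run)
    return run  # nothing blocks on the human here


async def demo() -> PausedRun:
    cp = InMemoryCheckpointer()
    run = PausedRun(run_id="run-1")
    await pause_for_approval(cp, run, {"name": "issue_refund", "args": {"amount": 900}})
    del run  # simulate the process forgetting the in-memory agent
    return await cp.load("run-1")  # later, possibly on another worker


restored = asyncio.run(demo())
print(restored.status, restored.pending_approvals)
```

The point of the sketch is that the only thing that survives the pause is serialised data. Anything the agent needs in order to continue has to live in the checkpoint, not in local variables.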

This is not only a safety feature. It is an enterprise architecture feature.

Pattern 3: The Resume Handshake

Once an agent is paused, it needs a secure way to wake up. That is the resume handshake.

A human may review the pending tool call through a Slack button, web portal, admin console, or API. Once the decision is submitted, the runtime resumes the agent from its checkpointed state.

async def resume_after_approval(self, run_id: str, approved: bool) -> RuntimeState:
    # 1. Reconstruct: Load state from checkpointer
    state = await self.checkpointer.load(run_id)

    if not state.pending_approvals:
        raise ValueError("No pending approvals found for this run")

    self.tracer.record(
        state.run_id,
        "approval_decision",
        {"approved": approved, "tool_calls": [tc.__dict__ for tc in state.pending_approvals]}
    )

    if not approved:
        # Human rejected: Record rejection and close the loop
        state.status = "rejected"
        state.messages.append(Message(role="assistant", content="Approval rejected by administrator."))
        state.pending_approvals = []
        await self.checkpointer.save(state)
        return state

    # Human approved: Move pending calls to active and execute
    tool_calls = state.pending_approvals
    state.pending_approvals = []
    state.status = "running"

    for tool_call in tool_calls:
        result = await self.tool_registry.execute(tool_call)
        state.tool_results.append(result)
        state.messages.append(
            Message(
                role="tool",
                name=tool_call.name,
                content=str(result.data if result.status == "success" else result.error),
            )
        )

    # 2. Persist Progress: Save state checkpoint
    await self.checkpointer.save(state)

    # 3. Continue Loop: Re-enter the primary execution loop
    return await self.run(state)

This handshake does three important things. First, it reconstructs the agent state from persistent storage. Second, it records the approval decision for auditability. Third, it either rejects the action or executes the approved tool calls.

After that, the agent continues its run.

This creates a clean separation between model reasoning, policy evaluation, human approval, tool execution, state persistence, and audit tracing.

That separation matters in enterprise AI. Once agents interact with real systems, every action needs to be explainable.

Who approved it? What was approved? What parameters were used? What did the tool return? What state did the agent continue from?
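One way to make those questions answerable is to write a single immutable record per approval decision. The sketch below is an assumption-laden illustration: ApprovalAuditRecord, build_audit_record, and every field name are hypothetical. The resume_after_approval signature shown earlier only receives run_id and approved, so the approver identity here is assumed to come from whatever authenticated layer calls the runtime.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ApprovalAuditRecord:
    """Hypothetical audit entry; one field per governance question."""
    run_id: str                  # which run was resumed
    approver: str                # who approved it (assumed to come from the auth layer)
    approved: bool               # what the decision was
    tool_name: str               # what was approved
    tool_args: dict              # what parameters were used
    tool_result: Optional[str]   # what the tool returned (None if rejected)
    resumed_from_step: int       # what state the agent continued from
    decided_at: str              # ISO 8601 timestamp in UTC


def build_audit_record(run_id, approver, approved, tool_name, tool_args,
                       tool_result=None, resumed_from_step=0, decided_at=None):
    """Create the record, stamping the current UTC time unless one is supplied."""
    if decided_at is None:
        decided_at = datetime.now(timezone.utc).isoformat()
    return ApprovalAuditRecord(
        run_id=run_id,
        approver=approver,
        approved=approved,
        tool_name=tool_name,
        tool_args=dict(tool_args),  # copy so later mutation does not alter the record
        tool_result=tool_result,
        resumed_from_step=resumed_from_step,
        decided_at=decided_at,
    )


record = build_audit_record(
    run_id="run-42",
    approver="finance.manager@example.com",
    approved=True,
    tool_name="issue_refund",
    tool_args={"amount": 900},
    tool_result="refund queued",
    resumed_from_step=7,
    decided_at="2026-01-01T00:00:00+00:00",
)
audit_line = json.dumps(asdict(record), sort_keys=True)
print(audit_line)
```

Emitting the record as one JSON line keeps it easy to ship to whichever trace or log sink the runtime already uses.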

These are not minor implementation details. They are governance requirements.

Designing Governance with the Chain of Responsibility

The first version of a policy evaluator may start with simple checks: block critical tools, require approval for high-risk tools, and interrupt large transactions.

But enterprise policy rarely stays simple.

Different teams may need different rules. Different tools may carry different risks. Different environments may have different thresholds. Different users may have different approval limits.

Hardcoding all of that into one large PolicyEvaluator becomes messy. This is where the Chain of Responsibility Pattern helps.

Instead of one monolithic evaluator, the runtime can maintain a list of security rules. Each rule checks the tool call. It either returns a decision or passes control to the next rule.

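from abc import ABC, abstractmethod

class SecurityRule(ABC):
    """One link in the chain: return a decision, or None to defer to the next rule."""

    @abstractmethod
    def validate(self, tool_call: ToolCall, state: RuntimeState) -> PolicyDecision | None:
        pass

class BlacklistRule(SecurityRule):
    # Tools that must never run, regardless of arguments
    BANNED = {"delete_user", "format_disk"}

    def validate(self, tool_call: ToolCall, state: RuntimeState) -> PolicyDecision | None:
        if tool_call.name in self.BANNED:
            return PolicyDecision("reject", "Banned system command")
        return None

class BudgetRule(SecurityRule):
    def validate(self, tool_call: ToolCall, state: RuntimeState) -> PolicyDecision | None:
        # Calls without a "cost" argument are treated as free
        cost = tool_call.args.get("cost", 0)
        if cost > 500:
            return PolicyDecision("interrupt", "Cost requires budget-holder review")
        return None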

The evaluator walks through these rules in order, and the first rule to return a decision wins. This keeps policy modular.
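The chaining step itself is not shown above, so here is a minimal, self-contained sketch of how it might look. The ToolCall and PolicyDecision dataclasses are simplified stand-ins for the runtime's own types. The "allow" default when no rule objects is an assumption, and the evaluate signature here omits the tool registry argument used in the runtime loop.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolCall:
    """Simplified stand-in for the runtime's ToolCall."""
    name: str
    args: dict = field(default_factory=dict)


@dataclass
class PolicyDecision:
    """Simplified stand-in: an action plus a human-readable reason."""
    action: str
    reason: str = ""


class SecurityRule(ABC):
    @abstractmethod
    def validate(self, tool_call: ToolCall, state) -> Optional[PolicyDecision]:
        """Return a decision, or None to pass control to the next rule."""


class BlacklistRule(SecurityRule):
    def validate(self, tool_call, state):
        if tool_call.name in {"delete_user", "format_disk"}:
            return PolicyDecision("reject", "Banned system command")
        return None


class BudgetRule(SecurityRule):
    def validate(self, tool_call, state):
        if tool_call.args.get("cost", 0) > 500:
            return PolicyDecision("interrupt", "Cost requires budget-holder review")
        return None


class PolicyEvaluator:
    """Walks the rules in order; the first non-None decision wins."""

    def __init__(self, rules):
        self.rules = list(rules)

    def evaluate(self, tool_call: ToolCall, state=None) -> PolicyDecision:
        for rule in self.rules:
            decision = rule.validate(tool_call, state)
            if decision is not None:
                return decision
        # Assumption: a call that no rule objects to is allowed.
        return PolicyDecision("allow", "No rule objected")


evaluator = PolicyEvaluator([BlacklistRule(), BudgetRule()])
for call in (
    ToolCall("delete_user", {"cost": 900}),
    ToolCall("purchase_license", {"cost": 900}),
    ToolCall("search_docs"),
):
    print(call.name, "->", evaluator.evaluate(call).action)
```

Rule order is part of the policy: because BlacklistRule comes first, a banned tool is rejected outright even when its cost would otherwise only trigger an approval interrupt.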

A security team can own blacklist rules. A finance team can own budget rules. A data governance team can own sensitive data rules. A platform team can own runtime-level safety rules.

This is the enterprise architecture parallel. A good AI runtime should not bury governance inside prompts. It should not scatter safety logic across tool code.

It should expose governance as a composable policy layer.

This distinction matters because many current agent frameworks focus heavily on model orchestration, tool registration, memory, and workflow composition. Those are important, but they do not automatically solve runtime governance.

A framework may allow tools to be registered, but that does not mean every tool call is policy-mediated. A framework may support callbacks or middleware, but that does not mean it has a clear enterprise approval model.

A framework may log events, but logging after execution is not the same as preventing unsafe execution.

SolidState addresses this gap directly. Tool calls are not treated as simple function calls. They are treated as governed execution requests.

That is why the policy evaluator sits between the model and the tool executor. That is why high-risk actions can be interrupted. That is why the state is checkpointed before waiting for human approval.

That is also why the resume handshake is explicit rather than hidden inside an ad hoc callback.

This is not just implementation hygiene. It is the difference between an agent that can call tools and an agent runtime that can operate safely inside an enterprise.

Safety Is a First-Class Architecture Concern

Safety is not an afterthought. It is not solved by adding “please do not delete tables” to the system prompt. It must be designed into the runtime.

Placing interceptors before tool execution makes the runtime safer. Validating tool parameters makes it more controlled. Converting blocking approvals into stateful interrupts makes it production-ready.

A secure resume handshake makes the agent auditable and resilient. It can still reason, act, and complete useful workflows, but it acts within controlled boundaries.

That is the real foundation for enterprise-grade agentic AI. Not just better prompts. Not just better models. A better execution architecture.

In the next article, “Time Travel & Memory: Persistence Engines, State Checkpointing, and Event Sourcing,” we will look at how agents survive server crashes, client disconnects, and long-running workflows by saving, loading, and auditing state across distributed backends such as Redis and AWS DynamoDB.

References

Schmidt, D., Stal, M., Rohnert, H., & Buschmann, F. Pattern-Oriented Software Architecture, Volume 2: Patterns for Concurrent and Networked Objects. Used here for the Interceptor Pattern lineage.
Gamma, E., Helm, R., Johnson, R., & Vlissides, J. Design Patterns: Elements of Reusable Object-Oriented Software. Used here for the Chain of Responsibility Pattern lineage.
Durable workflow and human-in-the-loop orchestration practices. Used here to relate stateful pause, durable execution, and resume workflows to production AI agent runtimes.
