LangGraph: Building Stateful Agentic Workflows That Actually Work

The Problem with Simple Agent Loops

The basic ReAct loop — observe, think, act, repeat — works fine for toy demos. Give an LLM a list of tools, point it at a task, watch it reason its way to an answer. Clean. Simple.

Then you try to run it on anything real.

The agent hallucinates a tool call. It gets stuck in a loop. It succeeds on step 8 but you have no way to checkpoint and resume from step 7 if something downstream fails. You need a human to review an intermediate result before the agent continues, but there's no hook for that. You need two agents coordinating, but they have no shared state.

LangGraph addresses each of these. It models your agent as an explicit graph: nodes are functions, edges are control flow, and state is a typed object that flows through the whole thing. Cycles and branching are part of the graph model itself, and checkpointing and human-in-the-loop pauses become available once you attach a checkpointer.

Core Concepts

State

Everything in LangGraph flows through a state object. You define it as a TypedDict:

from typing import Annotated
from typing_extensions import TypedDict
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    messages: Annotated[list, add_messages]
    tool_calls_made: int
    requires_human_review: bool

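add_messages is a reducer: it appends new messages to the list instead of replacing it, and it updates an existing message in place when the IDs match. Fields without a reducer are simply overwritten by each update. You can write your own reducers for any field that needs merge semantics instead of overwrite.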
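Here is a rough sketch of what a custom reducer can look like. The reducer itself (merge_unique) is just a function that takes the current value and the update and returns the merged value. ResearchState and apply_update are hypothetical names I made up for illustration, and apply_update is only a toy stand-in for the merge step so the example runs without LangGraph installed. It is not how the library implements it.

```python
from typing import Annotated, TypedDict, get_type_hints


def merge_unique(existing: list[str], new: list[str]) -> list[str]:
    """Reducer: append items from `new` that are not already present, keeping order."""
    merged = list(existing)
    for item in new:
        if item not in merged:
            merged.append(item)
    return merged


class ResearchState(TypedDict):
    # Field with a reducer: updates are merged into the current value.
    sources: Annotated[list[str], merge_unique]
    # Field without a reducer: updates overwrite the current value.
    status: str


def apply_update(state: dict, update: dict) -> dict:
    """Toy stand-in for the merge step (an assumption, NOT LangGraph's code)."""
    hints = get_type_hints(ResearchState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated[...] types carry their extra arguments in __metadata__.
        metadata = getattr(hints[key], "__metadata__", ())
        reducer = metadata[0] if metadata else None
        if reducer is not None and key in state:
            merged[key] = reducer(state[key], value)
        else:
            merged[key] = value
    return merged


state = {"sources": ["a.com"], "status": "searching"}
state = apply_update(state, {"sources": ["a.com", "b.org"], "status": "done"})
print(state)  # {'sources': ['a.com', 'b.org'], 'status': 'done'}
```

The duplicate "a.com" is dropped by the reducer, while status is overwritten because it has none.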

Nodes and Edges

Nodes are plain Python functions that take state and return a partial state update:

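from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import ToolNode

# `tools` is the list of tools your application defines elsewhere.
llm = ChatAnthropic(model="claude-sonnet-4-6")
llm_with_tools = llm.bind_tools(tools)

def call_model(state: AgentState) -> dict:
    # Nodes return only the keys they update; reducers merge them into the state.
    response = llm_with_tools.invoke(state["messages"])
    return {
        "messages": [response],
        # Count the tool calls requested in this turn, not the number of model calls.
        "tool_calls_made": state.get("tool_calls_made", 0) + len(response.tool_calls),
    }

tool_node = ToolNode(tools)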

Edges connect nodes. Conditional edges let you branch based on state:

from langgraph.graph import StateGraph, END

def route(state: AgentState) -> str:
    # .get() avoids a KeyError when the caller never set the flag in the input.
    if state.get("requires_human_review", False):
        return "human_review"
    last = state["messages"][-1]
    if getattr(last, "tool_calls", None):
        return "tools"
    return END

graph = StateGraph(AgentState)
graph.add_node("agent", call_model)
graph.add_node("tools", tool_node)
graph.add_node("human_review", human_review_node)  # human_review_node is defined elsewhere in your application

graph.set_entry_point("agent")
graph.add_conditional_edges("agent", route)
graph.add_edge("tools", "agent")
graph.add_edge("human_review", "agent")

app = graph.compile()
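Because the routing function is plain Python, you can check its branches without a model or a compiled graph. The sketch below is one way to do that, under a few assumptions: fake_message is a hypothetical stand-in for an AI message object, END is hard-coded to the string I believe langgraph.graph.END resolves to, and route is a copy of the routing logic that reads the optional flag with .get().

```python
from types import SimpleNamespace

END = "__end__"  # assumption: mirrors the value of langgraph.graph.END


def route(state: dict) -> str:
    """Routing logic under test: review flag first, then tool calls, else finish."""
    if state.get("requires_human_review", False):
        return "human_review"
    last = state["messages"][-1]
    if getattr(last, "tool_calls", None):
        return "tools"
    return END


def fake_message(tool_calls=None):
    """Hypothetical stand-in for an AI message; only the fields route() reads."""
    return SimpleNamespace(content="", tool_calls=tool_calls or [])


search_call = {"name": "search", "args": {"query": "Q3 revenue"}, "id": "call-1"}

# The model asked for a tool: go to the tools node.
assert route({"messages": [fake_message([search_call])]}) == "tools"

# No tool calls and no review flag: the run ends.
assert route({"messages": [fake_message()]}) == END

# The review flag wins even when tool calls are pending.
flagged = {"messages": [fake_message([search_call])], "requires_human_review": True}
assert route(flagged) == "human_review"
```

The last case is worth noticing: when the flag is set, pending tool calls are skipped, so the review step has to decide what happens to them.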

Checkpointing

This is LangGraph's killer feature for production. Add a checkpointer and every state transition is persisted:

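from langchain_core.messages import HumanMessage
from langgraph.checkpoint.memory import MemorySaver

checkpointer = MemorySaver()
app = graph.compile(checkpointer=checkpointer)

# The thread_id selects which saved run (thread) to read from and write to.
config = {"configurable": {"thread_id": "user-session-42"}}

# Run: state is checkpointed under this thread_id as the graph executes
result = app.invoke({"messages": [HumanMessage("Analyze Q3 revenue data")]}, config)

# Resume after a failure or an interrupt
state = app.get_state(config)  # latest checkpoint: state.values, state.next
app.invoke(None, config)  # passing None as input continues from the last checkpoint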

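For production, swap MemorySaver for a durable checkpointer such as SqliteSaver or PostgresSaver. These ship in separate packages (langgraph-checkpoint-sqlite and langgraph-checkpoint-postgres) and each has its own connection setup, but the compile(checkpointer=...) call and the thread_id config stay the same.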

A Real Pattern: Research + Synthesis Agent

Here's a pattern I use often — a two-phase agent that researches with tools, pauses for human approval, then synthesizes a final report:

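def should_review(state: AgentState) -> str:
    # Require human sign-off after N tool calls or if flagged
    if state.get("tool_calls_made", 0) >= 5 or state.get("requires_human_review", False):
        return "human_review"
    last = state["messages"][-1]
    if getattr(last, "tool_calls", None):
        return "tools"
    return "synthesize"

# should_review takes the place of route as the conditional edge out of "agent".
# human_review_node should reset tool_calls_made and requires_human_review,
# otherwise this branch fires again on the next pass through "agent".
graph.add_node("synthesize", synthesis_node)
graph.add_conditional_edges("agent", should_review)
graph.add_edge("synthesize", END)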

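The human_review node is where the graph pauses. Compile with a checkpointer and interrupt_before=["human_review"] (or call interrupt() inside the node), and the run stops there so the current state can be surfaced to a UI or Slack approval flow. Once approved, the graph resumes from that node.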

When to Use LangGraph vs. Simpler Alternatives

Scenario	Recommended approach
Single-turn Q&A with tools	`create_react_agent` (prebuilt)
Multi-step with retry logic	LangGraph basic graph
Long-running with checkpoints	LangGraph + persistent checkpointer
Human-in-the-loop required	LangGraph + interrupt
Multi-agent coordination	LangGraph multi-agent with shared state

What I've Learned

The biggest shift when moving from simple chains to LangGraph is accepting that state is explicit. You can't hide it in closure variables or rely on message history alone. Everything the agent needs to know must be in the state object. This feels like overhead at first, but it's what makes the system observable, resumable, and testable.

The second thing: start with create_react_agent from langgraph.prebuilt. It covers most simple tool-calling agents with almost no boilerplate. Only reach for the full graph API when you need custom branching, extra nodes such as a review or synthesis step, fine-grained control over checkpoints and interrupts, or multi-agent coordination.

LangGraph is not magic. It's a way to make the control flow of your agent as explicit and inspectable as the rest of your code. That's exactly what production systems need.