475 Cumulus
Guide

Build an agent with LangChain — a practical tutorial

Step-by-step guide to building a tool-calling agent with LangChain and LangGraph, from first prototype to patterns that survive production.

Tags: agents, langchain, integration, tool-calling

An agent is an LLM that can decide which actions to take, call tools against your systems, observe the results, and loop until it has enough information to answer. LangChain's ecosystem — especially LangGraph — gives you a structured way to build that loop without hand-rolling state management on every project.

This tutorial walks through a working agent in Python: define tools, wire a ReAct loop, add memory, then cover what changes when you move from notebook to production inside an existing web app.

What you will build

A small support research agent that can:

  1. Look up a customer account from a mock API
  2. Search internal documentation
  3. Summarize findings for a human agent

The agent uses the ReAct pattern (Reason + Act): the model thinks, picks a tool, reads the result, and repeats until it responds.

Prerequisites

  • Python 3.11+
  • An OpenAI, Anthropic, or other provider API key
  • ~30 minutes

Install dependencies:

pip install langgraph langchain-openai langchain-core python-dotenv

Create a .env file:

OPENAI_API_KEY=sk-...
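
The install line includes python-dotenv, whose `load_dotenv()` function reads `.env` into the process environment; it has to run before the model client is constructed. A minimal fail-fast check is sketched below. The helper name `require_env` is hypothetical and not part of any library.

```python
import os

# If you use python-dotenv (installed above), call this before the check:
#   from dotenv import load_dotenv
#   load_dotenv()  # reads KEY=value pairs from .env into os.environ


def require_env(name: str) -> str:
    """Return the variable's value, or raise a clear error if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env or export it in your shell")
    return value
```

Calling `require_env("OPENAI_API_KEY")` once at startup turns a missing key into an immediate, readable error instead of a failure on the first model call.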

Step 1: Define tools

Tools are plain Python functions the model can invoke. Use the @tool decorator so LangChain generates a schema the model understands.

from langchain_core.tools import tool
 
@tool
def get_account(account_id: str) -> str:
    """Fetch account details by ID. Use when the user mentions a customer or account number."""
    # In production, call your CRM or billing API here — with auth and tenant scoping.
    accounts = {
        "acme-001": {"name": "Acme Corp", "plan": "Enterprise", "mrr": 4200},
        "beta-002": {"name": "Beta Labs", "plan": "Pro", "mrr": 890},
    }
    account = accounts.get(account_id)
    if not account:
        return f"No account found for {account_id}"
    return str(account)
 
 
@tool
def search_docs(query: str) -> str:
    """Search internal documentation. Use for policy, billing, or feature questions."""
    docs = {
        "refund": "Refunds within 30 days for annual plans. Pro-rata for monthly.",
        "sso": "SSO available on Enterprise. SAML and OIDC supported.",
        "api limits": "Pro: 1k req/min. Enterprise: custom limits with SLA.",
    }
    query_lower = query.lower()
    for key, text in docs.items():
        if key in query_lower:
            return text
    return "No matching documentation found."

Keep tool descriptions specific. The model chooses tools based on names and docstrings — vague descriptions lead to wrong calls.
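
To see why the name and docstring matter, it can help to look at roughly what the model receives. The sketch below builds a simplified description from a function using only the standard library. It is an illustration, not LangChain's actual schema generation, which is richer; `describe_tool` is a hypothetical helper.

```python
import inspect
from typing import Callable


def describe_tool(fn: Callable) -> dict:
    """Build a simplified tool description from a function's name, docstring, and parameters."""
    params = {}
    for name, param in inspect.signature(fn).parameters.items():
        annotation = param.annotation
        if annotation is inspect.Parameter.empty:
            params[name] = "any"  # no type hint: the model gets no guidance on the argument
        else:
            params[name] = getattr(annotation, "__name__", str(annotation))
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": params,
    }


def get_account(account_id: str) -> str:
    """Fetch account details by ID. Use when the user mentions a customer or account number."""
    return account_id


schema = describe_tool(get_account)
print(schema)
```

Everything the model knows about the tool is in that dictionary, so an empty docstring or a parameter named `x` leaves it guessing.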

Step 2: Create the agent with LangGraph

LangGraph's create_react_agent builds the full ReAct loop — LLM node, tool node, conditional routing — in one call.

from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
 
tools = [get_account, search_docs]
 
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
 
agent = create_react_agent(
    model=llm,
    tools=tools,
    prompt=(
        "You are a support research assistant. "
        "Use tools to look up facts before answering. "
        "If you cannot find information, say so — do not guess."
    ),
)

Run it:

result = agent.invoke({
    "messages": [
        {
            "role": "user",
            "content": "Account acme-001 asked about refunds. What is their plan and what is our policy?",
        }
    ]
})
 
# Print the final assistant message
print(result["messages"][-1].content)

Expected behavior: the agent calls get_account with "acme-001" and search_docs with a refund-related query (in sequence or in the same turn, depending on the model), then synthesizes an answer that cites both results.

Step 3: Understand the loop

Under the hood, create_react_agent compiles a graph that looks like this:

User message → Agent (LLM) → Tool calls? → Tools → Agent → … → Final answer

| Step | What happens |
| --- | --- |
| Agent node | LLM receives messages + tool schemas; returns text or tool calls |
| Tools node | Executes selected tools; appends results as tool messages |
| Routing | If tools were called, loop back to agent; otherwise, end |
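
The same three steps can be sketched in plain Python with no framework. Here `scripted_model` is a hypothetical stand-in for the LLM and the tools are stubs, so this only illustrates the control flow and is not how LangGraph is implemented.

```python
from typing import Callable

# Hypothetical stand-ins: a real agent would call an LLM instead of scripted_model.
TOOLS: dict[str, Callable[[str], str]] = {
    "get_account": lambda account_id: f"account {account_id}: plan=Enterprise",
    "search_docs": lambda query: "Refunds within 30 days for annual plans.",
}


def scripted_model(messages: list[dict]) -> dict:
    """Pretend LLM: request one tool per turn, then answer once both results exist."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if len(tool_results) == 0:
        return {"role": "assistant", "tool_call": ("get_account", "acme-001")}
    if len(tool_results) == 1:
        return {"role": "assistant", "tool_call": ("search_docs", "refund")}
    return {"role": "assistant", "content": "Enterprise plan; refunds within 30 days."}


def run_agent(messages: list[dict], max_steps: int = 10) -> list[dict]:
    """Alternate between the agent node and the tools node until no tool is requested."""
    messages = list(messages)
    for _ in range(max_steps):
        reply = scripted_model(messages)  # agent node
        messages.append(reply)
        if "tool_call" not in reply:  # routing: no tool call means the loop ends
            return messages
        name, argument = reply["tool_call"]  # tools node
        messages.append({"role": "tool", "name": name, "content": TOOLS[name](argument)})
    raise RuntimeError(f"agent did not finish within {max_steps} steps")


history = run_agent([{"role": "user", "content": "Plan and refund policy for acme-001?"}])
print(history[-1]["content"])
```

The `max_steps` cap plays the role that a recursion limit plays in a real graph: it turns an endless tool loop into an explicit error.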

When you need custom behavior — approval gates before destructive tools, parallel branches, or retrieval as a separate node — build the graph manually with StateGraph:

from langgraph.graph import StateGraph, START, END, MessagesState
from langgraph.prebuilt import ToolNode, tools_condition
 
def agent_node(state: MessagesState):
    response = llm.bind_tools(tools).invoke(state["messages"])
    return {"messages": [response]}
 
builder = StateGraph(MessagesState)
builder.add_node("agent", agent_node)
builder.add_node("tools", ToolNode(tools))
builder.add_edge(START, "agent")
builder.add_conditional_edges("agent", tools_condition)
builder.add_edge("tools", "agent")
 
graph = builder.compile()

Use the prebuilt helper until this level of control is justified.

Step 4: Add conversation memory

Agents in product UIs need thread history. LangGraph checkpointing persists state between turns.

from langgraph.checkpoint.memory import MemorySaver
 
memory = MemorySaver()
 
agent = create_react_agent(
    model=llm,
    tools=tools,
    checkpointer=memory,
)
 
config = {"configurable": {"thread_id": "support-ticket-8842"}}
 
# Turn 1
agent.invoke(
    {"messages": [("user", "Look up account acme-001")]},
    config=config,
)
 
# Turn 2 — same thread, agent remembers context
agent.invoke(
    {"messages": [("user", "Now check refund policy for them")]},
    config=config,
)

In production, swap MemorySaver for a persistent checkpointer (Postgres, Redis) so state survives process restarts and works across serverless instances.

Step 5: Expose it behind your API

Do not call the agent directly from a browser. Wrap it in server-side middleware — the same pattern as any LLM feature in your stack.

# FastAPI example (conceptual)
from fastapi import Depends, FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    thread_id: str
    message: str

async def get_current_user():
    # Placeholder dependency: replace with your real session or token check,
    # returning a user object that has role and tenant_id attributes.
    raise NotImplementedError

@app.post("/api/support-agent")
async def chat(req: ChatRequest, user=Depends(get_current_user)):
    # 1. Auth: user must have support role
    # 2. Rate limit per user / tenant
    # 3. Invoke agent with thread_id scoped to tenant
    config = {"configurable": {"thread_id": f"{user.tenant_id}:{req.thread_id}"}}
    # ainvoke avoids blocking the event loop during the model call
    result = await agent.ainvoke(
        {"messages": [("user", req.message)]},
        config=config,
    )
    return {"reply": result["messages"][-1].content}
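
One detail in the handler deserves care: the thread ID is built by joining a trusted tenant ID with a client-supplied string. If the client value may contain the separator, two different tenant/thread pairs could map to the same key. A small validation helper, sketched here under the assumption that IDs are short slugs, closes that gap; `scoped_thread_id` and the allowed pattern are illustrative choices, not part of LangGraph.

```python
import re

# Assumption: IDs are short slugs; adjust the pattern to your own ID format.
_SAFE_ID = re.compile(r"[A-Za-z0-9_-]{1,64}")


def scoped_thread_id(tenant_id: str, client_thread_id: str) -> str:
    """Join tenant and thread IDs, rejecting values that could collide with another tenant's key."""
    for label, value in (("tenant_id", tenant_id), ("thread_id", client_thread_id)):
        if not _SAFE_ID.fullmatch(value):
            raise ValueError(f"invalid {label}: {value!r}")
    return f"{tenant_id}:{client_thread_id}"
```

In the handler above, the result would replace the inline f-string when building `config`.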

Figure: Request flow through LLM middleware. Client UI (copilot, search, actions) → Your API (existing auth session, middleware) → LLM middleware (auth, rate limits, logging) → Model provider (OpenAI, Anthropic, etc.). The LLM middleware injects tenant-scoped context, enforces tool permissions, and records tokens and latency.

Every model call passes through your stack, not around it.

Key integration points:

  • Auth before invoke — the agent never runs without a verified identity
  • Tenant-scoped thread IDs — prevent cross-customer memory leakage
  • Structured logging — log tool calls, latency, and token usage per request
  • Timeouts and step limits — cap agent loops so runaway tool chains cannot burn budget
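
For the structured logging point, one possible shape is a wrapper that times a call and emits a single JSON line per invocation. This is a sketch using only the standard library; `timed_call` and the field names are assumptions, and token counts (which come from the provider's response) are not shown.

```python
import json
import logging
import time
from typing import Any, Callable

logger = logging.getLogger("agent")


def timed_call(name: str, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> tuple[Any, dict]:
    """Run fn, then emit one JSON log line with its name, outcome, and latency."""
    record = {"event": "tool_call", "tool": name, "ok": True}
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
    except Exception as exc:
        record["ok"] = False
        record["error"] = type(exc).__name__
        raise
    finally:
        # Runs on success and on failure, so every call produces exactly one log line.
        record["latency_ms"] = round((time.perf_counter() - start) * 1000, 2)
        logger.info(json.dumps(record))
    return result, record
```

Logging one flat JSON object per call keeps the output easy to aggregate by tool name or tenant later.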

Step 6: Add guardrails

A tutorial agent is permissive. Production agents need boundaries:

| Guardrail | Implementation |
| --- | --- |
| Max tool iterations | recursion_limit set in the config passed to invoke |
| Allowed tools per role | Filter tool list based on user permissions |
| Confirmation gates | Custom graph node that pauses before destructive tools |
| Output validation | Pydantic schema or second-pass check on structured responses |
| Refusal on missing context | System prompt + eval tests for "I don't know" cases |

result = agent.invoke(
    {"messages": [("user", message)]},
    # recursion_limit is a top-level config key, not an invoke() keyword argument
    config={**config, "recursion_limit": 10},  # stop after 10 graph steps
)
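
The "allowed tools per role" row can be as simple as filtering the tool list before the agent is built for a request. The sketch below assumes a hypothetical `ROLE_TOOLS` mapping; in practice the mapping would come from your permission system. It matches on a `name` attribute when present (as on LangChain tool objects) and falls back to the function name.

```python
# Hypothetical role-to-tool mapping; replace with a lookup in your permission system.
ROLE_TOOLS: dict[str, set[str]] = {
    "support": {"get_account", "search_docs"},
    "viewer": {"search_docs"},
}


def tools_for_role(role: str, all_tools: list) -> list:
    """Return only the tools this role may use; unknown roles get no tools."""
    allowed = ROLE_TOOLS.get(role, set())
    return [
        t for t in all_tools
        if getattr(t, "name", getattr(t, "__name__", "")) in allowed
    ]


# Stub tools so the example runs on its own.
def get_account(account_id: str) -> str:
    return f"account {account_id}"


def search_docs(query: str) -> str:
    return f"docs for {query}"


viewer_tools = tools_for_role("viewer", [get_account, search_docs])
print([t.__name__ for t in viewer_tools])
```

Defaulting unknown roles to an empty list means a misconfigured role fails closed instead of exposing every tool.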

Common mistakes

Calling the model from the client. Tool definitions and API keys stay on the server. The UI sends messages; middleware runs the graph.

Tools that bypass authorization. If get_account ignores tenant filters, the agent becomes a prompt-injection path to data leakage.
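
One way to keep tenant filtering out of the model's hands is to bind the tenant when the tool is created, so the model only ever supplies the account ID. The following is a sketch with a hypothetical in-memory store; `make_get_account` is an illustrative name, and wrapping the returned function as a LangChain tool per request is left as an assumption about your setup.

```python
from typing import Callable

# Hypothetical data store keyed by (tenant_id, account_id).
_ACCOUNTS = {
    ("tenant-a", "acme-001"): {"name": "Acme Corp", "plan": "Enterprise"},
    ("tenant-b", "beta-002"): {"name": "Beta Labs", "plan": "Pro"},
}


def make_get_account(tenant_id: str) -> Callable[[str], str]:
    """Return a lookup bound to one tenant; the model only ever supplies account_id."""

    def get_account(account_id: str) -> str:
        """Fetch account details by ID. Use when the user mentions a customer or account number."""
        account = _ACCOUNTS.get((tenant_id, account_id))
        if account is None:
            # Same message for "missing" and "other tenant" so existence is not leaked.
            return f"No account found for {account_id}"
        return str(account)

    return get_account
```

Because the tenant comes from the authenticated session and not from the conversation, a prompt-injected request for another customer's ID returns the same "not found" message as a typo.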

No evals. Agent behavior is non-deterministic. Maintain a golden set of inputs and assert properties of each run: the correct tool was called, a citation is present, and the agent refuses when data is missing.
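
A golden-set check does not need a framework to start with. The sketch below asserts properties of a run instead of exact wording. It assumes you have already extracted the list of tool names called and the final answer text from the agent's messages; `GoldenCase` and `check_case` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class GoldenCase:
    question: str
    expected_tools: list[str]
    must_mention: list[str] = field(default_factory=list)


def check_case(case: GoldenCase, tools_called: list[str], answer: str) -> list[str]:
    """Return a list of failures for one run; an empty list means the case passed."""
    failures = []
    missing = [t for t in case.expected_tools if t not in tools_called]
    if missing:
        failures.append(f"missing tool calls: {missing}")
    unexpected = [t for t in tools_called if t not in case.expected_tools]
    if unexpected:
        failures.append(f"unexpected tool calls: {unexpected}")
    for phrase in case.must_mention:
        if phrase.lower() not in answer.lower():
            failures.append(f"answer does not mention {phrase!r}")
    return failures


case = GoldenCase(
    question="Account acme-001 asked about refunds. What is their plan and what is our policy?",
    expected_tools=["get_account", "search_docs"],
    must_mention=["Enterprise", "30 days"],
)
```

Checking for required phrases and tool names, not full string equality, keeps the suite stable when the model rephrases a correct answer.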

Unbounded loops. A confused model can call tools repeatedly. Set recursion limits and alert on high step counts.

LangChain in TypeScript

LangChain and LangGraph also ship JavaScript/TypeScript packages (@langchain/core, @langchain/langgraph). The concepts map directly — tools, graphs, checkpointing — which makes it straightforward to embed agents in Next.js API routes or Node services. The production constraints are identical: server-side execution, auth, rate limits, observability.

When LangChain is the right choice

LangChain/LangGraph fits when you need:

  • Multi-step tool loops with explicit state
  • Human-in-the-loop approval nodes
  • Persistent conversation threads with checkpointing
  • Complex routing (multiple agents, conditional branches)

For a single-turn Q&A with retrieval, a thinner stack — middleware + RAG + one model call — is often enough. See RAG without the platform rewrite.

For provider routing, streaming, and cost controls shared across features, extract an LLM middleware layer regardless of which orchestration library you pick.

Next steps

From here, typical progression looks like:

  1. Replace mock tools with calls to your staging APIs
  2. Add evals — 20–30 golden questions with expected tool usage
  3. Wire streaming — LangGraph supports streaming graph events to the UI
  4. Ship behind a feature flag — internal users first, measure tool failure rate and cost per session

Building an agent is the easy part. Making it permissioned, observable, and cost-bounded inside your product is the integration work. Get in touch if you want help productionizing a LangChain agent in your stack — or compare LangGraph against lighter patterns for your use case.