Build an agent with LangChain — a practical tutorial
Step-by-step guide to building a tool-calling agent with LangChain and LangGraph, from first prototype to patterns that survive production.
An agent is an LLM that can decide which actions to take, call tools against your systems, observe the results, and loop until it has enough information to answer. LangChain's ecosystem — especially LangGraph — gives you a structured way to build that loop without hand-rolling state management on every project.
This tutorial walks through a working agent in Python: define tools, wire a ReAct loop, add memory, then cover what changes when you move from notebook to production inside an existing web app.
What you will build
A small support research agent that can:
- Look up a customer account from a mock API
- Search internal documentation
- Summarize findings for a human agent
The agent uses the ReAct pattern (Reason + Act): the model thinks, picks a tool, reads the result, and repeats until it responds.
Prerequisites
- Python 3.11+
- An OpenAI, Anthropic, or other provider API key
- ~30 minutes
Install dependencies:
```bash
pip install langgraph langchain-openai langchain-core python-dotenv
```
Create a .env file:
```
OPENAI_API_KEY=sk-...
```
Step 1: Define tools
Tools are plain Python functions the model can invoke. Use the @tool decorator so LangChain generates a schema the model understands.
```python
from langchain_core.tools import tool


@tool
def get_account(account_id: str) -> str:
    """Fetch account details by ID. Use when the user mentions a customer or account number."""
    # In production, call your CRM or billing API here, with auth and tenant scoping.
    accounts = {
        "acme-001": {"name": "Acme Corp", "plan": "Enterprise", "mrr": 4200},
        "beta-002": {"name": "Beta Labs", "plan": "Pro", "mrr": 890},
    }
    account = accounts.get(account_id)
    if not account:
        return f"No account found for {account_id}"
    return str(account)


@tool
def search_docs(query: str) -> str:
    """Search internal documentation. Use for policy, billing, or feature questions."""
    docs = {
        "refund": "Refunds within 30 days for annual plans. Pro-rata for monthly.",
        "sso": "SSO available on Enterprise. SAML and OIDC supported.",
        "api limits": "Pro: 1k req/min. Enterprise: custom limits with SLA.",
    }
    query_lower = query.lower()
    for key, text in docs.items():
        if key in query_lower:
            return text
    return "No matching documentation found."
```
Keep tool descriptions specific. The model chooses tools based on names and docstrings, so vague descriptions lead to wrong calls.
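To make that concrete, the sketch below builds a rough approximation of what a model is shown for each tool: a name, a description, and argument names with types. It uses only the standard library and is not LangChain's actual schema generation; `describe_tool`, the `lookup` function, and the output layout are assumptions made for illustration.

```python
import inspect


def describe_tool(fn) -> dict:
    """Approximate a tool schema from a function's name, docstring, and type hints."""
    params = inspect.signature(fn).parameters
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "args": {
            name: getattr(p.annotation, "__name__", str(p.annotation))
            for name, p in params.items()
        },
    }


def lookup(x: str) -> str:
    """Look something up."""
    return x


def get_account(account_id: str) -> str:
    """Fetch account details by ID. Use when the user mentions a customer or account number."""
    return account_id


vague = describe_tool(lookup)
specific = describe_tool(get_account)
print(vague)
print(specific)
```

The first description gives a model nothing to distinguish the tool from any other lookup. The second says what the tool returns and when to call it.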
Step 2: Create the agent with LangGraph
LangGraph's create_react_agent builds the full ReAct loop — LLM node, tool node, conditional routing — in one call.
```python
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

load_dotenv()  # load OPENAI_API_KEY from the .env file created earlier

tools = [get_account, search_docs]
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

agent = create_react_agent(
    model=llm,
    tools=tools,
    prompt=(
        "You are a support research assistant. "
        "Use tools to look up facts before answering. "
        "If you cannot find information, say so; do not guess."
    ),
)
```
Run it:
```python
result = agent.invoke({
    "messages": [
        {
            "role": "user",
            "content": "Account acme-001 asked about refunds. What is their plan and what is our policy?",
        }
    ]
})

# Print the final assistant message
print(result["messages"][-1].content)
```
Expected behavior: the agent calls get_account("acme-001"), then search_docs("refund"), then synthesizes an answer citing both.
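To confirm that behavior rather than infer it from the final answer, you can walk the returned messages and list the tool calls the model made. A minimal sketch, assuming assistant messages expose a `tool_calls` list of dicts with `name` and `args` keys, as LangChain's `AIMessage` does. The helper name `tool_call_trace` is hypothetical, and the stand-in messages below replace a real agent result so the snippet runs without an API key.

```python
from types import SimpleNamespace


def tool_call_trace(messages):
    """Return (tool_name, args) pairs in the order the model requested them."""
    trace = []
    for message in messages:
        # Messages without tool calls (user, tool results) contribute nothing.
        for call in getattr(message, "tool_calls", None) or []:
            trace.append((call["name"], call["args"]))
    return trace


# Stand-in for result["messages"]; a real run returns LangChain message objects.
fake_messages = [
    SimpleNamespace(content="Account acme-001 asked about refunds."),
    SimpleNamespace(content="", tool_calls=[
        {"name": "get_account", "args": {"account_id": "acme-001"}, "id": "call_1"},
    ]),
    SimpleNamespace(content="account details"),
    SimpleNamespace(content="", tool_calls=[
        {"name": "search_docs", "args": {"query": "refund"}, "id": "call_2"},
    ]),
    SimpleNamespace(content="final answer"),
]

trace = tool_call_trace(fake_messages)
print([name for name, _ in trace])
```

With a real run, pass `result["messages"]` instead of `fake_messages`.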
Step 3: Understand the loop
Under the hood, create_react_agent compiles a graph that looks like this:
User message → Agent (LLM) → Tool calls? → Tools → Agent → … → Final answer

| Step | What happens |
|---|---|
| Agent node | LLM receives messages + tool schemas; returns text or tool calls |
| Tools node | Executes selected tools; appends results as tool messages |
| Routing | If tools were called, loop back to agent; otherwise, end |
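The routing in the table can be sketched without any framework. The version below is an illustration only, not LangGraph's implementation: `run_loop` is a hypothetical name, `scripted_model` is a stand-in for the LLM that returns canned decisions, and the two lambdas stand in for real tools.

```python
def run_loop(model, tools, user_message, max_steps=10):
    """Alternate between the model and tools until the model returns plain text."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        reply = model(messages)                  # agent node
        messages.append(reply)
        if not reply.get("tool_calls"):          # routing: no tool calls, so end
            return messages
        for call in reply["tool_calls"]:         # tools node
            output = tools[call["name"]](**call["args"])
            messages.append({"role": "tool", "name": call["name"], "content": output})
    raise RuntimeError(f"no final answer after {max_steps} steps")


def scripted_model(messages):
    """Hypothetical stand-in for an LLM: one tool call per turn, then an answer."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if len(tool_results) == 0:
        return {"role": "assistant", "content": "",
                "tool_calls": [{"name": "get_account", "args": {"account_id": "acme-001"}}]}
    if len(tool_results) == 1:
        return {"role": "assistant", "content": "",
                "tool_calls": [{"name": "search_docs", "args": {"query": "refund"}}]}
    return {"role": "assistant", "content": "Enterprise plan; refunds within 30 days."}


fake_tools = {
    "get_account": lambda account_id: f"{account_id}: Enterprise",
    "search_docs": lambda query: "Refunds within 30 days for annual plans.",
}

history = run_loop(scripted_model, fake_tools, "What is acme-001's plan and refund policy?")
print([m["role"] for m in history])
```

The `max_steps` guard plays the same role as a step limit in a real graph: a model that never stops calling tools ends in an error instead of an unbounded loop.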
When you need custom behavior — approval gates before destructive tools, parallel branches, or retrieval as a separate node — build the graph manually with StateGraph:
```python
from langgraph.graph import StateGraph, START, END, MessagesState
from langgraph.prebuilt import ToolNode, tools_condition


def agent_node(state: MessagesState):
    response = llm.bind_tools(tools).invoke(state["messages"])
    return {"messages": [response]}


builder = StateGraph(MessagesState)
builder.add_node("agent", agent_node)
builder.add_node("tools", ToolNode(tools))
builder.add_edge(START, "agent")
# tools_condition routes to "tools" when the last message has tool calls, otherwise to END
builder.add_conditional_edges("agent", tools_condition)
builder.add_edge("tools", "agent")

graph = builder.compile()
```
Use the prebuilt helper until this level of control is justified.
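One reason to drop down to StateGraph is an approval gate. The routing decision itself is a plain function over the last message, which makes it easy to unit-test before wiring it into add_conditional_edges. A minimal sketch: the tool names in `DESTRUCTIVE_TOOLS`, the function `route_after_agent`, and the node names "approval", "tools", and "end" are all hypothetical, and the messages are plain dicts rather than LangChain message objects.

```python
DESTRUCTIVE_TOOLS = {"issue_refund", "delete_account"}  # hypothetical tool names


def route_after_agent(last_message: dict) -> str:
    """Pick the next node from the tool calls on the model's last message."""
    calls = last_message.get("tool_calls") or []
    if not calls:
        return "end"        # model answered in plain text
    if any(call["name"] in DESTRUCTIVE_TOOLS for call in calls):
        return "approval"   # pause for a human before running anything
    return "tools"          # read-only tools run immediately


print(route_after_agent({"content": "Done."}))
print(route_after_agent({"tool_calls": [{"name": "search_docs", "args": {}}]}))
print(route_after_agent({"tool_calls": [{"name": "issue_refund", "args": {}}]}))
```

In a real graph the function would read the last entry of `state["messages"]` and return the names of nodes you registered on the builder.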
Step 4: Add conversation memory
Agents in product UIs need thread history. LangGraph checkpointing persists state between turns.
```python
from langgraph.checkpoint.memory import MemorySaver

memory = MemorySaver()

agent = create_react_agent(
    model=llm,
    tools=tools,
    checkpointer=memory,
)

config = {"configurable": {"thread_id": "support-ticket-8842"}}

# Turn 1
agent.invoke(
    {"messages": [("user", "Look up account acme-001")]},
    config=config,
)

# Turn 2: same thread, agent remembers context
agent.invoke(
    {"messages": [("user", "Now check refund policy for them")]},
    config=config,
)
```
In production, swap MemorySaver for a persistent checkpointer (Postgres, Redis) so state survives process restarts and works across serverless instances.
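To see what surviving a process restart means in practice, here is a framework-free sketch of a thread store keyed by thread_id and backed by SQLite from the standard library. It is not a LangGraph checkpointer and does not implement that interface; the class name `ThreadStore` and the table layout are assumptions made for illustration. LangGraph ships its own persistent checkpointers as separate packages, and those are what a real deployment would use.

```python
import json
import os
import sqlite3
import tempfile


class ThreadStore:
    """Append-only message log per thread_id, persisted to a SQLite file."""

    def __init__(self, path: str):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, thread_id TEXT, body TEXT)"
        )

    def append(self, thread_id: str, message: dict) -> None:
        self.conn.execute(
            "INSERT INTO messages (thread_id, body) VALUES (?, ?)",
            (thread_id, json.dumps(message)),
        )
        self.conn.commit()

    def history(self, thread_id: str) -> list[dict]:
        rows = self.conn.execute(
            "SELECT body FROM messages WHERE thread_id = ? ORDER BY id",
            (thread_id,),
        )
        return [json.loads(body) for (body,) in rows]

    def close(self) -> None:
        self.conn.close()


db_path = os.path.join(tempfile.mkdtemp(), "threads.db")

# "Process 1" handles the first turn, then exits.
store = ThreadStore(db_path)
store.append("support-ticket-8842", {"role": "user", "content": "Look up account acme-001"})
store.close()

# "Process 2" starts later with a fresh connection and still sees the thread.
store = ThreadStore(db_path)
store.append("support-ticket-8842", {"role": "user", "content": "Now check refund policy for them"})
restored = store.history("support-ticket-8842")
other = store.history("some-other-thread")
store.close()
print(len(restored), len(other))
```

An in-memory store would have lost the first turn when the first connection closed. Keying every read by thread_id is also what keeps one conversation from seeing another's history.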
Step 5: Expose it behind your API
Do not call the agent directly from a browser. Wrap it in server-side middleware — the same pattern as any LLM feature in your stack.
```python
# FastAPI example (conceptual)
from fastapi import FastAPI, Depends
from pydantic import BaseModel

app = FastAPI()


class ChatRequest(BaseModel):
    thread_id: str
    message: str


def get_current_user():
    """Placeholder: swap in your app's existing auth dependency."""
    raise NotImplementedError


@app.post("/api/support-agent")
async def chat(req: ChatRequest, user=Depends(get_current_user)):
    # 1. Auth: user must have support role
    # 2. Rate limit per user / tenant
    # 3. Invoke agent with thread_id scoped to tenant
    config = {"configurable": {"thread_id": f"{user.tenant_id}:{req.thread_id}"}}
    # ainvoke keeps the async handler from blocking the event loop
    result = await agent.ainvoke(
        {"messages": [("user", req.message)]},
        config=config,
    )
    return {"reply": result["messages"][-1].content}
```
Request path: Client UI (copilot, search, actions) → Your API (existing auth session) → LLM middleware (auth, rate limits, logging) → Model provider (OpenAI, Anthropic, etc.)

Every model call passes through your stack, not around it.
Key integration points:
- Auth before invoke — the agent never runs without a verified identity
- Tenant-scoped thread IDs — prevent cross-customer memory leakage
- Structured logging — log tool calls, latency, and token usage per request
- Timeouts and step limits — cap agent loops so runaway tool chains cannot burn budget
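The last two points can share one wrapper around the agent call. A minimal sketch using only the standard library: `invoke_with_budget`, the 30-second default, and the log field names are assumptions for illustration, and `fake_agent` stands in for an async agent call such as `agent.ainvoke` so the snippet runs on its own.

```python
import asyncio
import json
import logging
import time

logger = logging.getLogger("support_agent")


async def invoke_with_budget(agent_call, payload, *, user_id, timeout_s=30.0):
    """Run one agent call under a wall-clock timeout and log one JSON line."""
    started = time.perf_counter()
    status = "ok"
    try:
        return await asyncio.wait_for(agent_call(payload), timeout=timeout_s)
    except asyncio.TimeoutError:
        status = "timeout"
        raise
    except Exception:
        status = "error"
        raise
    finally:
        # Emitted on success and failure alike, so every request leaves a record.
        logger.info(json.dumps({
            "event": "agent_invoke",
            "user_id": user_id,
            "status": status,
            "latency_ms": round((time.perf_counter() - started) * 1000),
        }))


async def fake_agent(payload):
    """Hypothetical stand-in for an async agent call."""
    await asyncio.sleep(payload.get("delay", 0))
    return {"reply": "done"}


async def demo():
    fast = await invoke_with_budget(fake_agent, {"delay": 0}, user_id="u1", timeout_s=1.0)
    try:
        await invoke_with_budget(fake_agent, {"delay": 0.2}, user_id="u1", timeout_s=0.05)
        slow = "finished"
    except asyncio.TimeoutError:
        slow = "timed out"
    return fast, slow


fast_result, slow_result = asyncio.run(demo())
print(fast_result, slow_result)
```

Tool-call counts and token usage would be added to the same log line from the agent result; they are left out here because their shape depends on the provider.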
Step 6: Add guardrails
A tutorial agent is permissive. Production agents need boundaries:
| Guardrail | Implementation |
|---|---|
| Max tool iterations | recursion_limit in the config passed to invoke |
| Allowed tools per role | Filter tool list based on user permissions |
| Confirmation gates | Custom graph node that pauses before destructive tools |
| Output validation | Pydantic schema or second-pass check on structured responses |
| Refusal on missing context | System prompt + eval tests for "I don't know" cases |
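The "allowed tools per role" row is usually a small lookup done before the agent is constructed or invoked. A sketch under assumptions: the role names, the `ROLE_TOOLS` mapping, and the `issue_refund` tool are hypothetical, and plain functions stand in for @tool-decorated ones.

```python
def get_account(account_id: str) -> str:
    return f"account {account_id}"


def search_docs(query: str) -> str:
    return f"docs for {query}"


def issue_refund(account_id: str, amount: float) -> str:
    return f"refunded {amount} to {account_id}"


ALL_TOOLS = {fn.__name__: fn for fn in (get_account, search_docs, issue_refund)}

# Hypothetical role-to-tool mapping; unknown roles get no tools at all.
ROLE_TOOLS = {
    "support_viewer": {"search_docs"},
    "support_agent": {"search_docs", "get_account"},
    "support_admin": {"search_docs", "get_account", "issue_refund"},
}


def tools_for(role: str) -> list:
    """Return only the tools this role may call, in registration order."""
    allowed = ROLE_TOOLS.get(role, set())
    return [fn for name, fn in ALL_TOOLS.items() if name in allowed]


print([fn.__name__ for fn in tools_for("support_agent")])
print(tools_for("intern"))
```

The filtered list is what you would pass as the agent's tools, so the model never sees a schema for a tool the current user is not allowed to run.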
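To cap the loop, set recursion_limit in the config passed to invoke:
```python
result = agent.invoke(
    {"messages": [("user", message)]},
    # recursion_limit sits at the top level of config, next to "configurable";
    # exceeding 10 graph steps raises GraphRecursionError
    config={**config, "recursion_limit": 10},
)
```
Common mistakes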
Calling the model from the client. Tool definitions and API keys stay on the server. The UI sends messages; middleware runs the graph.
Tools that bypass authorization. If get_account ignores tenant filters, the agent becomes a prompt-injection path to data leakage.
No evals. Agent behavior is non-deterministic. Maintain a golden set of inputs and assert properties: correct tool called, citation present, refuses when data missing.
Unbounded loops. A confused model can call tools repeatedly. Set recursion limits and alert on high step counts.
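For the authorization mistake in particular, the safest shape is a tool whose tenant comes from the authenticated session rather than from anything the model can write. A minimal sketch: `make_get_account`, the `ACCOUNTS` layout, and the tenant IDs are hypothetical, and the returned function is what you would wrap with @tool.

```python
ACCOUNTS = {
    # (tenant_id, account_id) -> record; hypothetical data layout
    ("tenant-a", "acme-001"): {"name": "Acme Corp", "plan": "Enterprise"},
    ("tenant-b", "beta-002"): {"name": "Beta Labs", "plan": "Pro"},
}


def make_get_account(tenant_id: str):
    """Build a lookup bound to one tenant; the model only supplies account_id."""

    def get_account(account_id: str) -> str:
        """Fetch account details by ID for the current tenant."""
        record = ACCOUNTS.get((tenant_id, account_id))
        if record is None:
            # Same message for "missing" and "another tenant's", so existence does not leak.
            return f"No account found for {account_id}"
        return str(record)

    return get_account


get_account = make_get_account("tenant-a")
print(get_account("acme-001"))
print(get_account("beta-002"))
```

Because the tenant is captured in the closure at request time, a prompt-injected account ID from another tenant resolves to "not found" instead of another customer's data.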
LangChain in TypeScript
LangChain and LangGraph also ship JavaScript/TypeScript packages (@langchain/core, @langchain/langgraph). The concepts map directly — tools, graphs, checkpointing — which makes it straightforward to embed agents in Next.js API routes or Node services. The production constraints are identical: server-side execution, auth, rate limits, observability.
When LangChain is the right choice
LangChain/LangGraph fits when you need:
- Multi-step tool loops with explicit state
- Human-in-the-loop approval nodes
- Persistent conversation threads with checkpointing
- Complex routing (multiple agents, conditional branches)
For a single-turn Q&A with retrieval, a thinner stack — middleware + RAG + one model call — is often enough. See RAG without the platform rewrite.
For provider routing, streaming, and cost controls shared across features, extract an LLM middleware layer regardless of which orchestration library you pick.
Next steps
From here, typical progression looks like:
- Replace mock tools with calls to your staging APIs
- Add evals — 20–30 golden questions with expected tool usage
- Wire streaming — LangGraph supports streaming graph events to the UI
- Ship behind a feature flag — internal users first, measure tool failure rate and cost per session
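The evals step can start as a table of cases and a loop. A sketch under assumptions: each case lists the tools that must appear in the trace and whether the answer should be a refusal, `evaluate` expects a hypothetical adapter around your real agent that returns the final answer and the tool names called, and `fake_run_agent` replaces that adapter here so the example runs offline. The refusal check is a naive substring match, which a real suite would replace with something sturdier.

```python
GOLDEN_CASES = [
    {"question": "What plan is acme-001 on?", "expect_tools": {"get_account"}, "expect_refusal": False},
    {"question": "What is the refund policy?", "expect_tools": {"search_docs"}, "expect_refusal": False},
    {"question": "What is our office wifi password?", "expect_tools": set(), "expect_refusal": True},
]

REFUSAL_MARKERS = ("i don't know", "could not find", "no information")


def evaluate(run_agent, cases):
    """Run each case and return a list of (question, failure_reason) pairs."""
    failures = []
    for case in cases:
        answer, tools_called = run_agent(case["question"])
        missing = case["expect_tools"] - set(tools_called)
        if missing:
            failures.append((case["question"], f"missing tool calls: {sorted(missing)}"))
        refused = any(marker in answer.lower() for marker in REFUSAL_MARKERS)
        if refused != case["expect_refusal"]:
            failures.append((case["question"], f"refusal={refused}, expected {case['expect_refusal']}"))
    return failures


def fake_run_agent(question):
    """Hypothetical stand-in returning (final_answer, tool_names_called)."""
    if "plan" in question:
        return "Acme Corp is on the Enterprise plan.", ["get_account"]
    if "refund" in question:
        return "Refunds within 30 days for annual plans.", ["search_docs"]
    return "I could not find that in the available tools.", ["search_docs"]


failures = evaluate(fake_run_agent, GOLDEN_CASES)
print(failures)
```

Asserting on properties (tool used, refusal or not) rather than exact wording keeps the suite stable when the model's phrasing changes between runs.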
Building an agent is the easy part. Making it permissioned, observable, and cost-bounded inside your product is the integration work.