Build Production AI Agents in n8n (2026 Guide)
Design tool-using AI agents in n8n with memory, vector stores, and human-in-the-loop. Patterns that survive real traffic.
Key takeaways
- An agent = LLM + tools + memory + (optionally) retrieval.
- Always pin a system prompt that names the agent's job and forbids what it must never do.
- Use sub-workflow Tool nodes so each tool is independently testable.
- Add a human approval step before any irreversible action (send email, charge card, delete record).
n8n is now the shortest path from a blank canvas to a production AI agent that uses tools, remembers conversations, and queries your knowledge base. The Agent node wraps LangChain primitives in a visual graph so you can build, debug, and ship agents without writing an orchestration framework from scratch.
The four building blocks
AI Agent node — the orchestrator. Tool node — anything the agent can call (HTTP, sub-workflow, calculator, vector store). Memory node — Window Buffer for short chats, Postgres-backed for production. Vector Store node — Pinecone, Qdrant, Supabase pgvector, or Weaviate for RAG retrieval.
Pattern 1 — Support triage agent
Webhook trigger from your help-desk receives a new ticket. Agent node with three tools: search_kb (vector store over your docs), get_customer (HTTP to your CRM), draft_reply (sub-workflow that posts to a Slack approval channel). Memory: none — each ticket is independent. Result: 60–80% of tier-1 tickets drafted before a human reads them.
Pattern 2 — Sales research agent
Chat trigger. Agent with tools: search_web (Serper or Tavily), fetch_linkedin (HTTP Request), score_lead (sub-workflow with internal scoring rules), save_to_crm (HubSpot node). Memory: Postgres so the rep can pick up where they left off.
Pattern 3 — Internal data Q&A
Slack trigger on /ask. Vector store retriever tool over your internal wiki. The agent answers with citations. Add a feedback node that logs thumbs-up/down to a Postgres table — your evaluation set writes itself.
Human-in-the-loop is not optional
Any agent that takes external action should pause for human approval the first 90 days. Use the n8n Wait node with a Form trigger or a Slack interactive message. The agent proposes; the human approves; the workflow executes. This single pattern catches 95% of hallucinations before they hit a customer.
Evaluation — the missing discipline
Build a Postgres table of 50 golden test cases. Run a nightly workflow that fires each case through the agent and grades the output with a second LLM. Track accuracy over time. Without this, you cannot tell if your last prompt edit made the agent better or worse.
Frequently asked questions
- Which LLM should my n8n agent use?
- GPT-4o-class models for orchestration, smaller models for cheap tool calls. Anthropic Claude is excellent for long-context document Q&A.
- How do I stop the agent from looping?
- Set Max Iterations on the Agent node (default 10). Add a Stop and Error node downstream of any tool that should hard-fail.
- Can I run the agent on open-source models?
- Yes. Point the LLM node at an Ollama or vLLM endpoint. Quality drops vs frontier models but cost goes to near zero.