Which LLM should my n8n agent use?

GPT-4o-class models for orchestration, smaller models for cheap tool calls. Anthropic Claude is excellent for long-context document Q&A.

How do I stop the agent from looping?

Set Max Iterations on the Agent node (default 10). Add a Stop and Error node downstream of any tool that should hard-fail.

Can I run the agent on open-source models?

Yes. Point the LLM node at an Ollama or vLLM endpoint. Quality drops vs frontier models but cost goes to near zero.

AI16 min readUpdated 2026-06-29

Build Production AI Agents in n8n (2026 Guide)

Design tool-using AI agents in n8n with memory, vector stores, and human-in-the-loop. Patterns that survive real traffic.

Key takeaways

An agent = LLM + tools + memory + (optionally) retrieval.
Always pin a system prompt that names the agent's job and forbids what it must never do.
Use sub-workflow Tool nodes so each tool is independently testable.
Add a human approval step before any irreversible action (send email, charge card, delete record).

n8n is now the shortest path from a blank canvas to a production AI agent that uses tools, remembers conversations, and queries your knowledge base. The Agent node wraps LangChain primitives in a visual graph so you can build, debug, and ship agents without writing an orchestration framework from scratch.

The four building blocks

AI Agent node — the orchestrator. Tool node — anything the agent can call (HTTP, sub-workflow, calculator, vector store). Memory node — Window Buffer for short chats, Postgres-backed for production. Vector Store node — Pinecone, Qdrant, Supabase pgvector, or Weaviate for RAG retrieval.

Pattern 1 — Support triage agent

Webhook trigger from your help-desk receives a new ticket. Agent node with three tools: search_kb (vector store over your docs), get_customer (HTTP to your CRM), draft_reply (sub-workflow that posts to a Slack approval channel). Memory: none — each ticket is independent. Result: 60–80% of tier-1 tickets drafted before a human reads them.

Pattern 2 — Sales research agent

Chat trigger. Agent with tools: search_web (Serper or Tavily), fetch_linkedin (HTTP Request), score_lead (sub-workflow with internal scoring rules), save_to_crm (HubSpot node). Memory: Postgres so the rep can pick up where they left off.

Pattern 3 — Internal data Q&A

Slack trigger on /ask. Vector store retriever tool over your internal wiki. The agent answers with citations. Add a feedback node that logs thumbs-up/down to a Postgres table — your evaluation set writes itself.

Human-in-the-loop is not optional

Any agent that takes external action should pause for human approval the first 90 days. Use the n8n Wait node with a Form trigger or a Slack interactive message. The agent proposes; the human approves; the workflow executes. This single pattern catches 95% of hallucinations before they hit a customer.

Evaluation — the missing discipline

Build a Postgres table of 50 golden test cases. Run a nightly workflow that fires each case through the agent and grades the output with a second LLM. Track accuracy over time. Without this, you cannot tell if your last prompt edit made the agent better or worse.

Frequently asked questions

Which LLM should my n8n agent use?: GPT-4o-class models for orchestration, smaller models for cheap tool calls. Anthropic Claude is excellent for long-context document Q&A.
How do I stop the agent from looping?: Set Max Iterations on the Agent node (default 10). Add a Stop and Error node downstream of any tool that should hard-fail.
Can I run the agent on open-source models?: Yes. Point the LLM node at an Ollama or vLLM endpoint. Quality drops vs frontier models but cost goes to near zero.

← All posts Start the free n8n course →