When to build a custom agent vs. use ChatGPT
Off-the-shelf LLM tools like ChatGPT are remarkable, but they hit hard limits in production. We've built more than forty AI systems for clients; here's the framework we use to decide when to invest in a custom agent.
The three honest reasons to use ChatGPT
ChatGPT (or Claude, or Gemini) is the right answer more often than vendors selling custom AI want to admit. Use the off-the-shelf tool when:
- You're exploring. If you're not sure what you'd automate, don't build a system yet. Use the chat interface for a month and watch what your team actually does with it.
- The task is one-off or low-volume. Writing an email once a week? Drafting a board doc? No system needed.
- The data is general knowledge. If the task doesn't require your private context, the consumer chat works fine.
When you actually need a custom agent
The decision flips when you cross any of these thresholds:
1. The task runs more than 100 times per week
At that volume, copy-pasting into ChatGPT becomes the bottleneck. You need an API integration, structured outputs, and error handling.
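To make "API integration, structured outputs, and error handling" concrete, here is a minimal Python sketch. It is not tied to any vendor: `call_model` is a hypothetical stand-in for whatever client function sends a prompt to your model API and returns text, and the `category`/`summary` schema is an invented example, not a recommendation.

```python
import json
import time
from typing import Callable

# Hypothetical output schema: the keys every response must contain.
REQUIRED_KEYS = {"category", "summary"}


def run_task(
    prompt: str,
    call_model: Callable[[str], str],  # assumption: wraps your model API client
    max_attempts: int = 3,
    backoff_s: float = 1.0,
) -> dict:
    """Call the model, require a JSON object with the expected keys, retry on failure."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            raw = call_model(prompt)
            data = json.loads(raw)  # raises ValueError on malformed JSON
            if not isinstance(data, dict):
                raise ValueError("expected a JSON object")
            missing = REQUIRED_KEYS - set(data)
            if missing:
                raise ValueError(f"missing keys: {sorted(missing)}")
            return data
        except (ValueError, ConnectionError, TimeoutError) as err:
            last_error = err
            time.sleep(backoff_s * attempt)  # simple linear backoff
    raise RuntimeError(f"model call failed after {max_attempts} attempts") from last_error
```

The point of the sketch is the shape, not the details: every response is validated before anything downstream sees it, and a malformed answer triggers a retry instead of a silent failure. In a real system you would likely add logging and a cap on total wait time.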
2. You need your private data in the loop
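ChatGPT doesn't have your customer history, internal docs, or knowledge base. Retrieval-augmented generation (RAG), which fetches the relevant documents at query time and places them in the prompt, is the standard solution. It's not a feature you switch on; it's an architecture you build and maintain.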
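A toy Python sketch of the retrieval step shows why this is an architecture and not a setting. To stay self-contained it ranks documents by simple keyword overlap; production systems typically use embeddings and a vector store instead, so treat the scoring here as a placeholder. The function names and the prompt wording are ours, invented for illustration.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return up to k documents that share at least one token with the query.

    Placeholder scoring: count of shared tokens. A real system would usually
    compare embedding vectors here instead.
    """
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    relevant = [(score, doc) for score, doc in scored if score > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in relevant[:k]]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Assemble a prompt that puts the retrieved private context ahead of the question."""
    context = retrieve(query, documents, k)
    context_block = "\n".join(f"- {c}" for c in context) or "- (no relevant documents found)"
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}"
    )
```

Even in this toy form there are three moving parts you own: how documents are indexed, how relevance is scored, and how the context is placed in the prompt. Each one affects answer quality and each one needs maintenance.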
3. Output quality must be measurable
"Sometimes it gives weird answers" is fine for casual use. In production you need evals, fallbacks, and confidence thresholds — none of which exist in the consumer chat.
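Here is a minimal Python sketch of how a confidence threshold, a fallback, and an eval fit together. It assumes your system can attach a confidence score between 0 and 1 to each answer; how you obtain that score (a classifier, a second model call, a heuristic) is up to you and is not shown. The 0.8 threshold and the fallback string are arbitrary illustrative values.

```python
from typing import Callable

# Assumption: the model wrapper returns (answer, confidence in [0, 1]).
ScoredModel = Callable[[str], tuple[str, float]]


def answer_with_fallback(
    question: str,
    model: ScoredModel,
    threshold: float = 0.8,             # illustrative value, tune on your own data
    fallback: str = "ESCALATE_TO_HUMAN",
) -> str:
    """Return the model's answer only when its confidence clears the threshold."""
    answer, confidence = model(question)
    return answer if confidence >= threshold else fallback


def pass_rate(cases: list[tuple[str, str]], system: Callable[[str], str]) -> float:
    """Run the system over (question, expected_answer) pairs and return the share that match."""
    if not cases:
        raise ValueError("need at least one eval case")
    passed = sum(1 for question, expected in cases if system(question) == expected)
    return passed / len(cases)
```

Running `pass_rate` against a fixed set of cases before every prompt or model change is what turns "sometimes it gives weird answers" into a number you can track. Exact string matching is the simplest possible check; real evals often need fuzzier comparisons.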
A 4-question test
Before committing to a custom build, answer these:
- Does this task happen at least 50 times per week?
- Does it require data ChatGPT doesn't have?
- Does the cost of a bad output exceed $50?
- Do you need an audit trail?
Three or more "yes" answers? Build the custom agent.
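The test reduces to a small scoring rule. A minimal Python sketch, assuming "three yes answers" means three or more out of four; the `TaskProfile` name and its fields are ours, invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Hypothetical container for the four answers."""
    runs_per_week: int
    needs_private_data: bool
    bad_output_cost_usd: float
    needs_audit_trail: bool


def yes_count(task: TaskProfile) -> int:
    """Count how many of the four questions are answered 'yes'."""
    return sum([
        task.runs_per_week >= 50,        # at least 50 times per week
        task.needs_private_data,         # needs data ChatGPT doesn't have
        task.bad_output_cost_usd > 50,   # a bad output costs more than $50
        task.needs_audit_trail,          # an audit trail is required
    ])


def should_build_custom_agent(task: TaskProfile) -> bool:
    """Assumption: three or more 'yes' answers means build."""
    return yes_count(task) >= 3
```

Writing the answers down as data has a side benefit: you can score every candidate task in a backlog the same way and compare them.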
The middle ground people miss
There's a third option teams overlook: a thin custom layer on top of a hosted model API, such as the OpenAI API that serves the models behind ChatGPT. You get the model's intelligence without training or hosting a model yourself; you add only your data retrieval, your prompts, your evals, and your UI. We build this pattern more than anything else.
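In code, the thin layer is mostly plumbing. The Python sketch below is one possible shape and not a description of any particular product: `ThinAgent` and its fields are hypothetical names, and `call_model` stands in for whichever hosted model API you call. The UI is left out.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ThinAgent:
    """Hypothetical thin layer: your retrieval, prompt, and checks around a hosted model."""
    retrieve: Callable[[str], list[str]]   # your data retrieval
    call_model: Callable[[str], str]       # assumption: wraps a hosted model API
    check: Callable[[str], bool]           # your eval or guardrail on the output
    prompt_template: str = "Context:\n{context}\n\nQuestion: {question}"
    audit_log: list[dict] = field(default_factory=list)

    def answer(self, question: str) -> str:
        # 1. Pull private context, 2. build the prompt, 3. call the model,
        # 4. check the output, 5. record everything for later review.
        context = "\n".join(self.retrieve(question))
        prompt = self.prompt_template.format(context=context, question=question)
        output = self.call_model(prompt)
        passed = self.check(output)
        self.audit_log.append(
            {"question": question, "prompt": prompt, "output": output, "passed": passed}
        )
        return output if passed else "Needs human review."
```

Because the model sits behind a single callable, swapping providers or model versions touches one line, while the retrieval, prompts, checks, and audit log stay yours.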
If you're trying to make this decision for your team, we offer a free AI audit that includes a 45-minute call and a written recommendation. Get in touch.