AI Tools vs. Custom AI: When to Choose Which (the 80/40 Line)

By Zac Roberts | May 18, 2026 | 8 min read

Every week a service-business owner tells us the same thing: "I do not know if I should buy a tool or build something." The honest answer is that the decision is not nearly as hard as the AI vendor landscape makes it look. There is a single line that decides it, and you can find that line in an afternoon.

This article is the framework we use, the questions we ask before we recommend either path, and two recent engagements where the answer went different directions for reasons that should be obvious in hindsight.

The 80/40 line

Start with the workflow, never the tool. You should already have one workflow you can describe in two sentences. If you do not, go read the previous article in this series first; the rest of this one will not help you.

With the workflow in hand, the question becomes: how much of it does an existing off-the-shelf AI product cover? Not "could it cover" with enough configuration, but how much it covers out of the box, in a 60-minute evaluation, against your real data. At 80 percent or more, buy the tool. At 40 percent or less, build custom. In between, the decision takes more work.
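
One way to turn "how much does it cover" into a number is to list the steps of the workflow, note how long each takes today, and mark which ones the tool handled out of the box during the evaluation. The sketch below is one possible scoring approach, not a fixed method: weighting by minutes is an assumption, and the step names and timings are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str        # hypothetical step label
    minutes: float   # time the step takes today (illustrative)
    covered: bool    # did the tool handle it out of the box, on real data?


def coverage_percent(steps: list[Step]) -> float:
    """Time-weighted share of the workflow the tool covered, from 0 to 100."""
    total = sum(s.minutes for s in steps)
    if total <= 0:
        raise ValueError("workflow needs at least one step with time > 0")
    covered = sum(s.minutes for s in steps if s.covered)
    return 100.0 * covered / total


# Illustrative workflow; names and timings are invented for the example.
example_steps = [
    Step("collect documents", 20, True),
    Step("extract fields", 30, True),
    Step("write back to system of record", 25, False),
    Step("draft letter", 25, False),
]
print(round(coverage_percent(example_steps)))  # 50
```

Weighting by time keeps a tool from scoring well just because it covers many trivial steps while missing the one that eats the hour.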

Why the 80 line matters

At 80 percent or more, the remaining 20 percent is usually documentation, training, and a couple of small process tweaks on the operator side. Those are not engineering problems. They are change-management problems that you can solve internally for the cost of a few hours of partner-and-admin time.

Below 80 percent, the gap is almost always an integration: the tool does not write back to your CRM, does not read your file naming, does not respect the user roles your existing software cares about. Filling those gaps inside a vendor product is expensive and brittle. You end up paying the vendor for software you mostly ignore and paying a second time to glue it to your stack.

Why the 40 line matters

At 40 percent or less, you are buying a partial solution and replicating most of the workflow elsewhere anyway. The vendor markup, the integration tax, and the maintenance of two parallel systems combine into a cost that is almost always higher than a small custom build sized to your exact workflow.

A small custom build for a single workflow, for a 5-to-50-person service business, is not the six-figure project AI agencies pitch. It is two to four weeks of work on top of your existing software, written as a thin service that calls a foundation model and a handful of your own APIs. The build cost is recouped fast because the workflow it replaces was costing partner time, not platform fees.
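
Taken together, the two lines reduce to a small decision rule. The sketch below encodes only the thresholds stated above; the function name and return strings are illustrative.

```python
def recommend(coverage_percent: float) -> str:
    """Apply the 80/40 line to an out-of-the-box coverage estimate."""
    if not 0 <= coverage_percent <= 100:
        raise ValueError("coverage must be between 0 and 100")
    if coverage_percent >= 80:
        # The remaining gap is mostly documentation, training, process tweaks.
        return "buy the tool"
    if coverage_percent <= 40:
        # Most of the workflow would be rebuilt elsewhere anyway.
        return "build custom"
    # Between the lines the answer depends on further questions.
    return "middle ground"


print(recommend(85))  # buy the tool
print(recommend(35))  # build custom
print(recommend(60))  # middle ground
```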

How to actually evaluate against the 80/40 line

A 60-minute evaluation on your own data tells you almost everything. The vendor will offer a polished demo. Skip it. Ask three questions instead.

Question 1. Can the demo run on my data, in this call, right now?

If the answer is no, the integration with your stack is harder than the sales motion is telling you. The "yes, with a bit of setup" answer usually means a two-to-four-week implementation engagement on top of the license, and that engagement cost is almost never in the quote.

Question 2. Does the tool write its output back into the system where my team already works?

A tool that only displays output in its own dashboard is a second system to log into, not an automation. For a small service business, every additional dashboard is a tax on operator attention. If the answer is "we have a Zapier connector," that is fine for read-only use, but less so when the integration needs to write back to a record the operator is actively editing.

Question 3. What is the failure mode when the AI gets it wrong?

A good tool surfaces low-confidence outputs to a human, makes the override one click, and learns from the override. A bad tool quietly substitutes a wrong answer with high confidence and produces no audit trail. Ask the vendor to show you, on your data, what happens on a deliberately confusing input. Watch what the operator has to do to recover.
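
The "good tool" behavior can be sketched in a few lines: route low-confidence outputs to a person, and record every routing decision and override. This is a minimal illustration under stated assumptions, not any vendor's implementation; the 0.8 threshold, the class, and the function names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

REVIEW_THRESHOLD = 0.8  # hypothetical cutoff; tune against real data


@dataclass
class AuditLog:
    entries: list[dict] = field(default_factory=list)

    def record(self, event: str, **details) -> None:
        """Append a timestamped entry so every decision can be traced later."""
        stamp = datetime.now(timezone.utc).isoformat()
        self.entries.append({"at": stamp, "event": event, **details})


def route(prediction: str, confidence: float, log: AuditLog) -> str:
    """Return 'auto' for confident outputs and 'review' for the rest."""
    decision = "auto" if confidence >= REVIEW_THRESHOLD else "review"
    log.record("routed", prediction=prediction,
               confidence=confidence, decision=decision)
    return decision


def override(original: str, corrected: str, operator: str, log: AuditLog) -> str:
    """Record a human correction; the stored pairs can feed later tuning."""
    log.record("override", original=original,
               corrected=corrected, operator=operator)
    return corrected


log = AuditLog()
print(route("routine", 0.55, log))  # review
print(override("routine", "urgent", "dispatcher", log))  # urgent
print(len(log.entries))  # 2
```

The point to check with a vendor is whether both halves exist: the routing that puts a doubtful answer in front of a person, and the log that shows what was changed and by whom.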

Two engagements, two different answers

Engagement 1: A CPA firm. Custom won.

Four-partner firm, new-client intake taking 90 minutes per client. We evaluated three off-the-shelf intake products. All three read PDFs. None wrote structured fields back into the firm's practice-management system, which had an API but was not on any vendor's shortlist. The off-the-shelf coverage landed at roughly 35 percent of the workflow.

Below the 40 line, so we built. A hosted intake form, an LLM-driven extractor against prior-year returns and EIN letters, a write-back to the practice-management API, and the engagement letter generated by populating the firm's existing Word template. The partner still signs every letter. Intake-to-engagement-letter time dropped from 90 minutes to 15 minutes, median, across 47 intakes in the first 30 days. Build cost recouped in week 4 on the firm's own internal cost-per-hour math.
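
The time saved can be checked directly from the figures above. The sketch assumes the 15-minute median is representative of every intake, which is an approximation. The payback helper takes the build cost and the hourly cost as inputs because neither figure is given here; both function names are illustrative.

```python
def hours_saved(before_min: float, after_min: float, count: int) -> float:
    """Total hours saved, assuming each item moves from before to after."""
    return (before_min - after_min) * count / 60


def weeks_to_recoup(build_cost: float, hourly_cost: float,
                    hours_saved_per_week: float) -> float:
    """Weeks until the saved time, valued at hourly_cost, equals build_cost."""
    weekly_value = hourly_cost * hours_saved_per_week
    if weekly_value <= 0:
        raise ValueError("hourly cost and hours saved must both be positive")
    return build_cost / weekly_value


# 90 minutes down to 15 minutes across 47 intakes in the first 30 days.
print(hours_saved(90, 15, 47))  # 58.75
```

Plugging a firm's own cost-per-hour and build quote into `weeks_to_recoup` gives the payback estimate for its own numbers.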

Engagement 2: An HVAC company. Tool plus thin custom layer won.

Fourteen-truck residential HVAC, dispatch triage. The existing field-service software covered the scheduling, the customer record, and the technician routing. It did not classify inbound severity. Coverage of the underlying dispatch workflow was about 85 percent.

At 85 percent we did not rip out the field-service software. We built a thin layer in front of the dispatcher that normalized inbound requests from phone, web form, and the third-party booking widget, ran an LLM classifier on severity, system, and customer-stated preference, and wrote the suggestion into the dispatcher's existing queue view. The dispatcher accepts, overrides, or escalates. Same-day call closure went from 41 percent to 78 percent across 30 days and 612 inbound service requests. Dispatcher triage time dropped from 4 minutes to 45 seconds per call.
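
The shape of such a thin layer can be sketched as two steps: normalize each channel into one request format, then attach a suggestion for the dispatcher to act on. Everything here is hypothetical, including the payload field names and the keyword lists. The keyword matcher is only a stand-in for the LLM classifier, and customer-stated preference is omitted to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Request:
    channel: str   # "phone", "web_form", or "booking_widget"
    customer: str
    text: str


@dataclass
class Suggestion:
    severity: str
    system: str
    status: str = "pending"  # dispatcher sets accepted / overridden / escalated


# Hypothetical: each channel stores the free-text description under its own key.
TEXT_FIELD = {"phone": "transcript", "web_form": "message", "booking_widget": "notes"}


def normalize(channel: str, payload: dict) -> Request:
    """Map a channel-specific payload onto one common request shape."""
    text = payload[TEXT_FIELD[channel]].strip().lower()
    return Request(channel, payload["customer"], text)


def classify(request: Request) -> Suggestion:
    """Keyword stand-in for the LLM classifier; a model call would go here."""
    urgent_words = ("no heat", "gas smell", "leak")
    severity = "urgent" if any(w in request.text for w in urgent_words) else "routine"
    system = "furnace" if ("furnace" in request.text or "heat" in request.text) else "unknown"
    return Suggestion(severity, system)


req = normalize("web_form", {"customer": "A. Example",
                             "message": "No heat since last night, furnace clicking"})
suggestion = classify(req)
print(suggestion.severity, suggestion.system)  # urgent furnace
```

The suggestion starts as `pending` because the dispatcher, not the classifier, makes the final call.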

What about the 40-to-80 middle ground?

This is the only case where the decision is actually hard, and it is harder than the vendor or the agency will admit. In the 40-to-80 range you are weighing the integration tax of bending a tool against the maintenance tax of owning a custom build.

Three questions to break the tie:

Is the gap that keeps you below the 80 line on the vendor's roadmap for the next 6 months? Ask in writing. If yes, buy and wait. If vague, treat as no.
Does the custom build need to talk to a system that changes often? If your CRM is on a stable API, custom is fine. If your practice-management vendor reissues their API every nine months, the tool is safer.
How operator-critical is the workflow? If it touches money or legal commitments, custom gives you control over the audit trail. If it touches internal productivity, the tool is usually fine.
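
The three tie-breakers can be sketched as a simple vote. Combining them by majority is an assumption made for illustration; the questions above each give a lean, not a weighting, so treat the result as a starting point for the conversation. Parameter names are hypothetical.

```python
def tie_break(vendor_committed_in_writing: bool,
              target_api_changes_often: bool,
              touches_money_or_legal: bool) -> str:
    """Majority vote over the three tie-break questions (an assumed rule)."""
    votes = [
        # A written roadmap commitment leans toward buying and waiting.
        "tool" if vendor_committed_in_writing else "custom",
        # A frequently reissued API makes owning the integration riskier.
        "tool" if target_api_changes_often else "custom",
        # Money or legal commitments favor owning the audit trail.
        "custom" if touches_money_or_legal else "tool",
    ]
    return max(("tool", "custom"), key=votes.count)


print(tie_break(False, False, True))  # custom
print(tie_break(True, True, False))   # tool
```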

Where to start this week

Write down one workflow in two sentences. Pick the one that costs the most partner or owner time per week. Block a 60-minute evaluation with one off-the-shelf vendor on your real data and ask the three evaluation questions above. The result will usually land on one side of the 80/40 line, and the decision will make itself. If it lands in the middle, use the three tie-break questions.

If you want a second set of eyes on which side of the line you fell on, the Sensara consultation is free. We will tell you which side honestly, including the side that means we do not have an engagement to sell you.