
Most conversations about AI start with the technology. The ones that actually save money start somewhere far less exciting: a list of the boring, repetitive tasks your team does every week.
The promise of AI automation is simple: hand the dull, repetitive work to software so your people can focus on judgment, relationships and growth. But the gap between a slick demo and a tool that quietly saves a day a week is wide. This article lays out the framework we use to decide what is genuinely worth automating.
Start with the task, not the technology
The highest-ROI automation almost never comes from the most advanced model. It comes from picking the right task. The best candidates share a few traits: they are repetitive, high-volume, rule-based or language-heavy, and they are currently done by people who would rather be doing something else.
If a task is repetitive, frequent, and follows patterns, it is a candidate. If it also makes people miserable, it is a priority.
Estimate the payback in three numbers
You do not need a data science team to size the opportunity. You need three numbers:
The ROI back-of-envelope
- Time per task: how many minutes each instance takes today.
- Frequency: how many times it happens per week or month.
- Loaded cost: the fully loaded hourly cost of the people doing it.
Multiply them out, converting minutes to hours and the weekly or monthly frequency to a yearly count, and you have the annual cost of the task. That figure is the ceiling on what automating it can save, and a sober anchor for any build decision. If the number is small, move on. If it is large and the task is well-defined, you have found a project.
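As a rough illustration, the three numbers can be combined in a few lines of Python. This is a minimal sketch, not a formal model: the function name `annual_task_cost`, the default of 48 working weeks per year, and the sample figures are all assumptions chosen for the example, so substitute your own.

```python
def annual_task_cost(
    minutes_per_task: float,
    tasks_per_week: float,
    loaded_hourly_cost: float,
    working_weeks_per_year: int = 48,  # assumption: adjust for your holidays and leave
) -> float:
    """Annual cost of a manual task: the ceiling on what automating it can save."""
    # Convert minutes per task into hours spent on the task each week.
    hours_per_week = minutes_per_task * tasks_per_week / 60
    # Scale to a year and price those hours at the fully loaded rate.
    return hours_per_week * working_weeks_per_year * loaded_hourly_cost


# Hypothetical figures: 12 minutes per task, 150 tasks a week, 45 per hour loaded.
cost = annual_task_cost(minutes_per_task=12, tasks_per_week=150, loaded_hourly_cost=45.0)
print(f"Annual cost ceiling: {cost:,.0f}")  # prints "Annual cost ceiling: 64,800"
```

With these made-up inputs the task consumes 30 hours a week, which is why the annual figure is large enough to justify a closer look. If your frequency is monthly, divide it by roughly 4.3 to get a weekly figure before passing it in.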
Pick one task, prove it, then expand
The most common mistake is trying to automate everything at once. We start with a single, painful, well-scoped task (invoice processing, support drafting or lead triage, for example) and ship it in weeks. A working result on one task builds trust, produces real numbers, and reveals the next opportunity far better than any roadmap.
What "done" looks like
Done is not "the model works in a demo." Done is the automation running in production, integrated with your existing tools, with a human reviewing the edge cases and an audit trail of every decision. Accuracy is measured against a real test set before launch, and monitored after.
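To make that definition concrete, here is one way the pieces could fit together in Python. It is a sketch under stated assumptions rather than a description of any particular system: the `Prediction` type, the confidence score, the 0.9 review threshold, and the invoice data are all hypothetical, and a real audit trail would be written to durable storage rather than kept in a list.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    item_id: str
    label: str         # what the automation decided
    confidence: float  # hypothetical score in [0, 1] reported by the automation


def accuracy(predictions: list[Prediction], expected: dict[str, str]) -> float:
    """Share of predictions that match the hand-labelled test set."""
    if not predictions:
        raise ValueError("test set is empty")
    correct = sum(1 for p in predictions if expected[p.item_id] == p.label)
    return correct / len(predictions)


def route(prediction: Prediction, review_threshold: float = 0.9) -> dict:
    """Build an audit record that says who handles this item."""
    handler = "auto" if prediction.confidence >= review_threshold else "human_review"
    return {
        "item_id": prediction.item_id,
        "label": prediction.label,
        "confidence": prediction.confidence,
        "handler": handler,
    }


# Hypothetical hand-labelled test set and the automation's output for it.
test_set = {"inv-1": "approve", "inv-2": "reject", "inv-3": "approve", "inv-4": "approve"}
predictions = [
    Prediction("inv-1", "approve", 0.98),
    Prediction("inv-2", "reject", 0.95),
    Prediction("inv-3", "reject", 0.55),
    Prediction("inv-4", "approve", 0.71),
]

audit_trail = [route(p) for p in predictions]  # one record per decision
print(f"Accuracy on test set: {accuracy(predictions, test_set):.0%}")
print([record["handler"] for record in audit_trail])
```

In this toy run the two low-confidence items are sent to a person, and one of them is the item the automation got wrong, which is the behaviour you want the review step to show before launch.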
Where to go from here
If you can name one task that eats hours of your team’s week, you already have a candidate. The next step is to size it honestly and prototype the smallest version that proves the payback. That is exactly what our two-week Discovery Sprint is built to do.


