Key Takeaways: Anomaly detection on ERP billing data is not about replacing human review — it’s about directing attention to where human judgment actually matters. A duplicate vendor bill with a slightly different reference is nearly invisible at volume; a pattern-matching model catches it in milliseconds. The first catch is gratifying. What changes afterward is more significant: the finance team stops managing their data defensively and starts trusting it. False positives are part of the calibration process, not a sign the system isn’t working.
The Invoice Volume Problem Nobody Talks About
A regional distribution company in Ho Chi Minh City was processing roughly 350 vendor invoices per month. Not a staggering number. Their finance team of four handled it alongside month-end close, customer billing, bank reconciliation, and the usual fire-fighting.
The AP process was manual: invoices arrived by email, got entered into Odoo, matched to purchase orders where possible, and queued for approval. Each account.move record went through at least two sets of eyes before payment. The team was diligent. They were not, by any reasonable measure, careless.
And yet a $40,000 duplicate vendor bill cleared their review process. Twice.
What the Anomaly Detector Monitors
The anomaly detection setup runs as a scheduled job against account.move and account.move.line records in Odoo Accounting. It’s not a single algorithm — it’s a set of pattern checks, each tuned to a specific class of billing error.
The three monitors that matter most in practice:
Duplicate bill detection. Compares incoming vendor bills against historical records using a combination of vendor ID, invoice total, invoice date window (±14 days), and reference string similarity. The last piece (similarity rather than exact match) is what most manual duplicate checks miss. A vendor who sends the same invoice twice but changes INV-2025-0881 to INV-2025-0881-R will sail through a pure equality check. The model catches it.
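The duplicate check can be sketched in a few lines. This is a minimal illustration, not the production code: the dict field names, the 0.85 similarity threshold, and the use of difflib are assumptions chosen to mirror the description above.

```python
from datetime import date
from difflib import SequenceMatcher

def is_probable_duplicate(new, prior, date_window_days=14, sim_threshold=0.85):
    """Return True when `new` looks like a duplicate of `prior`.

    Both arguments are plain dicts standing in for account.move fields;
    the field names and the 0.85 threshold are illustrative assumptions.
    """
    if new["vendor_id"] != prior["vendor_id"]:
        return False
    if new["amount_total"] != prior["amount_total"]:
        return False
    if abs((new["invoice_date"] - prior["invoice_date"]).days) > date_window_days:
        return False
    # Fuzzy reference comparison catches reissue suffixes like "-R" or "-A"
    # that defeat a pure equality check.
    similarity = SequenceMatcher(None, new["ref"], prior["ref"]).ratio()
    return similarity >= sim_threshold

original = {"vendor_id": 7, "amount_total": 920_000_000,
            "invoice_date": date(2025, 10, 1), "ref": "INV-2025-0881"}
reissue = {**original, "invoice_date": date(2025, 10, 8), "ref": "INV-2025-0881-R"}
print(is_probable_duplicate(reissue, original))  # True
```

The point of the fuzzy ratio is that INV-2025-0881 and INV-2025-0881-R still score well above 0.85, so the reissue is flagged even though the strings are not equal.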
Unusual amount detection. For each vendor, the model builds a baseline from the last 12 months of account.move.line entries — amounts by product category, typical ranges, seasonal patterns. Bills that fall outside two standard deviations from baseline get flagged, not rejected. A new vendor with no history doesn’t trigger this check; the system requires at least six prior invoices before it forms expectations.
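The core of the amount check is a standard-deviation test over a vendor's history. The sketch below is deliberately simplified, assuming a flat list of prior amounts per vendor; the real baseline described above also segments by product category and season.

```python
from statistics import mean, stdev

def amount_anomaly(history, new_amount, min_history=6, k=2.0):
    """Flag a bill amount that falls outside k standard deviations of a
    vendor's baseline. `history` is a list of prior invoice amounts for
    the same vendor; parameter names are illustrative assumptions.
    """
    if len(history) < min_history:
        return None  # too little history to form expectations; skip the check
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return new_amount != mu  # perfectly regular vendor: any change is unusual
    return abs(new_amount - mu) > k * sigma

baseline = [100.0, 110.0, 95.0, 105.0, 98.0, 102.0]
print(amount_anomaly(baseline, 300.0))  # True: far outside two standard deviations
print(amount_anomaly(baseline, 105.0))  # False: within the normal range
```

Returning None rather than False for new vendors matters in practice: "no opinion" and "looks fine" are different signals, and conflating them hides the coverage gap for vendors with thin history.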
Recurring invoice pattern breaks. Some vendors bill predictably: same amount, same day, every month. When that pattern breaks — the amount changes by more than 15%, the invoice arrives late, or a monthly vendor suddenly sends two invoices in one week — the system flags it for review. This catches both honest errors and cases where a vendor relationship changed without the finance team being told.
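The three pattern-break conditions above can be expressed as a small rule check. The tuple shapes and the 30-day cadence assumption are illustrative; as noted later, these thresholds need per-vendor tuning before the flags are trustworthy.

```python
from datetime import date

def recurring_pattern_breaks(prev, curr, amount_tol=0.15, cadence_days=30, day_tol=5):
    """List the reasons the latest bill breaks a vendor's monthly pattern.

    `prev` and `curr` are (invoice_date, amount) tuples for the two most
    recent bills from a monthly vendor. Thresholds mirror the ones in the
    text but are starting points, not tuned values.
    """
    (prev_date, prev_amount), (curr_date, curr_amount) = prev, curr
    reasons = []
    if prev_amount and abs(curr_amount - prev_amount) / prev_amount > amount_tol:
        reasons.append("amount changed by more than 15%")
    gap = (curr_date - prev_date).days
    if gap < 7:
        reasons.append("second invoice within one week of the last")
    elif abs(gap - cadence_days) > day_tol:
        reasons.append("invoice arrived off the usual monthly cadence")
    return reasons

prev_bill = (date(2025, 1, 5), 1_000_000)
doubled = (date(2025, 2, 5), 2_000_000)
print(recurring_pattern_breaks(prev_bill, doubled))  # ['amount changed by more than 15%']
```

Returning a list of reasons rather than a single boolean keeps the dashboard entry informative: a reviewer seeing "second invoice within one week" starts from a very different hypothesis than one seeing a cadence drift.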
Each flag generates an entry in a custom Odoo dashboard widget. Nothing gets auto-rejected. The team reviews each flag and decides what to do.
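Stitched together, the scheduled pass is a loop over new bills and registered checks that collects flags for the dashboard. The check functions and the flag shape below are hypothetical stand-ins; in a live Odoo deployment this would run from a scheduled action (ir.cron) querying account.move rather than over plain dicts.

```python
def run_billing_checks(invoices, checks):
    """Run each named pattern check over new invoices and collect flags.

    `checks` maps a check name to a function returning a reason string
    (or None). Flags are collected for human review only; nothing is
    auto-rejected. The dict shapes are illustrative assumptions.
    """
    flags = []
    for invoice in invoices:
        for name, check in checks.items():
            reason = check(invoice)
            if reason:
                flags.append({"invoice": invoice["ref"], "check": name, "reason": reason})
    return flags

checks = {
    # Hypothetical threshold check standing in for the three monitors above.
    "unusual_amount": lambda inv: "over limit" if inv["amount_total"] > 100 else None,
}
new_bills = [{"ref": "FLV-1", "amount_total": 50}, {"ref": "FLV-2", "amount_total": 200}]
print(run_billing_checks(new_bills, checks))  # one flag, for FLV-2
```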
The Detection: Week One
Seven days after go-live, the duplicate detection model flagged an account.move record from a freight logistics vendor. The invoice total was 920 million VND — roughly $40,000 USD at the time. The reference number was FLV-2025-1104-A.
The model had matched it against an invoice posted three weeks earlier: FLV-2025-1104, same vendor, same amount, same line items. The -A suffix was the only difference.
The finance team pulled both records. The invoices were identical in every meaningful field — vendor, total, currency, line items, bank account. The vendor had apparently reissued the invoice with a corrected reference (the original had a formatting issue on their end) without canceling the original. Both had been entered into Odoo. Both had passed the two-person review.
The duplicate hadn’t been paid yet — it was queued for the next payment run. The team removed it, contacted the vendor to confirm, and closed the case. Forty minutes of work to recover $40,000.
Why Manual Review Missed It
This is the part worth understanding, because the answer is not “the team wasn’t careful enough.”
At 350 invoices per month, the AP reviewers are processing roughly 17 invoices per working day. Each one requires matching against a PO, checking line items, and confirming vendor details. The cognitive load is real. By invoice 15 of a given day, pattern detection degrades.
More specifically: the two invoices arrived three weeks apart. The reviewer who approved the second one was checking it against the PO and the vendor’s bank details — both of which matched. They weren’t running a mental search across three weeks of transaction history looking for a reference string that differed by two characters. That’s not a process failure. It’s just human.
The -A suffix made it worse. Reference strings with letter suffixes are common for legitimate reissues, credit notes, and corrected invoices. A reviewer who noticed the similarity could reasonably have assumed it was intentional. The model doesn’t assume — it flags and asks.
There’s also a visual similarity problem that doesn’t get discussed enough. Two vendor invoices from the same supplier look nearly identical: same logo, same format, same table structure. The eyes skip to the reference number and the total. If both look plausible, approval happens fast. The model ignores the visual layer entirely and works from structured data in account.move — which is why it catches what visual review misses.
What Changed After the First Month
The $40,000 catch got attention internally. But the more lasting change was subtler.
The finance team stopped auditing historical invoices defensively. Before deployment, there was a background anxiety about what might have slipped through — an overpayment, a missed credit note, a duplicate from six months ago that nobody had noticed. After the anomaly detector had been running for 30 days with a low false-positive rate (three flags raised, two genuine issues found, one resolved as intentional and expected), the team’s relationship with their data shifted.
They started using the weekly “clean run” reports — invoice volume processed, flags raised, flags resolved — in their monthly review with the CFO. Not as a compliance artifact. As a signal that the AP process was under control.
The unusual amount detection caught one more issue in month two: a vendor had billed for a line item at a price 40% above their contracted rate. Not fraud — a pricing table error on the vendor’s side. The AP team caught it before payment and received a corrected invoice within 24 hours.
The recurring pattern monitor has generated mostly noise so far, which is not unusual. It takes time to tune thresholds to a specific vendor mix. One legitimate flag did come through: a monthly SaaS subscription that doubled in price without any advance notice to the team. Worth catching.
Limits Worth Naming
Anomaly detection catches deviations from known patterns. It does not catch what it has no baseline for.
A first-time fraudulent vendor looks identical to a legitimate new vendor until enough history accumulates. A well-constructed invoice for services never rendered won’t trigger a duplicate flag if the reference number is unique and the amount is plausible. The model is not a fraud detection system — it’s a billing pattern monitor.
The setup is also only as good as the data in account.move. Invoices processed outside Odoo — paid by corporate card, entered late, or created in a parallel system — are invisible to it. Incomplete AP centralization means incomplete coverage, and it’s worth being honest about where those gaps are before deploying.
The false-positive rate matters more than it might seem. If the system flags 20 invoices a week and 18 turn out to be fine, the team will stop trusting the flags within a month. Threshold calibration takes time and requires someone willing to review the early false positives carefully rather than dismissing them as noise. That calibration work is part of the project, not an afterthought.
None of these are reasons to skip it. They’re reasons to go in with clear expectations.
Key Takeaways
- Duplicate detection based on reference string similarity catches errors that exact-match logic misses — the letter-suffix problem (INV-0881 vs INV-0881-A) is more common than most AP teams realize.
- Manual review at volume degrades predictably. Human working memory doesn’t hold three weeks of invoice history while processing invoice 15 of the day. This is expected behavior, not negligence.
- The financial return from a single catch can cover the entire implementation cost. The lasting value is the shift in how the finance team relates to their own data.
- Anomaly detection directs human judgment to the cases that need it — it doesn’t replace the judgment.
- Budget time for calibration. The first 30–60 days will surface false positives that help tune the thresholds. That’s the process working correctly.
At Trobz, we’ve deployed this kind of anomaly monitoring on top of existing Odoo Accounting installations — no module replacement, no data migration. If you’re running 200+ vendor invoices a month and wondering what might be in there, reach out and we can walk through what the setup looks like for your vendor mix.