Key Takeaways: Most PoC failures trace back to misaligned expectations, not model quality. A one-page brief forces the hard conversations before any code is written. The brief has seven sections: problem statement, baseline, hypothesis, data requirements, success threshold, failure condition, and timeline. Each section is a question that, left unanswered, will derail the project. The template at the bottom of this post is the format we use.
Every PoC starts with enthusiasm. Two weeks in, the demo looks good. Six weeks in, the stakeholder asks, “But is this actually better than what we have now?” — and nobody knows, because nobody measured what they had before.
That’s the PoC graveyard. Not bad models. Not bad engineers. Bad setup.
The fix is not a longer kickoff meeting. It’s a single document, written before the first line of code, that forces everyone to agree on the same four things: what problem we’re solving, how we’re measuring it today, what success looks like, and when we stop.
We call it the PoC brief. It fits on one page. It takes two hours to write and two days to agree on — and that’s exactly the point. The two days of friction up front are worth it.
The Seven Sections
1. Problem Statement
One sentence. No adjectives. No phrases like “suboptimal” or “challenging.” Just the mechanism of failure.
Accounts payable manually re-keys invoice data from PDF attachments into Odoo account.move records, taking an average of 4 minutes per invoice at a volume of 200 invoices per day.
That sentence is specific enough to design a solution against. “We need to automate invoice processing to improve efficiency” is not.
The test: could someone build a completely wrong solution that still satisfies your problem statement? If yes, the statement is too vague.
2. Current Baseline
Before you claim the model is better, you need to know what “better” means in numbers. This section captures the current state — measured, not estimated.
| Metric | Current value | Source |
|---|---|---|
| Manual processing time per invoice | 4.2 min | AP team time log, 2 weeks |
| Error rate (field-level mismatches vs. source PDF) | 3.1% | Audit sample, 50 invoices |
| Daily volume | 180–220 invoices | account.move creation timestamps, Q1 |
If you don’t have these numbers, getting them is part of the PoC — which changes the timeline. Don’t pretend you have a baseline when you’re estimating.
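If measuring the baseline is itself part of the work, much of it can come straight out of Odoo. Here is a minimal sketch of how the daily-volume row above could be derived from account.move creation timestamps via Odoo's standard XML-RPC external API. The URL, database, credentials, and date range are placeholders, not real values.

```python
# Minimal sketch: daily vendor-invoice volume from account.move
# creation timestamps, via Odoo's XML-RPC external API.
# URL, database, credentials, and dates below are placeholders.
import xmlrpc.client
from collections import Counter

URL, DB = "https://odoo.example.com", "production"
USER, PASSWORD = "readonly@example.com", "secret"

common = xmlrpc.client.ServerProxy(f"{URL}/xmlrpc/2/common")
uid = common.authenticate(DB, USER, PASSWORD, {})
models = xmlrpc.client.ServerProxy(f"{URL}/xmlrpc/2/object")

# Vendor bills only; move_type filters out customer invoices.
moves = models.execute_kw(
    DB, uid, PASSWORD, "account.move", "search_read",
    [[["move_type", "=", "in_invoice"],
      ["create_date", ">=", "2024-01-01"],
      ["create_date", "<", "2024-04-01"]]],
    {"fields": ["create_date"]},
)

# Count records per calendar day to get the volume distribution.
per_day = Counter(m["create_date"][:10] for m in moves)
counts = sorted(per_day.values())
print(f"days: {len(counts)}, min/day: {counts[0]}, max/day: {counts[-1]}")
```

A script like this is also reusable evidence: the "Source" column of the baseline table can point to it instead of to a one-off estimate.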
3. Hypothesis
The hypothesis is an if-then statement with a number attached.
If we build a document AI pipeline that extracts structured fields from incoming vendor PDFs and auto-populates account.move draft records, we expect to reduce manual entry time by 70% and field-level error rate to below 1%, measurable on a 200-invoice test set.
Every word here is doing work. “70%” is a target, not a vague improvement. “200-invoice test set” is a measurement plan, not a hope. “Field-level error rate” is a definition — not “accuracy” or “quality,” which mean different things to different people.
If you can’t write the hypothesis with a number and a measurement method, the PoC isn’t ready to start.
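One way to remove the ambiguity for good is to pin the definition down in code before the PoC starts. A minimal sketch of what "field-level error rate" could mean here, with illustrative field names:

```python
# Field-level error rate: mismatched fields / total compared fields,
# aggregated over the whole test set. Field names are illustrative.
FIELDS = ["vendor", "invoice_date", "total_amount", "currency"]

def field_error_rate(extracted: list[dict], ground_truth: list[dict]) -> float:
    """Compare each extracted record to its ground-truth record, field by field."""
    mismatches = total = 0
    for ext, truth in zip(extracted, ground_truth, strict=True):
        for field in FIELDS:
            total += 1
            if ext.get(field) != truth.get(field):
                mismatches += 1
    return mismatches / total

# One wrong amount out of 8 compared fields -> 12.5%
ext = [{"vendor": "Acme", "invoice_date": "2024-03-01", "total_amount": 100.0, "currency": "EUR"},
       {"vendor": "Globex", "invoice_date": "2024-03-02", "total_amount": 250.0, "currency": "EUR"}]
truth = [{"vendor": "Acme", "invoice_date": "2024-03-01", "total_amount": 100.0, "currency": "EUR"},
         {"vendor": "Globex", "invoice_date": "2024-03-02", "total_amount": 255.0, "currency": "EUR"}]
print(f"{field_error_rate(ext, truth):.1%}")  # 12.5%
```

Twenty lines of code settle an argument that could otherwise run for the whole six weeks.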
4. Data Requirements
This section kills more PoCs than any other. Write down exactly what you need, who owns it, and how you’ll get access.
| Data | Owner | Format | Access method | Availability confirmed |
|---|---|---|---|---|
| Historical vendor PDFs (6 months) | AP team | Files in /shared/invoices/ | Direct read access | ✅ |
| Ground-truth extracted fields | AP team | Manual export from Odoo | CSV via account.move.line query | ❌ — needs manual annotation |
| Vendor master (for supplier name matching) | Procurement | Odoo res.partner | JSON-RPC API | ✅ |
The “Availability confirmed” column is the one to watch. “We can probably get it” is not confirmed. If critical data is unconfirmed at kickoff, note that explicitly and flag it as a blocker.
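The confirmation itself can be a script run on Day 2 rather than a promise in a meeting. Here is a sketch of what that check could look like for the vendor master row above, using Odoo's JSON-RPC endpoint; the URL and credentials are placeholders, and supplier_rank is the field recent Odoo versions use to mark a partner as a vendor.

```python
# Day-2 access check: can we actually read the vendor master?
# Uses Odoo's /jsonrpc endpoint; URL and credentials are placeholders.
import requests

URL, DB = "https://odoo.example.com", "production"
USER, PASSWORD = "readonly@example.com", "secret"

def rpc(service: str, method: str, args: list):
    payload = {"jsonrpc": "2.0", "method": "call",
               "params": {"service": service, "method": method, "args": args},
               "id": 1}
    resp = requests.post(f"{URL}/jsonrpc", json=payload, timeout=10).json()
    if "error" in resp:
        raise RuntimeError(resp["error"])
    return resp["result"]

uid = rpc("common", "login", [DB, USER, PASSWORD])
# Suppliers only: supplier_rank > 0 marks a partner as a vendor.
vendors = rpc("object", "execute_kw",
              [DB, uid, PASSWORD, "res.partner", "search_count",
               [[["supplier_rank", ">", 0]]]])
print(f"access confirmed: {vendors} vendor records visible")
```

If this script fails on Day 2, the brief's timeline is already wrong, and it's better to know then.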
5. Success Threshold
This is the number at which you say, “Build it.”
If the pipeline achieves ≥70% reduction in manual processing time and ≤1% field-level error rate on the 200-invoice test set, we recommend proceeding to production scoping.
The threshold has to be set before you see the results. If you set it after, you’ll move it. Humans are very good at finding reasons why 58% is close enough when they’ve already built the thing.
Also worth deciding: who has authority to sign off on the threshold? If it’s the CFO, they need to agree to this number now, not after the PoC when they’re being asked to fund the next phase.
6. Failure Condition
This is the number at which you say, “Stop.”
If the pipeline achieves <50% reduction in processing time or >2% field-level error rate on the test set, we do not recommend proceeding. We will document the failure modes and reassess whether a different approach is warranted.
Most PoC briefs don’t have this section. That’s why PoCs rarely get cancelled — because there was never a defined condition under which they should be. Instead, they get extended, rescoped, and quietly buried.
Defining the failure condition is not pessimism. It’s what makes the success threshold meaningful.
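Written side by side, the two thresholds also expose a gray zone worth acknowledging in the brief: with success at a 70% reduction or better and failure below 50%, a result between the two is neither "build it" nor "stop". A sketch of the decision gate, using the numbers from this brief, makes that explicit:

```python
# Go/no-go gate combining sections 5 and 6. Note the gray zone:
# a result between the failure and success thresholds is neither
# "build" nor "stop" -- it needs an explicit stakeholder call.
def decide(time_reduction: float, error_rate: float) -> str:
    if time_reduction >= 0.70 and error_rate <= 0.01:
        return "build: recommend production scoping"
    if time_reduction < 0.50 or error_rate > 0.02:
        return "stop: document failure modes, reassess approach"
    return "gray zone: stakeholder judgment call, per the brief"

print(decide(0.72, 0.008))  # build
print(decide(0.58, 0.015))  # gray zone -- the "58% is close enough" trap
print(decide(0.45, 0.031))  # stop
```

Deciding in advance who owns the gray-zone call is cheaper than discovering the gap at the go/no-go meeting.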
7. Timeline
| Milestone | Date |
|---|---|
| Data access confirmed | Day 2 |
| Initial pipeline running on sample invoices | Day 5 |
| Test set evaluation complete | Day 10 |
| Results presentation to stakeholders | Day 12 |
| Go/no-go decision | Day 14 |
Two weeks is our default. It’s short enough to maintain focus and long enough to surface real failure modes. If someone says the PoC needs six weeks to do properly, that’s usually a scoping problem — the PoC is too broad.
A Filled-Out Example: Invoice Classification PoC
Here’s the full brief, completed, for an invoice classification PoC at a distribution company.
Problem statement: The AP team manually categorizes 200+ vendor invoices per day into one of 14 expense categories before posting to account.move. Mis-categorization rate is 6.2%, discovered at month-end reconciliation.
Baseline: 2.1 minutes per invoice for categorization (measured over 3 weeks). 6.2% mis-categorization rate (audit of 500 invoices, Q4). Reconciliation corrections: average 12 per month.
Hypothesis: If we build a classifier that assigns expense category from vendor name, line item descriptions, and amount, we expect to reduce mis-categorization to below 2% on a held-out test set of 300 invoices, and reduce per-invoice processing time by 60%.
Data requirements: 12 months of account.move records with vendor, line items, and posted category (confirmed via Odoo query); vendor master from res.partner (confirmed); a sample of 50 mis-categorized invoices with correct category annotated by AP team (not yet confirmed — AP manager to provide by Day 2).
Success threshold: ≤2% mis-categorization rate and ≥60% time reduction on test set. CFO sign-off required.
Failure condition: >4% mis-categorization rate or <40% time reduction. If either condition is met, we recommend a rules-based pre-classification approach instead and present that option at Day 12.
Timeline: 14 days. Data confirmed by Day 2, classifier running by Day 6, test set evaluation Day 10, stakeholder presentation Day 12.
The brief took 90 minutes to draft and one meeting to agree on. The AP manager pushed back on the 14-day timeline (she wanted 3 weeks “to be safe”). The CFO pushed back on the success threshold (he wanted ≤1% mis-categorization, not ≤2%). Both conversations happened before the PoC started — which is exactly when they should happen.
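The brief deliberately says nothing about the modeling approach, and the hypothesis doesn't require anything exotic. For illustration only, here is a minimal sketch of the kind of classifier it describes, using scikit-learn; the column names and the CSV export are assumptions, not the actual implementation.

```python
# Minimal sketch of the classifier the hypothesis describes:
# expense category from vendor name, line descriptions, and amount.
# Column names are illustrative; scikit-learn is one reasonable choice.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# One row per invoice, exported from account.move.
df = pd.read_csv("invoices.csv")  # columns: vendor, line_text, amount, category

X, y = df[["vendor", "line_text", "amount"]], df["category"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=300, stratify=y, random_state=0)  # 300-invoice held-out set

model = Pipeline([
    ("features", ColumnTransformer([
        ("vendor", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)), "vendor"),
        ("lines", TfidfVectorizer(), "line_text"),
        ("amount", StandardScaler(), ["amount"]),
    ])),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)

# The number the brief cares about: mis-categorization rate on the held-out set.
error_rate = 1 - model.score(X_test, y_test)
print(f"mis-categorization rate: {error_rate:.1%}  (threshold: <=2%)")
```

Whether this simple baseline clears 2%, or the failure condition's rules-based fallback wins instead, is exactly what the 14 days are for.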
The Template (Copy It)
## PoC Brief: [Project Name]
**Problem statement:** [One sentence. What fails, how often, at what cost.]
**Current baseline:**
| Metric | Current value | Source |
|--------|--------------|--------|
| [metric 1] | [value] | [source] |
| [metric 2] | [value] | [source] |
**Hypothesis:** If we build [X], we expect [Y], measured by [Z] on [test set].
**Data requirements:**
| Data | Owner | Format | Access | Confirmed |
|------|-------|--------|--------|-----------|
| [data 1] | [owner] | [format] | [method] | ✅/❌ |
**Success threshold:** [The number at which we say "build it." Who signs off.]
**Failure condition:** [The number at which we say "stop." What happens next.]
**Timeline:**
| Milestone | Day |
|-----------|-----|
| Data confirmed | |
| Working prototype | |
| Evaluation complete | |
| Stakeholder presentation | |
| Go/no-go decision | |
Why Most PoCs Skip This
The honest answer: the brief forces a conversation that nobody wants to have before the project starts. The stakeholder doesn’t want to commit to a number they might not hit. The engineer doesn’t want to be held to a timeline they haven’t scoped. The project manager doesn’t want to slow down momentum.
So instead, everyone agrees to start, and the hard questions get answered — under pressure, with partial information — six weeks later.
The brief doesn’t make the PoC easier. It makes the PoC honest. That’s not the same thing, and it’s worth the friction.
For more on what happens when the brief isn’t written — and what a well-run two-week sprint looks like in practice — see From Slide Deck to Running System in 14 Days.
At Trobz, the brief above is the first document we write on every AI engagement — before we touch the data, before we set up the environment. If you’re starting a PoC and want to pressure-test your scoping, we’re happy to take a look.