How we measure Kauzio's accuracy, per sector
Every Kauzio installation runs a nightly job that scores its own calibration, per sector. We publish success rate, average delta and a 0 to 100 calibration score. Here is what each number means and why we put it on the dashboard.

Jaswant Singh
Co-Founder & CEO, Kauzio
Most AI tools that touch business decisions tell you what they think. Very few tell you how often they were right last quarter.
We think that is the wrong way round. If a system is helping you set prices, sign off hires, or move a credit threshold, the first question a serious buyer asks is simple. How often does it land where it said it would land?
So every Kauzio installation runs a nightly calibration job. It does not look at flattering averages. It compares every closed decision against the outcome that actually happened, sliced by sector, and writes the result to a page you can open at any time.
What gets measured
Three numbers, computed per sector, refreshed every night.
The first is success rate: of all the decisions Kauzio scored in this sector over the last 90 days, the share that landed inside the expected range. Not close. Inside. A pricing call that predicted a 4 to 7 percent lift and delivered 5.2 percent counts as a success. One that predicted the same range and delivered 1.1 percent does not.
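As a minimal sketch of the success-rate rule, assume each closed decision is stored with a predicted range and an actual outcome. The `Decision` record and its field names are hypothetical, not Kauzio's schema, and treating the range endpoints as inside is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    # Hypothetical record; field names are illustrative, not Kauzio's schema.
    low: float     # lower bound of the predicted range, in percent
    high: float    # upper bound of the predicted range, in percent
    actual: float  # outcome that actually happened, in percent


def is_success(d: Decision) -> bool:
    # Assumption: an outcome exactly on a range endpoint counts as inside.
    return d.low <= d.actual <= d.high


def success_rate(decisions: list[Decision]) -> float:
    """Share of decisions whose outcome landed inside the predicted range."""
    if not decisions:
        raise ValueError("no closed decisions in the window")
    return sum(is_success(d) for d in decisions) / len(decisions)


# The two pricing calls from the text: same 4 to 7 percent range, different outcomes.
calls = [Decision(4.0, 7.0, 5.2), Decision(4.0, 7.0, 1.1)]
print(success_rate(calls))  # 0.5
```

One call inside the range and one outside gives a success rate of 0.5, or 50 percent.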
The second is average delta. For every decision in the window, we take the gap between predicted outcome and actual outcome, expressed as a percentage of the predicted value. Then we average the absolute gaps. A small number means the predictions sit close to reality. A large number means they drift.
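A short sketch of that calculation, assuming each decision carries a single predicted value and a single actual value. The function name, the input shape and the sample figures are illustrative, and rejecting a zero prediction is an assumption about an edge case the text does not cover.

```python
def average_delta(pairs: list[tuple[float, float]]) -> float:
    """Mean absolute gap between predicted and actual, as a percent of predicted.

    `pairs` holds (predicted, actual) tuples for every decision in the window.
    """
    if not pairs:
        raise ValueError("no closed decisions in the window")
    gaps = []
    for predicted, actual in pairs:
        if predicted == 0:
            # Assumption: a gap cannot be expressed as a percentage of zero.
            raise ValueError("predicted value must be non-zero")
        gaps.append(abs(actual - predicted) * 100 / abs(predicted))
    return sum(gaps) / len(gaps)


# Hypothetical decisions: one lands 3 percent under, one 5 percent over.
sample = [(100.0, 97.0), (200.0, 210.0)]
print(average_delta(sample))  # 4.0
```

Taking absolute gaps before averaging keeps an overshoot and an undershoot from cancelling each other out.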
The third is the calibration score, on a 0 to 100 scale. This combines success rate, average delta and sample size into one figure. It is the one number a non-technical buyer can look at and trust. Above 80 is strong. 60 to 80 is workable. Below 60 means the model is not yet ready to lead in that sector and Kauzio will flag it on the decision screen.
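The bands can be written down as a small lookup. This sketch covers only the thresholds stated above, not the formula that combines the three inputs, which this post does not spell out. The function name and band labels are illustrative, and placing a score of exactly 60 or exactly 80 in the workable band follows the wording "above 80" and "below 60".

```python
def calibration_band(score: float) -> str:
    """Map a 0 to 100 calibration score to the band described in the text."""
    if not 0 <= score <= 100:
        raise ValueError("calibration score must be between 0 and 100")
    if score > 80:
        return "strong"
    if score >= 60:
        return "workable"
    # Below 60: the model is not yet ready to lead in this sector.
    return "not ready"


for score in (87, 74, 59):
    print(score, calibration_band(score))
# 87 strong
# 74 workable
# 59 not ready
```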
An illustrative example
These numbers are not a traction claim. They are the shape of what the page shows once enough decisions have closed in each sector.
| Sector                | Decisions closed | Success rate | Average delta | Calibration |
|-----------------------|------------------|--------------|---------------|-------------|
| Retail pricing        | 312              | 84 percent   | 3.1 percent   | 87          |
| Hospitality staffing  | 198              | 79 percent   | 4.6 percent   | 81          |
| Clinic hiring         | 64               | 72 percent   | 5.9 percent   | 74          |
| Fintech credit policy | 41               | 68 percent   | 7.4 percent   | 66          |
| Professional services | 22               | n/a          | n/a           | pending     |
Two things to read out of a table like this. First, sectors with more closed decisions tend to settle into a tighter delta, because the model has more outcomes to learn from. Second, a sector with too few decisions is marked pending and Kauzio will say so on the decision screen, rather than pretend to a confidence it has not earned.
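The pending rule can be sketched as a simple gate on sample size. The cutoff of 30 closed decisions is a hypothetical value chosen for illustration; the post does not state the real threshold, and the example table is consistent with any cutoff between 23 and 41.

```python
from typing import Optional

# Hypothetical cutoff: the actual minimum sample size is not stated in this post.
MIN_CLOSED_DECISIONS = 30


def displayed_calibration(closed: int, score: Optional[float]) -> str:
    """Return the score to show for a sector, or 'pending' if the sample is too small."""
    if closed < MIN_CLOSED_DECISIONS or score is None:
        return "pending"
    return str(round(score))


# Rows taken from the illustrative table: (sector, decisions closed, score).
rows = [
    ("Retail pricing", 312, 87.0),
    ("Fintech credit policy", 41, 66.0),
    ("Professional services", 22, None),
]
for sector, closed, score in rows:
    print(sector, displayed_calibration(closed, score))
# Retail pricing 87
# Fintech credit policy 66
# Professional services pending
```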
Why publishing calibration matters
Three reasons, all practical.
It forces honesty. The moment a vendor commits to publishing a score that updates nightly, the incentive to overclaim disappears. You cannot quietly walk back a number that is on a public page.
It gives buyers something to argue with. A claim like "our AI is accurate" cannot be tested. A claim like "retail pricing calibration is 87, hospitality staffing is 81, fintech credit policy is 66" can. A buyer can ask why fintech is lower, what the failure modes look like, and whether their use case sits in the strong band or the weaker one.
It makes the system safer to deploy in regulated settings. A compliance team can point to the calibration page in the audit file. The score is signed, dated and reproducible. When a regulator asks how the business knew the model was fit for use, the answer is not a sales deck. It is a number that updated last night.
What the score does not say
A high calibration score does not mean every decision will land. It means the band Kauzio quoted was honest in aggregate. A single decision can still miss. The score tells you how often the bands hold, not that the next one is guaranteed.
It also does not say the model is equally strong across every sub-segment of a sector. Retail pricing at 87 might be 91 on apparel and 78 on grocery. That is why the dashboard lets you drill in. The top-line number is the headline, not the whole story.
Where to find it
Inside the dashboard, the calibration page sits under Trust. It shows the live score per sector, the 90-day trend line, the count of closed decisions feeding the score, and a download link for the raw outcome data so a finance or compliance team can reproduce the math.
If you want AI in your business decisions, the floor is not how clever the model sounds. The floor is whether the people building it are willing to publish how often it was wrong. That is what the calibration page is for, and that is why it updates every night without anyone having to ask.
