
How to interpret results

Aucert produces structured test results with confidence scores, severity classifications, and visual evidence. Here's how to read them.

Confidence scores

Every finding includes a confidence score between 0.0 and 1.0. This represents how certain the AI model is about its observation.

| Score range | Meaning | Typical action |
| --- | --- | --- |
| 0.95–1.0 | Very high confidence | Almost certainly a real bug; fix it |
| 0.85–0.95 | High confidence | Likely a real bug; quick manual check recommended |
| 0.70–0.85 | Medium confidence | Possible issue; manual investigation needed |
| Below 0.70 | Low confidence | May be a false positive; verify before acting |
Tip: Set your confidence_threshold in aucert.config.yaml to match your team's tolerance. Start at 0.85 (the default) and adjust based on your false positive rate. See configure project for threshold tuning guidance.
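
The bands in the table above can be expressed as a small helper, which is handy when post-processing results in your own scripts. This is a minimal sketch, not part of Aucert: the function names `confidence_band` and `meets_threshold` are hypothetical, and because the table's ranges share their boundary values (0.85 and 0.95), the sketch assumes a boundary score belongs to the higher band. It also assumes a finding passes the threshold when its score is at or above it.

```python
def confidence_band(score: float) -> str:
    """Map a confidence score in [0.0, 1.0] to a band name from the table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be between 0.0 and 1.0, got {score}")
    # Assumption: a score on a shared boundary (0.85, 0.95) counts as the higher band.
    if score >= 0.95:
        return "very high"
    if score >= 0.85:
        return "high"
    if score >= 0.70:
        return "medium"
    return "low"


def meets_threshold(score: float, threshold: float = 0.85) -> bool:
    """Assumption: a finding is kept when its score is at or above the threshold."""
    return score >= threshold
```

For example, a score of 0.712 falls in the "medium" band and would not pass the default threshold of 0.85 under these assumptions.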

What affects confidence?

| Factor | Effect on confidence |
| --- | --- |
| Clear pass/fail signal | Higher: after login, the app clearly shows either the home screen or an error |
| Ambiguous UI state | Lower: a loading spinner that may or may not be transitioning |
| Animation in progress | Lower: screenshot captured mid-transition |
| Complex visual comparison | Lower: subtle layout shift or color difference |

Severity levels

Bug reports are classified into four severity levels:

| Severity | Description | Examples |
| --- | --- | --- |
| Critical | App crash, data loss, security vulnerability | ANR, uncaught exception, data corruption |
| High | Major feature broken, user flow blocked | Login fails, checkout stuck, navigation dead-end |
| Medium | UI issue, minor functional problem | Wrong text displayed, layout shift, slow transition |
| Low | Cosmetic issue, minor UX concern | Truncated label, off-brand color, minor alignment |
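
One way to use the severity levels is to order findings for triage: most severe first, and within a severity level, most confident first. The sketch below is an illustration only. The flat finding dictionaries and the `triage_order` helper are hypothetical, and the lowercase severity strings are an assumption based on the "high" value in the JSON output.

```python
# Lower rank means more urgent. The four levels come from the severity table.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def triage_order(findings: list[dict]) -> list[dict]:
    """Sort findings by severity, then by descending confidence."""
    return sorted(
        findings,
        key=lambda f: (SEVERITY_RANK[f["severity"]], -f["confidence"]),
    )


# Hypothetical findings, for illustration only.
findings = [
    {"title": "Truncated label", "severity": "low", "confidence": 0.97},
    {"title": "Checkout stuck", "severity": "high", "confidence": 0.71},
    {"title": "Login fails", "severity": "high", "confidence": 0.93},
]
ordered = [f["title"] for f in triage_order(findings)]
```

Here `ordered` lists "Login fails" first, then "Checkout stuck", then "Truncated label". Whether severity should always outrank confidence is a team decision; a low-confidence critical finding may still deserve a look before a high-confidence cosmetic one.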

Bug report structure

Each bug report includes:

BUG-001: Checkout loading spinner does not resolve
──────────────────────────────────────────────────
Severity: High
Confidence: 71.2%
Test scenario: Cart → Checkout → Payment

Reproduction steps:
1. Add item to cart
2. Tap "Checkout" button
3. Wait for payment form

Expected: Payment form displayed within 3 seconds
Actual: Loading spinner persisted after 5 second timeout

Screenshots:
Step 2: [checkout-button-tap.png]
Step 3: [loading-spinner-stuck.png]

Device context:
Emulator: Pixel 7 API 34
OS: Android 14
Screen: 1080x2400 (420 dpi)

Key fields explained

| Field | Purpose |
| --- | --- |
| Reproduction steps | Exact actions to trigger the issue; useful for developers |
| Expected vs actual | What the AI expected to see vs what it observed |
| Screenshots | Visual evidence at each step; the primary diagnostic tool |
| Device context | Emulator configuration so you can reproduce locally |

JSON output format

When using --output json, results are structured for programmatic consumption:

{
  "run_id": "a1b2c3d4-...",
  "scenarios": [
    {
      "name": "Cart → Checkout → Payment",
      "result": "fail",
      "confidence": 0.712,
      "bug_report": {
        "title": "Checkout loading spinner does not resolve",
        "severity": "high",
        "steps": ["Add item to cart", "Tap Checkout", "Wait for payment form"],
        "expected": "Payment form displayed within 3 seconds",
        "actual": "Loading spinner persisted after 5 second timeout",
        "screenshots": ["step-2.png", "step-3.png"]
      }
    }
  ]
}
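
As a rough illustration of programmatic consumption, the sketch below parses a trimmed copy of the sample above with Python's standard `json` module and lists the failing scenarios. It relies only on the field names shown in the sample. The `failing_scenarios` helper and its `threshold` parameter are hypothetical, and real output may contain fields or result values not covered here.

```python
import json

# Trimmed copy of the sample output above; the run_id is elided as in the original.
SAMPLE = """
{
  "run_id": "a1b2c3d4-...",
  "scenarios": [
    {
      "name": "Cart → Checkout → Payment",
      "result": "fail",
      "confidence": 0.712,
      "bug_report": {
        "title": "Checkout loading spinner does not resolve",
        "severity": "high"
      }
    }
  ]
}
"""


def failing_scenarios(report: dict, threshold: float = 0.0) -> list[dict]:
    """Return a summary row for each failed scenario at or above the threshold."""
    rows = []
    for scenario in report.get("scenarios", []):
        if scenario.get("result") != "fail":
            continue
        confidence = scenario.get("confidence", 0.0)
        if confidence < threshold:
            continue
        # Assumption: bug_report may be absent, so fall back to an empty dict.
        bug = scenario.get("bug_report") or {}
        rows.append({
            "name": scenario.get("name"),
            "confidence": confidence,
            "severity": bug.get("severity"),
            "title": bug.get("title"),
        })
    return rows


report = json.loads(SAMPLE)
rows = failing_scenarios(report, threshold=0.70)
for row in rows:
    # Formats the 0.0-1.0 score as a percentage, matching the text report.
    print(f'{row["severity"]}: {row["title"]} ({row["confidence"]:.1%})')
```

With the sample data this prints `high: Checkout loading spinner does not resolve (71.2%)`. Raising `threshold` to 0.85 would filter the finding out, since its confidence is 0.712.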

What's next