How we score — the open documentation

Every adjustment, every threshold, every source.

The 0–100 score on every Verdict is the output of a deterministic 13-stage pipeline with named, documented constants. This page is the open documentation of every rule the engine applies, with the numeric value, the rule it enforces, the basis behind it, and the source file you can read directly. No black box, no calibration hidden behind marketing language. If an AI scrutinizing your Verdict asks “where did this number come from?” — the answer is here.

Step one

Component baselines and their weights.

Every Verdict starts from a per-component baseline score. Findings adjust the component score down (or, for closed recall replacements, up). The eight component scores are weighted to a single overall score:

Component	Weight
Powertrain	25%
Drivetrain	15%
Body & Frame	15%
Brakes	10%
Suspension & Steering	10%
Electrical	10%
Emissions & Fuel System	8%
HVAC & Comfort	7%
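
The weighting can be sketched as a plain weighted average. This is a minimal illustration, not the engine's code: the key names, the `weightedBaseScore` function, and the example component scores are assumptions made for this sketch; only the eight weights come from the table above.

```typescript
// Component weights in percent, copied from the table above.
// The key names are illustrative, not the engine's identifiers.
const COMPONENT_WEIGHTS = {
  powertrain: 25,
  drivetrain: 15,
  bodyAndFrame: 15,
  brakes: 10,
  suspensionAndSteering: 10,
  electrical: 10,
  emissionsAndFuel: 8,
  hvacAndComfort: 7,
};

type ComponentKey = keyof typeof COMPONENT_WEIGHTS;
type ComponentScores = Record<ComponentKey, number>;

// Weighted average of the eight component scores (each on a 0-100 scale).
function weightedBaseScore(scores: ComponentScores): number {
  const keys = Object.keys(COMPONENT_WEIGHTS) as ComponentKey[];
  let total = 0;
  let weightSum = 0;
  for (const key of keys) {
    total += scores[key] * COMPONENT_WEIGHTS[key];
    weightSum += COMPONENT_WEIGHTS[key];
  }
  return total / weightSum; // weightSum is 100 for the table above
}

// Hypothetical vehicle: one weak component, everything else at 90.
const illustrativeScores: ComponentScores = {
  powertrain: 22,
  drivetrain: 90,
  bodyAndFrame: 90,
  brakes: 90,
  suspensionAndSteering: 90,
  electrical: 90,
  emissionsAndFuel: 90,
  hvacAndComfort: 90,
};

console.log(weightedBaseScore(illustrativeScores)); // 73
```

Keeping the weights as integer percentages and dividing once at the end avoids floating-point drift in the sum.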

Step two

The score-to-band mapping.

The seven bands name the procedural state of the score — not a judgment on the vehicle. The bottom two were renamed 2026-05-22 from “Significant Concerns” / “Major Concerns” to “Repair Window” / “Replacement Window” so the label describes the math of the score's position on the repair-to-value curve. Same thresholds, same engine math.

Score	Band	What this procedurally describes
93–100	Strong	Pristine condition; zero non-trivial findings. The audit trail places this vehicle inside the engine's highest tier.
85–92	Healthy	Solid shape with minor wear items only — maintenance, not concern.
76–84	Sound	Good condition, ordinary maintenance items present, nothing urgent.
68–75	Watching	Real items to monitor. Nothing failing today, but plan a service window in the next 6 months.
58–67	Needs Attention	Active concerns to address soon. Meaningful work pending.
45–57	Repair Window	The repair-to-value math is approaching a flip. Multiple active concerns or one severe finding. The repair conversation is real.
0–44	Replacement Window	The math has flipped against repair. Substantial active problems, branded title, or end-of-life territory. Replacement, sell-as-is, or strategic exit is on the table.
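
The table above can be read as a threshold lookup. The sketch below is one way to express it; `bandForScore` and `BAND_FLOORS` are hypothetical names, and rounding a fractional score to the nearest integer before the lookup is an assumption, since the table only lists whole-number ranges.

```typescript
type Band =
  | "Strong"
  | "Healthy"
  | "Sound"
  | "Watching"
  | "Needs Attention"
  | "Repair Window"
  | "Replacement Window";

// Lower bound of each band, highest first, taken from the table above.
const BAND_FLOORS: Array<{ min: number; band: Band }> = [
  { min: 93, band: "Strong" },
  { min: 85, band: "Healthy" },
  { min: 76, band: "Sound" },
  { min: 68, band: "Watching" },
  { min: 58, band: "Needs Attention" },
  { min: 45, band: "Repair Window" },
  { min: 0, band: "Replacement Window" },
];

function bandForScore(score: number): Band {
  if (!isFinite(score) || score < 0 || score > 100) {
    throw new RangeError("score must be between 0 and 100");
  }
  // Assumption: fractional scores are rounded before the band lookup.
  const rounded = Math.round(score);
  for (const entry of BAND_FLOORS) {
    if (rounded >= entry.min) return entry.band;
  }
  return "Replacement Window"; // not reachable for valid input
}

console.log(bandForScore(57)); // "Repair Window"
console.log(bandForScore(44)); // "Replacement Window"
```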

Step three

Every adjustment the engine can apply.

Ten named scoring rules. Each is gated by a feature flag and documented inline in the source file. Each rule has an empirical basis and the engine's audit trail records every time the rule fires.

HEALTHY_CAR_FLOOR_LIFT
baseline_floor 82 · perfect 88–95 · 1-notable 84–87 · multi-notable 80–83
The rule
A vehicle with zero Major findings, documented service, and no manifested failures has a floor on how low its score can land. Without this rule, a small accumulation of Notable items could push a clearly-healthy vehicle below 82, which misrepresents the platform.
Basis
Calibrated against the 478-verdict deep validation run. The floor catches "documented + zero majors + no manifested failures" baseline cases that should never land in the Watching range. Perfect ceiling lifted from 92 → 95 on 2026-05-22 so a truly flawless vehicle can reach the sanity-band hard ceiling.
apps/web/lib/verdict-v3/scoring-constants.ts §3.1
PLATFORM_TSB_DEWEIGHT
0 impact on managed · 1.0× multiplier on unmanaged
The rule
A platform-wide TSB (Technical Service Bulletin) the shop is already managing — oil-change interval adjustment, recall remedy applied, software update — contributes zero score impact. Only unaddressed symptoms on the same TSB family carry weight.
Basis
A TSB the customer has already complied with is by definition resolved. Penalizing a vehicle for a TSB the dealer has applied would punish good service history. The deweight makes the score reflect the vehicle's current state, not its platform history.
apps/web/lib/verdict-v3/scoring-constants.ts §3.2
COMPLETED_RECALL_LIFT
+4 points for closed major-class recall
The rule
A closed Major-class recall (battery / drivetrain / structural) materially changes the vehicle's reliability profile. The +4 lift reflects that the affected component is effectively newer than the chassis age suggests.
Basis
A 2017 Bolt EV with its LG battery pack replaced under NHTSA 21V-560 is running on a battery younger than the car. Treating it identically to an un-replaced 2017 Bolt would understate the actual condition.
apps/web/lib/verdict-v3/scoring-constants.ts §3.3
COVERAGE_HUNT_LIFT
+6 max (normal) · +2 max (tight) · 0 (no overlap)
The rule
When a coverage program — TSB, warranty extension, class-action settlement — matches the customer's vehicle AND overlaps with at least one quoted line item, the engine scores against NET out-of-pocket. Tightness (≤12 months OR ≤10% of mileage cap remaining) caps the lift at +2.
Basis
Customers who can claim a covered repair effectively face a lower cost-of-ownership profile than the gross repair total suggests. The math should reflect what the customer actually pays.
apps/web/lib/verdict-v3/scoring-constants.ts §3.4
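The lift ceiling described in this rule can be sketched as follows. The `CoverageWindow` shape and both function names are hypothetical; only the +6 / +2 / 0 ceilings and the tightness test (12 months or 10% of the mileage cap remaining) come from the text above. How the engine sizes the lift beneath the ceiling is not documented here, so the sketch returns only the maximum.

```typescript
// Hypothetical input shape for one matched coverage program.
interface CoverageWindow {
  matchesVehicle: boolean;      // program applies to this year/make/model
  overlapsQuotedItem: boolean;  // at least one quoted line item is covered
  monthsRemaining: number;      // months until the program's time limit
  milesRemaining: number;       // miles until the program's mileage cap
  mileageCap: number;           // the program's mileage cap
}

// Tight: 12 months or less remaining, OR 10% or less of the mileage cap remaining.
function isTightWindow(w: CoverageWindow): boolean {
  // Integer comparison avoids multiplying by 0.1.
  return w.monthsRemaining <= 12 || w.milesRemaining * 10 <= w.mileageCap;
}

// Maximum lift the rule allows: 0 with no overlap, +2 when tight, +6 otherwise.
function maxCoverageLift(w: CoverageWindow): number {
  if (!w.matchesVehicle || !w.overlapsQuotedItem) return 0;
  return isTightWindow(w) ? 2 : 6;
}

const nearCap: CoverageWindow = {
  matchesVehicle: true,
  overlapsQuotedItem: true,
  monthsRemaining: 30,
  milesRemaining: 8000,
  mileageCap: 100000,
};

console.log(maxCoverageLift(nearCap)); // 2
```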
NOTABLE_STACKING_DAMPING
1st 100% · 2nd 60% · 3rd 35% · 4th+ 20%
The rule
Multiple Notable-severity findings don't stack at full weight. The marginal information from findings 4, 5, 6 is diminishing — a customer with one Notable is meaningfully different from a customer with two; two from three; but findings beyond three are pattern-confirming, not additive.
Basis
Without damping, a vehicle with six minor wear items could score lower than a vehicle with two manifested Major failures, which misrepresents structural risk. Major and Critical findings remain at full weight always.
apps/web/lib/verdict-v3/scoring-constants.ts §3.5
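A minimal sketch of the damping schedule follows. The function name is hypothetical, and applying the schedule to deductions sorted largest-first is an assumption; the text above gives the percentages but not the ordering.

```typescript
// Damping schedule in percent: 1st, 2nd, 3rd, then 4th and beyond.
const NOTABLE_WEIGHTS_PERCENT = [100, 60, 35, 20];

// Sums Notable-severity deductions with diminishing weight.
// Major and Critical findings are not passed through this function;
// per the rule above they keep full weight.
function dampedNotableDeduction(deductions: number[]): number {
  // Assumption: the largest deduction counts as the "1st" finding.
  const ordered = deductions.slice().sort((a, b) => b - a);
  const lastIndex = NOTABLE_WEIGHTS_PERCENT.length - 1;
  let total = 0;
  ordered.forEach((deduction, i) => {
    const percent = NOTABLE_WEIGHTS_PERCENT[Math.min(i, lastIndex)];
    total += (deduction * percent) / 100;
  });
  return total;
}

// Six Notable findings worth 10 points each (illustrative values):
// 10 + 6 + 3.5 + 2 + 2 + 2
console.log(dampedNotableDeduction([10, 10, 10, 10, 10, 10])); // 25.5
```

Undamped, the same six findings would deduct 60 points.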
MODERATE_CONFIDENCE_DOUBLE_PENALTY_FIX
Confidence affects display only, not score
The rule
A finding's confidence band controls how the prose describes it ("shudder pattern is consistent with..." vs "confirmed shudder pattern...") — not how much it deducts. Confidence and severity are independent.
Basis
Pre-fix, a Moderate-confidence finding got both a confidence-display hedge AND a confidence-multiplied deduction, double-counting the uncertainty. Now the deduction is uncoupled from confidence — the deduction reflects severity, the prose reflects confidence.
apps/web/lib/verdict-v3/scoring-constants.ts §3.6
END_OF_WINDOW_PENALTY
−3 points when a covered program is within 12 months or 10% of mileage cap
The rule
When a coverage program just paid out and the vehicle is within 12 months OR 10% of the mileage cap, the customer is about to lose that protection. A small penalty reflects the elevated near-term risk.
Basis
A vehicle 8,000 miles from a coverage cap is in a different position than a vehicle 80,000 miles from it. The penalty pairs with COVERAGE_HUNT_LIFT — tightness caps the lift at +2 so the net is still positive, just smaller.
apps/web/lib/verdict-v3/scoring-constants.ts §3.7
PLATFORM_FAILURE_WEIGHT
Full weight retained when manifested, regardless of coverage
The rule
A vehicle currently experiencing a documented platform failure carries the full weight of that failure even when a coverage program covers the dollar cost. Coverage solves the dollar problem; it doesn't make the vehicle not-broken.
Basis
Pre-rule, a covered CVT failure could net out to zero score impact because COVERAGE_HUNT_LIFT offset the deduction. But the customer still has a CVT failure — that's a real platform signal the score should reflect.
apps/web/lib/verdict-v3/scoring-constants.ts §3.8
SCORE_SANITY_BANDS
Hard ceiling 95 · soft warn at 35 · hard floor 30
The rule
No Verdict may exceed 95 without the sanity_override flag in the audit trail. No Verdict may drop below 30 without total_loss_eligible. Scores between 30 and 35 receive a soft-warn note instructing the prose to acknowledge the marginal position.
Basis
Safety net against any edge case the calibration didn't anticipate. The hard ceiling preserves a 5-point buffer below a perfect 100 to acknowledge that no vehicle is literally perfect; the hard floor preserves a 30-point band of bottom-tier scores before total-loss territory.
apps/web/lib/verdict-v3/scoring-constants.ts §3.9
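The sanity bands can be sketched as a clamp with two escape hatches. The camel-case field names stand in for the `sanity_override` and `total_loss_eligible` flags named above, and the function name and result shape are hypothetical. Two further assumptions: an out-of-band score is clamped rather than rejected, and the soft-warn range includes both 30 and 35.

```typescript
interface SanityFlags {
  sanityOverride: boolean;    // stands in for sanity_override
  totalLossEligible: boolean; // stands in for total_loss_eligible
}

interface SanityResult {
  score: number;
  softWarn: boolean; // prose should acknowledge the marginal position
}

const HARD_CEILING = 95;
const SOFT_WARN_AT = 35;
const HARD_FLOOR = 30;

function applySanityBands(rawScore: number, flags: SanityFlags): SanityResult {
  let score = rawScore;
  // Assumption: scores outside the band are clamped unless the flag is set.
  if (score > HARD_CEILING && !flags.sanityOverride) score = HARD_CEILING;
  if (score < HARD_FLOOR && !flags.totalLossEligible) score = HARD_FLOOR;
  const softWarn = score >= HARD_FLOOR && score <= SOFT_WARN_AT;
  return { score, softWarn };
}

const noFlags: SanityFlags = { sanityOverride: false, totalLossEligible: false };

console.log(applySanityBands(98, noFlags)); // { score: 95, softWarn: false }
console.log(applySanityBands(22, noFlags)); // { score: 30, softWarn: true }
```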
CRITICAL_COMPONENT_DRAG_RULE
Base score capped at (min_component + 30) when any component ≤ 35
The rule
When any single component scores at or below 35 (Replacement-Window territory), the weighted-average base score is capped at the minimum component score + 30 points. A failing HV battery at 22 caps overall at 52, not 73.
Basis
Cal-West round-4 validation found that a 2015 Leaf with a 53%-SoH HV battery scored 73 because healthy peripherals weighted-averaged past the dominant concern. That misleads the customer — the dominant concern is the dominant concern.
apps/web/lib/verdict-v3/stages/stage-10-aggregation.ts (Cal-West v3.8.3 rule)
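The cap can be sketched as below. The function and constant names are illustrative and not taken from stage-10-aggregation.ts; the component values in the example are hypothetical, chosen to match the 22, 52, and 73 figures quoted above.

```typescript
const DRAG_TRIGGER = 35;  // a component at or below this score triggers the rule
const DRAG_HEADROOM = 30; // cap = lowest component score + 30

// Caps the weighted-average base score when any component is at or below 35.
function applyCriticalComponentDrag(
  baseScore: number,
  componentScores: number[]
): number {
  if (componentScores.length === 0) return baseScore;
  const minComponent = Math.min(...componentScores);
  if (minComponent > DRAG_TRIGGER) return baseScore;
  // The rule is a cap, so it never raises the base score.
  return Math.min(baseScore, minComponent + DRAG_HEADROOM);
}

// One component at 22, seven healthy components at 90, weighted base of 73.
console.log(applyCriticalComponentDrag(73, [22, 90, 90, 90, 90, 90, 90, 90])); // 52
```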

Step four

What runs before the Verdict ships.

Post-compose deterministic validators that catch anything the LLM-driven audit pass might miss. Every customer-facing Verdict runs through this stack before it's delivered.

output-validator.ts

Six deterministic checks: unverified NHTSA campaign IDs (redacted), TSB / Bulletin IDs (redacted), fabricated quoted speech (redacted), internal classification leaks (stripped), cross-engine pattern borrowing (flagged), implausible dollar figures (flagged). Added 2026-05-22: score-band-prose coherence check — if the score lands in the Repair Window or Replacement Window, the prose must contain at least one structural marker (“repair-to-value”, “replace”, “total-loss”, “walk away”, etc.) or the Verdict is flagged for pre-delivery review.
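
The score-band-prose coherence check can be sketched as a substring test. This is an illustration under stated assumptions, not the contents of output-validator.ts: the function name is hypothetical, matching is assumed to be case-insensitive, and the marker list holds only the four markers quoted above (the full list is not given).

```typescript
// Only the four markers quoted above; the engine's full list is not documented here.
const STRUCTURAL_MARKERS = ["repair-to-value", "replace", "total-loss", "walk away"];

// Top of the Repair Window band; scores at or below this need a structural marker.
const REPAIR_WINDOW_MAX = 57;

// Returns true when the Verdict should be flagged for pre-delivery review.
function needsPreDeliveryReview(score: number, prose: string): boolean {
  if (score > REPAIR_WINDOW_MAX) return false;
  // Assumption: marker matching ignores case.
  const lower = prose.toLowerCase();
  const hasMarker = STRUCTURAL_MARKERS.some((marker) => lower.indexOf(marker) !== -1);
  return !hasMarker;
}

console.log(needsPreDeliveryReview(52, "Front brakes are worn.")); // true
console.log(needsPreDeliveryReview(52, "The repair-to-value math is close.")); // false
```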

Step five

What the cost ranges mean — and what they don't.

Every cost range in a Verdict comes from one of two sources. Both are sourced — neither is invented by the language model.

Source 1

The 733-pattern failure-mode database (failure-modes.yml). Every entry carries a repair_cost_range_usd field — a low/high pair calibrated to platform-typical shop pricing in the US. These ranges reflect what independent shops and dealers commonly charge for the named repair on the named platform, NOT a quote from your specific shop. Local labor rates, regional parts pricing, and shop-specific labor multipliers can shift the actual quote 20% or more in either direction.

Source 2

The customer's actual shop quote (Full Verdict only). When the customer uploads a shop inspection PDF, the line-by-line analysis cites the dollar figures from that document verbatim. Those numbers are not estimates — they are what the shop quoted.

What to do with a Verdict cost range

Use it as the band a fair local quote should fall into. Anything inside the range is platform-typical; anything well above the high end deserves a second quote; anything well below should make you check what parts the shop is sourcing.
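
As a rough sketch, comparing a local quote against a Verdict range might look like this. The page does not define "well above" or "well below"; the 20% margin used here is an assumption borrowed from the regional-variation figure mentioned under Source 1, and every name in the block is hypothetical.

```typescript
type QuotePosition = "inside" | "near-edge" | "well-above" | "well-below";

interface CostRangeUsd {
  low: number;
  high: number;
}

// Assumption: "well" above or below means more than 20% past the range edge.
const ASSUMED_MARGIN_PERCENT = 20;

function classifyQuote(
  quoteUsd: number,
  range: CostRangeUsd,
  marginPercent: number = ASSUMED_MARGIN_PERCENT
): QuotePosition {
  if (quoteUsd >= range.low && quoteUsd <= range.high) return "inside";
  // Percent arithmetic on whole numbers avoids fractional multipliers.
  if (quoteUsd * 100 > range.high * (100 + marginPercent)) return "well-above";
  if (quoteUsd * 100 < range.low * (100 - marginPercent)) return "well-below";
  return "near-edge";
}

const exampleRange: CostRangeUsd = { low: 800, high: 1200 }; // illustrative figures

console.log(classifyQuote(1000, exampleRange)); // "inside"
console.log(classifyQuote(1500, exampleRange)); // "well-above"
console.log(classifyQuote(600, exampleRange));  // "well-below"
```

A "well-above" result corresponds to the second-quote advice above; a "well-below" result corresponds to checking what parts the shop is sourcing.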

Step six

Coverage matches are conditional — and we say so.

Every coverage program the engine surfaces (TSB, warranty extension, class-action settlement, federal statute) is matched against the vehicle on year + make + model + (sometimes) engine code. The match is the eligibility population — the cohort the program applies to.

What the engine cannot confirm: whether YOUR specific VIN has already been included in the campaign, whether the dealer administering the program has logged a previous repair against your VIN that disqualifies you, and whether your specific build date / production-window falls inside the narrowed eligibility band the manufacturer uses internally.

Per-VIN confirmation lives at the dealer, not in the Verdict. Every coverage match in the Verdict carries a verbatim escalation script: the exact sentence to read to a service writer so they can run the per-VIN eligibility check in five minutes. The Verdict tells you the program exists and may apply; the dealer's per-VIN system confirms whether it applies to you. The output validator enforces this hedge: any unhedged absolute-coverage claim in the prose gets rewritten to the verification-hedged equivalent before delivery.

The receipt

Every Verdict carries an audit-trail UUID.

The audit trail logs every adjustment that fired on every component, the failure-mode matches against the 733-pattern database, the coverage-triage block, and the validator-pass results. The UUID is stamped at the bottom of every Verdict. Years from now we can pull it and reproduce the exact reasoning that produced the score. Same VIN in, same Verdict out, every time.
