
fix(auto-routing): reduce classifier errors and latency #3942

Merged: iscekic merged 5 commits into main from fix/auto-routing-classifier-reliability on Jun 11, 2026

Conversation

@iscekic iscekic commented Jun 10, 2026

Contributor

Summary

Improves the auto-routing classifier by switching to the lower-latency google/gemini-2.5-flash-lite default, shrinking the classifier prompt, and capping classifier completions at 160 tokens.

Moves classifier output handling into services/auto-routing/src/classifier-output/ with an isolated, tested parser. The parser tolerates common model-output drift such as fenced JSON, wrappers, enum labels, snake_case keys, confidence strings, and subtype/task mismatches. If the model returns unusable output, the worker now emits a valid low-confidence fallback classification and logs auto_routing_classifier_fallback separately from classifier errors. Successful classifications do not emit custom success logs.
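As a rough illustration of the tolerant parsing described above, the following TypeScript sketch handles fenced JSON, a single-key wrapper object, snake_case keys, and string confidences, and returns a confidence-0 fallback for unusable output. It is not the code in services/auto-routing/src/classifier-output/. The field names (taskType, taskSubtype), the "unknown" fallback labels, and all function names are assumptions made for the example.

```typescript
// Illustrative sketch only. Field names and fallback labels are assumptions,
// not the real classifier contract.
type Classification = {
  taskType: string;
  taskSubtype: string;
  confidence: number;
  fallback: boolean;
};

// Built at runtime so this example does not contain a literal code fence.
const FENCE = "`".repeat(3);

function fallbackClassification(): Classification {
  return { taskType: "unknown", taskSubtype: "unknown", confidence: 0, fallback: true };
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns the body of the first fenced block if present, otherwise the trimmed input.
function stripFence(raw: string): string {
  const pattern = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");
  const match = raw.match(pattern);
  return (match?.[1] ?? raw).trim();
}

// Converts snake_case keys to camelCase (task_type -> taskType).
function normalizeKeys(obj: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(obj)) {
    out[key.replace(/_([a-z])/g, (_m, c: string) => c.toUpperCase())] = value;
  }
  return out;
}

// Lower-cases and trims a label; rejects non-strings and empty strings.
function parseLabel(value: unknown): string | null {
  if (typeof value !== "string") return null;
  const label = value.trim().toLowerCase();
  return label === "" ? null : label;
}

// Accepts numbers or numeric strings and clamps the result to [0, 1].
function parseConfidence(value: unknown): number | null {
  if (typeof value === "string" && value.trim() === "") return null;
  const n = typeof value === "string" ? Number(value) : value;
  if (typeof n !== "number" || !Number.isFinite(n)) return null;
  return Math.min(1, Math.max(0, n));
}

function parseClassifierOutput(raw: string): Classification {
  let parsed: unknown;
  try {
    parsed = JSON.parse(stripFence(raw));
  } catch {
    return fallbackClassification();
  }
  if (!isRecord(parsed)) return fallbackClassification();

  let obj = normalizeKeys(parsed);
  // Unwrap a single-key wrapper such as {"result": {...}}.
  const values = Object.values(obj);
  const inner = values.length === 1 ? values[0] : undefined;
  if (isRecord(inner)) obj = normalizeKeys(inner);

  const taskType = parseLabel(obj["taskType"]);
  const taskSubtype = parseLabel(obj["taskSubtype"]);
  const confidence = parseConfidence(obj["confidence"]);
  if (taskType === null || taskSubtype === null || confidence === null) {
    return fallbackClassification();
  }
  return { taskType, taskSubtype, confidence, fallback: false };
}
```

In this sketch, output that parses as JSON but lacks the expected keys (for example an object containing only minecraft or selectedTickers) takes the same fallback path as output that is not JSON at all, so the caller always receives a valid classification.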

Adds admin analytics for task/subtask pairs and records classifier failures as classifier_error:<subtype> statuses in Analytics Engine, so the admin status breakdown can show the distribution of classifier failure modes.

Verification

Production Axiom check for deployed version 358ed06e-7b68-4415-aa2d-3324adc7cce0 over a 25-minute window before success logs were removed: 35,961 invocations, 0 classifier error logs, 235 fallback logs, and 332 sampled success latency logs. Classifier error rate was 0%; sampled success p95 latency was 1212.19ms, with p99 at 1697.55ms.
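The rates implied by these counts are easy to recompute. The sketch below derives the fallback rate from the numbers above and shows a nearest-rank percentile over latency samples. The PR does not say which percentile method Axiom applied, so nearest-rank is an assumption here, and the function names are illustrative.

```typescript
// Share of invocations with a given outcome, as a fraction in [0, 1].
function rate(count: number, total: number): number {
  if (total <= 0) throw new Error("total must be positive");
  return count / total;
}

// Nearest-rank percentile. `p` is a percent in (0, 100].
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1] as number;
}

// Counts taken from the Verification paragraph: 235 fallbacks in 35,961 invocations.
const fallbackRate = rate(235, 35961);
console.log(`fallback rate: ${(fallbackRate * 100).toFixed(2)}%`);
```

With those counts, fallbacks come to roughly 0.65% of invocations in the sampled window.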

Fallbacks were mostly unusable model outputs with unrelated keys such as subcategory, minecraft, and selectedTickers; these are now visible through the separate fallback log event and produce confidence 0 classifications.

Visual Changes

The Admin Auto Routing panel now includes a Task Subtypes breakdown table. No screenshot was captured because the admin panel requires live admin auth.

Reviewer Notes

The preferred p95 target was under 1000ms, but the production sample landed at 1212.19ms. That misses the preferred target but stays within the acceptable 2000ms p95 target, and the stricter reliability target was met (0% classifier error rate in the sample).

Remote production KV classifier_model is set to google/gemini-2.5-flash-lite, matching the code default.
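A minimal sketch of how a KV override and a code default might fit together with the 160-token completion cap from the Summary. The resolveClassifierModel and buildClassifierRequest helpers are hypothetical, and the max_tokens field assumes an OpenAI-compatible chat completions request body, which the PR does not confirm.

```typescript
const DEFAULT_CLASSIFIER_MODEL = "google/gemini-2.5-flash-lite";
const CLASSIFIER_MAX_TOKENS = 160;

// `kvValue` stands in for whatever a KV lookup of classifier_model returned.
// A missing or blank value falls back to the code default.
function resolveClassifierModel(kvValue: string | null | undefined): string {
  const trimmed = kvValue?.trim();
  return trimmed ? trimmed : DEFAULT_CLASSIFIER_MODEL;
}

// Assumed request shape (OpenAI-compatible); not taken from the PR.
type ClassifierRequest = {
  model: string;
  max_tokens: number;
  messages: { role: "system" | "user"; content: string }[];
};

function buildClassifierRequest(
  kvValue: string | null | undefined,
  systemPrompt: string,
  userText: string,
): ClassifierRequest {
  return {
    model: resolveClassifierModel(kvValue),
    max_tokens: CLASSIFIER_MAX_TOKENS,
    messages: [
      { role: "system", content: systemPrompt },
      { role: "user", content: userText },
    ],
  };
}
```

Keeping the KV value and the code default identical, as noted above, means a failed or empty KV read would not change which model is used in a setup like this one.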

The classifier error summary query counts both historical classifier_error rows and new classifier_error:<subtype> rows, while the status breakdown preserves the subtype for distribution analysis.
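One way to picture the two views: the summary treats any status equal to classifier_error, or starting with classifier_error:, as an error, while the breakdown keeps the suffix. The sketch below works over in-memory rows rather than an Analytics Engine query. The row shape, the "(none)" bucket for historical rows, and the function names are assumptions.

```typescript
type StatusRow = { status: string; count: number };

const ERROR_STATUS = "classifier_error";
const LEGACY_BUCKET = "(none)"; // assumed label for historical rows without a subtype

// Returns the subtype for error statuses, LEGACY_BUCKET for a bare classifier_error,
// and null for any non-error status.
function classifierErrorSubtype(status: string): string | null {
  if (status === ERROR_STATUS) return LEGACY_BUCKET;
  const prefix = ERROR_STATUS + ":";
  if (!status.startsWith(prefix)) return null;
  return status.slice(prefix.length) || LEGACY_BUCKET;
}

// Totals all classifier errors and keeps a per-subtype distribution.
function summarizeClassifierErrors(rows: StatusRow[]): {
  total: number;
  bySubtype: Record<string, number>;
} {
  let total = 0;
  const bySubtype: Record<string, number> = {};
  for (const row of rows) {
    const subtype = classifierErrorSubtype(row.status);
    if (subtype === null) continue;
    total += row.count;
    bySubtype[subtype] = (bySubtype[subtype] ?? 0) + row.count;
  }
  return { total, bySubtype };
}
```

Matching on the prefix with the trailing colon, rather than on the bare string, keeps an unrelated status that merely begins with the same letters out of the error total.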

@iscekic iscekic self-assigned this Jun 10, 2026

kilo-code-bot Bot commented Jun 10, 2026

Contributor

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Executive Summary

The latest commit removes the logClassifierSuccessSample function and its 1%-sampled console.info call, along with the corresponding test assertions — a clean noise-reduction refactor with no logic changes or new code paths.

Files Reviewed (incremental — 1 new commit ea61939)
  • services/auto-routing/src/decide.ts: logClassifierSuccessSample function and CLASSIFIER_SUCCESS_LOG_SAMPLE_RATE constant removed; call site removed from decideHandler; no remaining issues
  • services/auto-routing/src/index.test.ts — test updated to assert infoSpy is never called; correct and consistent
Previously reviewed files (no changes, findings carried forward)
  • services/auto-routing/src/classifier-analytics.ts — no issues
  • services/auto-routing/src/admin-classifier-analytics.ts — no issues
  • packages/auto-routing-contracts/src/index.ts — no issues
  • apps/web/src/app/admin/auto-routing/AutoRoutingAdminContent.tsx — no issues
  • apps/web/src/lib/ai-gateway/auto-routing-admin-client.test.ts — no issues
  • packages/auto-routing-contracts/src/contracts.test.ts — no issues

Reviewed by claude-4.6-sonnet-20260217 · 287,717 tokens

Review guidance: REVIEW.md from base branch main

@iscekic iscekic enabled auto-merge (squash) June 11, 2026 11:28
@iscekic iscekic merged commit 39956d8 into main Jun 11, 2026
57 checks passed
@iscekic iscekic deleted the fix/auto-routing-classifier-reliability branch June 11, 2026 11:33

2 participants