Skip to content

feat(ci): surface vitest parallelism metrics in the isolate:false test gate#6004

Merged
killagu merged 2 commits into
eggjs:nextfrom
killagu:feat/vitest-parallelism-metrics
Jun 27, 2026
Merged

feat(ci): surface vitest parallelism metrics in the isolate:false test gate#6004
killagu merged 2 commits into
eggjs:nextfrom
killagu:feat/vitest-parallelism-metrics

Conversation

@killagu

@killagu killagu commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Motivation

The suite already runs with pool: 'threads' + isolate: false for full parallelism. This PR makes the effect of that parallelism visible in CI — so we can see how parallel the run actually was, per OS/Node matrix entry — without changing gate semantics.

Independent of #5986 (tegg TeggScope): every file touched here is identical to next, so this branch is cut from next and can merge on its own timeline.

What

  • vitest.config.ts — adds a json reporter only when CI is set, writing benchmark/ci-test/ci-run/vitest-results.json as a side effect of the gating ut run ci (default console reporter preserved; no extra test run).
  • .github/workflows/ci.yml — a Report parallelism metrics step (if: always()) in the existing test job summarizes that JSON and appends the report to the GitHub job summary. Gating still comes solely from ut run ci.
  • scripts/ci-test-benchmark/ — the existing harness gains a --report-only --vitest-json mode and computes peak/avg concurrency, parallel efficiency, and critical path from a per-file concurrency-timeline sweep.
  • test(supertest) (separate commit) — harden the over-strict should handle connection error test (hardcoded 127.0.0.1:1234) to accept the connection-error family instead of exactly ECONNREFUSED, so it isn't environment-dependent (a local proxy on that port made it socket hang up).

Reading the metrics honestly

Vitest 4 derives a file's interval from test-level timings only, so the spans cover test bodies + per-test beforeEach/afterEach but exclude suite-level beforeAll/afterAll (where egg boots its apps) and module transform/import. Therefore avg concurrency / parallel efficiency are lower bounds; peak concurrency is the robust signal. The worker ceiling mirrors vitest.config.ts (Windows CI caps workers). Fully-skipped files are dropped from the timeline. The report footnote states all of this.

Test evidence

  • Full isolate: false suite is GREEN (services up, CI-faithful --maxWorkers 4): 526 files / 3425 tests pass, 0 failures.
  • The edited config loads in real Vitest (both reporters active, JSON at the configured path); harness math cross-checked independently; --report-only exits 0 on missing JSON; step-summary append is cross-platform; oxfmt/oxlint clean.
  • A 4-agent adversarial review pass was applied (Windows worker-ceiling, beforeAll-exclusion docs, skipped-file timeline, report-only run section all fixed).

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added CI job summary reporting for test parallelism and efficiency metrics.
    • Introduced a report-only benchmark mode for summarizing existing test JSON output.
  • Bug Fixes

    • Made connection-error tests more resilient across environments by accepting common failure variants.
  • Documentation

    • Expanded benchmark and workflow docs with usage examples, metric definitions, and CI reporting details.

killagu and others added 2 commits June 27, 2026 21:06
The suite runs with isolate:false for full parallelism (made safe by the tegg
TeggScope per-app isolation). Instrument the existing test gating job to show how
parallel it actually ran, without changing gate semantics:

- vitest.config.ts emits a JSON reporter in CI to benchmark/ci-test/ci-run
- a "Report parallelism metrics" step (always()) summarizes that JSON via the
  ci-test-benchmark harness and appends it to the GitHub job summary per matrix entry
- the harness gains a --report-only/--vitest-json mode and computes avg/peak
  concurrency, parallel efficiency (vs the effective worker ceiling) and the critical
  path from a per-file concurrency-timeline sweep

Metrics are honest lower bounds: Vitest per-file spans exclude suite-level
beforeAll/afterAll and module transform/import, so peak concurrency is the robust
signal. Fully-skipped files are dropped from the timeline.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…FUSED

The "should handle connection error" test hardcodes 127.0.0.1:1234 and asserted the
exact ECONNREFUSED message. A local listener on that port (e.g. a proxy) makes
supertest surface "socket hang up" instead, failing the test in an
environment-dependent way. Assert the connection-error family so the test verifies
error handling regardless of what (if anything) holds the port.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5af87ee2-2302-4f7b-9dbc-54d6b7511367

📥 Commits

Reviewing files that changed from the base of the PR and between 42f7a14 and 030400c.

📒 Files selected for processing (15)
  • .github/workflows/ci.yml
  • .gitignore
  • benchmark/ci-test/README.md
  • packages/supertest/test/supertest.test.ts
  • scripts/ci-test-benchmark/cli.js
  • scripts/ci-test-benchmark/environment.js
  • scripts/ci-test-benchmark/fs.js
  • scripts/ci-test-benchmark/index.js
  • scripts/ci-test-benchmark/report.js
  • scripts/ci-test-benchmark/vitest-summary.js
  • vitest.config.ts
  • wiki/concepts/vitest-isolate-false-state-leaks.md
  • wiki/index.md
  • wiki/log.md
  • wiki/workflows/ci-parallel-test-metrics.md

📝 Walkthrough

Walkthrough

Adds a non-gating CI step that emits Vitest test timing as JSON (via a conditional reporter in vitest.config.ts) and processes it through a new --report-only mode in the ci-test-benchmark harness to compute and append parallelism metrics (avg/peak concurrency, efficiency, critical path) to the GitHub Actions job summary.

Changes

CI Parallelism Metrics Instrumentation

Layer / File(s) Summary
Vitest JSON reporter and CI step wiring
vitest.config.ts, .github/workflows/ci.yml, .gitignore
vitest.config.ts conditionally adds a JSON reporter writing to benchmark/ci-test/ci-run/vitest-results.json when CI is set. The CI workflow adds an always()-guarded step running node scripts/ci-test-benchmark.js in report-only mode. Benchmark output directories are gitignored.
Per-file interval extraction and parallelism computation
scripts/ci-test-benchmark/vitest-summary.js
summarizeFileResult extracts absolute wall-clock startMs/endMs per file via getInterval(). computeParallelism() aggregates those intervals into execution window, busy time, peak concurrency (sweep-line), average concurrency, and critical-path estimate, exposed as a parallelism field on the summary.
Environment worker ceiling and fs helpers
scripts/ci-test-benchmark/environment.js, scripts/ci-test-benchmark/fs.js
collectEnvironment derives availableParallelism and a Windows-CI-aware workerCeiling from os.availableParallelism() and vitestConfig.maxWorkers. readVitestConfigDefaults gains matchNumberProperty for numeric config parsing. fs.js exports appendText() backed by appendFile.
CLI report-only mode
scripts/ci-test-benchmark/cli.js
parseArgs gains reportOnly and vitestJson fields and handles --report-only and --vitest-json[=path] flags. Help text documents the new options and GITHUB_STEP_SUMMARY behavior.
Report parallelism section and conditional run rows
scripts/ci-test-benchmark/report.js
createReport accepts a reportOnly flag. renderMarkdown conditionally emits execution-window rows vs. exit/wall-time rows. appendParallelismSection renders avg/peak concurrency, efficiency, headroom, and worker ceiling. formatConcurrency and formatPercent format finite values with n/a fallback.
Main orchestration: report-only flow and step summary
scripts/ci-test-benchmark/index.js
Routes to runReportOnly() when options.reportOnly is set, loads Vitest JSON, synthesizes run timing via deriveRunFromJson(), writes outputs, and appends markdown to GITHUB_STEP_SUMMARY. Normal mode also appends to the step summary via appendStepSummary().
Test hardening and docs
packages/supertest/test/supertest.test.ts, benchmark/ci-test/README.md, wiki/workflows/ci-parallel-test-metrics.md, wiki/concepts/vitest-isolate-false-state-leaks.md, wiki/index.md, wiki/log.md
Supertest connection-error assertion replaced with a multi-variant regex. README, wiki index, log, and a new workflow doc updated with metric definitions, CI integration description, honesty caveats about Vitest timing spans, and local reproduction commands.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • eggjs/egg#5912: Implements the original ci-test-benchmark harness that this PR extends with report-only mode, parallelism computation, and GitHub step-summary output.

Suggested reviewers

  • fengmk2

🐇 Hops through the CI lane,
JSON timing data in paw,
Peak concurrency measured plain,
No gate broken, no test flaw—
The summary glows, metrics plain! 🌟

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces single-run parallelism metrics reporting for the isolate:false test suite in CI, automatically appending a Markdown summary to the GitHub Actions job summary. It also hardens a connection error test in supertest to prevent environmental failures. The review feedback suggests returning early if the Vitest JSON is missing to avoid empty reports in CI, and replacing the spread operator in Math.min and Math.max calls with a single loop to prevent potential call stack size exceeded errors on large test suites.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +81 to +83
if (!vitestJson) {
console.warn(`No Vitest JSON found at ${vitestJsonPath}; nothing to report.`);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If vitestJson is not found, the script currently logs a warning but continues execution, which results in writing empty reports with n/a values and appending them to the GitHub step summary. Returning early here avoids polluting the CI job summary with empty tables when tests fail or are skipped.

Suggested change
if (!vitestJson) {
console.warn(`No Vitest JSON found at ${vitestJsonPath}; nothing to report.`);
}
if (!vitestJson) {
console.warn(`No Vitest JSON found at ${vitestJsonPath}; nothing to report.`);
return;
}

Comment on lines +119 to +120
const firstStart = Math.min(...intervals.map((interval) => interval.start));
const lastEnd = Math.max(...intervals.map((interval) => interval.end));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using the spread operator with Math.min and Math.max on an array of unbounded size (like test files in a monorepo) can cause a RangeError: Maximum call stack size exceeded if the number of files grows very large. A single loop is safer, more memory-efficient, and avoids creating intermediate arrays.

  let firstStart = Infinity;
  let lastEnd = -Infinity;
  for (const interval of intervals) {
    if (interval.start < firstStart) firstStart = interval.start;
    if (interval.end > lastEnd) lastEnd = interval.end;
  }

Comment on lines +121 to +122
const firstStart = Math.min(...intervals.map((interval) => interval.start));
const lastEnd = Math.max(...intervals.map((interval) => interval.end));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using the spread operator with Math.min and Math.max on an array of unbounded size (like test files in a monorepo) can cause a RangeError: Maximum call stack size exceeded if the number of files grows very large. A single loop is safer, more memory-efficient, and avoids creating intermediate arrays.

Suggested change
const firstStart = Math.min(...intervals.map((interval) => interval.start));
const lastEnd = Math.max(...intervals.map((interval) => interval.end));
let firstStart = Infinity;
let lastEnd = -Infinity;
for (const interval of intervals) {
if (interval.start < firstStart) firstStart = interval.start;
if (interval.end > lastEnd) lastEnd = interval.end;
}

@codecov

codecov Bot commented Jun 27, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.88%. Comparing base (42f7a14) to head (030400c).
⚠️ Report is 2 commits behind head on next.

Additional details and impacted files
@@           Coverage Diff           @@
##             next    #6004   +/-   ##
=======================================
  Coverage   84.88%   84.88%           
=======================================
  Files         674      674           
  Lines       20269    20269           
  Branches     4039     4039           
=======================================
  Hits        17205    17205           
  Misses       2633     2633           
  Partials      431      431           

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@socket-security

Copy link
Copy Markdown

Dependency limit exceeded — report not shown.

This pull request scan exceeded the 10,000-dependency limit applied to this scan, so the results are incomplete and may be inaccurate. To avoid reporting false positives, Socket has not posted a report.

Upgrade your plan to raise the dependency limit and get complete reports, or view the partial scan in the dashboard.

Socket is always free for open source. If this is a non-commercial open source project, contact us to request a free Team account.

@killagu killagu marked this pull request as ready for review June 27, 2026 14:03
Copilot AI review requested due to automatic review settings June 27, 2026 14:03
@killagu killagu merged commit f9662dc into eggjs:next Jun 27, 2026
21 checks passed

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants