Add thread-safety contention tests for ISOProber stats and Scheduler …#77
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughA new test module adds multithreaded stress tests for ChangesThread Safety Contention Tests
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/test_thread_safety.py`:
- Around line 121-122: The worker exception handling is too broad and catches
fatal control-flow exceptions; in the thread worker blocks in
test_thread_safety.py, replace the BaseException handlers with Exception so
KeyboardInterrupt and SystemExit are not swallowed, and update the error
collection type annotation used by record_error/its error list to match
Exception rather than BaseException in each affected worker section.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1050a0ce-15b6-465d-9e7a-eeebd37d5bd3
📒 Files selected for processing (1)
tests/test_thread_safety.py
|
@coderabbitai full review again |
🧩 Analysis chain🏁 Script executed: #!/bin/bash
set -euo pipefail
echo "=== tests/test_thread_safety.py ==="
cat -n tests/test_thread_safety.py
echo ""
echo "=== Finding ISOProber source ==="
fd -e py | xargs grep -l "class ISOProber" 2>/dev/null || true
echo ""
echo "=== Finding Scheduler source ==="
fd -e py | xargs grep -l "class Scheduler" 2>/dev/null || true
echo ""
echo "=== Listing all py files ==="
fd -e py | sortLength of output: 10487 🏁 Script executed: #!/bin/bash
set -euo pipefail
echo "=== conftest.py ==="
fd conftest.py | xargs cat -n 2>/dev/null || true
echo ""
echo "=== existing tests ==="
fd -e py . tests/ | sort | head -30Length of output: 17744 🏁 Script executed: #!/bin/bash
set -euo pipefail
echo "=== ISOProber relevant parts ==="
ast-grep outline src/paperscout/sources.py
echo ""
echo "=== Scheduler relevant parts ==="
ast-grep outline src/paperscout/monitor.pyLength of output: 2804 🏁 Script executed: #!/bin/bash
set -euo pipefail
echo "=== ISOProber: _stats_lock, _bump_stat, _reset_stats, snapshot_stats, _STATS_TEMPLATE ==="
rg -n "_stats_lock|_bump_stat|_reset_stats|snapshot_stats|_STATS_TEMPLATE" src/paperscout/sources.py -A3 -B1
echo ""
echo "=== Scheduler: _health_lock, _publish_health_snapshot, health_snapshot ==="
rg -n "_health_lock|_publish_health_snapshot|health_snapshot|_compute_probe_success_rate" src/paperscout/monitor.py -A5 -B1
echo ""
echo "=== _compute_probe_success_rate ==="
rg -n "_compute_probe_success_rate" src/paperscout/monitor.py -A10Length of output: 7576
Review of
|
| # | Severity | Item |
|---|---|---|
| 1 | Medium | Missing threading.Barrier in test_snapshot_stats_consistent_under_concurrent_reset |
| 2 | Low | _assert_health_snapshot_consistent leaves last_cycle_status, last_cycle_error, probe_success_rate type-unchecked |
| 3 | Low | Direct mutation of private scheduler fields needs an explanatory comment |
| 4 | Low | THREAD_JOIN_TIMEOUT = 5.0 may be insufficient on slow CI |
| 5 | Low | No pytest.mark to allow filtering stress tests |
| 6 | Info | FakePool is not thread-safe; fine today, needs a note for future contributors |
The most impactful fix is item 1 — without a barrier, the resetter/snapshot/bump contention test has much lower adversarial overlap than intended.
✅ Action performed
Full review finished.
There was a problem hiding this comment.
🧹 Nitpick comments (4)
tests/test_thread_safety.py (4)
33-33: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueConsider a more generous join timeout for slower CI.
A 5s join budget can be tight on heavily loaded or free-threaded CI runners executing thousands of contended iterations, leading to flaky
did not finishfailures rather than real defects. Bumping to ~15–30s (or sourcing from an env var) keeps the safety net while reducing false negatives.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_thread_safety.py` at line 33, The thread join timeout in the thread-safety test is too tight for slower CI environments. Update the THREAD_JOIN_TIMEOUT constant in test_thread_safety.py to a more generous value, or make it configurable via an environment variable, so the join logic in the test remains stable under heavy load without causing flaky “did not finish” failures.
172-189: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueAdd a brief note on the direct private-attribute writes.
The writer thread deliberately bypasses the normal cycle flow to mutate
_poll_count/_last_probe_stats/_last_cycle_status/_last_successful_polland call_publish_health_snapshot()directly. A one-line comment stating this intentionally emulates the event-loop writer will prevent future readers from mistaking it for a test reaching into internals by accident.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_thread_safety.py` around lines 172 - 189, Add a brief inline note in the writer helper to explain that it intentionally writes the private scheduler state directly and then calls _publish_health_snapshot() to emulate the event-loop writer path; mention the specific private attributes _poll_count, _last_probe_stats, _last_cycle_status, and _last_successful_poll so future readers know this is deliberate and not accidental test coupling.
78-86: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winStrengthen the snapshot invariant checks.
The helper validates
poll_count,probe_stats,probe_success_rate, andlast_updated, but skipslast_cycle_status,last_cycle_error, andlast_successful_poll. Since the writer mutates these under contention (lines 185-188), asserting their types would catch a torn/inconsistent snapshot that the current checks would miss.♻️ Suggested additional assertions
assert snap["probe_success_rate"] == _compute_probe_success_rate(snap["probe_stats"]) last_updated = snap["last_updated"] assert last_updated is None or (isinstance(last_updated, str) and len(last_updated) > 0) + assert snap["last_cycle_status"] is None or isinstance(snap["last_cycle_status"], CycleStatus) + assert snap["last_cycle_error"] is None or isinstance(snap["last_cycle_error"], str) + last_poll = snap["last_successful_poll"] + assert last_poll is None or isinstance(last_poll, (int, float, str))Adjust the expected types to match
health_snapshot()'s actual output contract.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_thread_safety.py` around lines 78 - 86, The snapshot validator in _assert_health_snapshot_consistent only checks part of the health snapshot contract, so extend it to assert the types of last_cycle_status, last_cycle_error, and last_successful_poll as well. Update the expectations to match health_snapshot()’s actual output shape by verifying these fields are present and have the correct types (including allowed None cases where applicable), alongside the existing checks for poll_count, probe_stats, probe_success_rate, and last_updated.
140-149: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winAdd a
threading.Barrierso workers actually contend.Threads are started sequentially, and the resetter does the least work per iteration, so it can finish (or mostly finish) its
RESET_ITERATIONSbefore the snapshotters/bumpers ramp up, sharply reducing the overlap this test is meant to exercise. Gating all five workers behind a barrier (as intest_concurrent_bump_stat_totals) guarantees simultaneous contention on_stats_lock.♻️ Suggested change
prober = _make_prober(fake_pool) errors: list[Exception] = [] errors_lock = threading.Lock() + barrier = threading.Barrier(5) def record_error(exc: Exception) -> None: with errors_lock: errors.append(exc) def resetter() -> None: try: + barrier.wait() for _ in range(RESET_ITERATIONS): prober._reset_stats() except Exception as exc: record_error(exc) def snapshotter() -> None: try: + barrier.wait() for _ in range(SNAPSHOT_ITERATIONS): _assert_valid_probe_stats(prober.snapshot_stats()) except Exception as exc: record_error(exc) def bumper() -> None: try: + barrier.wait() for _ in range(BUMPER_ITERATIONS): prober._bump_stat("miss") except Exception as exc: record_error(exc)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_thread_safety.py` around lines 140 - 149, The concurrency setup in the test does not force real overlap, so the workers may not contend on _stats_lock as intended. Update the thread worker flow in test_thread_safety, using threading.Barrier the same way as test_concurrent_bump_stat_totals, so resetter, snapshotter, and bumper threads all wait and start together before entering their loops. Keep the change localized to the thread setup and worker start logic so the existing resetter, snapshotter, bumper, and _join_threads structure remains intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/test_thread_safety.py`:
- Line 33: The thread join timeout in the thread-safety test is too tight for
slower CI environments. Update the THREAD_JOIN_TIMEOUT constant in
test_thread_safety.py to a more generous value, or make it configurable via an
environment variable, so the join logic in the test remains stable under heavy
load without causing flaky “did not finish” failures.
- Around line 172-189: Add a brief inline note in the writer helper to explain
that it intentionally writes the private scheduler state directly and then calls
_publish_health_snapshot() to emulate the event-loop writer path; mention the
specific private attributes _poll_count, _last_probe_stats, _last_cycle_status,
and _last_successful_poll so future readers know this is deliberate and not
accidental test coupling.
- Around line 78-86: The snapshot validator in
_assert_health_snapshot_consistent only checks part of the health snapshot
contract, so extend it to assert the types of last_cycle_status,
last_cycle_error, and last_successful_poll as well. Update the expectations to
match health_snapshot()’s actual output shape by verifying these fields are
present and have the correct types (including allowed None cases where
applicable), alongside the existing checks for poll_count, probe_stats,
probe_success_rate, and last_updated.
- Around line 140-149: The concurrency setup in the test does not force real
overlap, so the workers may not contend on _stats_lock as intended. Update the
thread worker flow in test_thread_safety, using threading.Barrier the same way
as test_concurrent_bump_stat_totals, so resetter, snapshotter, and bumper
threads all wait and start together before entering their loops. Keep the change
localized to the thread setup and worker start logic so the existing resetter,
snapshotter, bumper, and _join_threads structure remains intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 01ad4a36-6c25-4e43-a607-c6f2903402ae
📒 Files selected for processing (1)
tests/test_thread_safety.py
…so reset/snapshot/bump threads start concurrently. - Extended _assert_health_snapshot_consistent with str/None checks for last_cycle_status, last_cycle_error, and last_successful_poll, plus a float check for probe_success_rate (using str for status, not CycleStatus, matching production). - Documented that same-thread attribute writes happen-before _publish_health_snapshot() reads; _health_lock only guards cross-thread snapshot access. - Added module docstring NOTE that FakePool is not thread-safe and these tests don't call pool methods concurrently.
Summary
tests/test_thread_safety.pywith OS-thread stress tests for lock-protected paths inISOProberandScheduler.ISOProber._stats_lockunder concurrent_bump_stat()/_reset_stats()/snapshot_stats()access.Scheduler._health_lockunder concurrent_publish_health_snapshot()(single writer) andhealth_snapshot()reads (multiple reader threads), matching the production event-loop writer / health-server reader model.Context
T13 (week 4).
ISOProber._statsandScheduler._health_snapshotare guarded bythreading.Lock, but previously had no contention tests. Existing coverage was either async-only (test_run_cycle_stats_integrity_under_concurrency) or used a HTTP stand-in (test_health_snapshot_consistent_under_concurrent_updates) rather than the realScheduler.No production code changes.
Tests added
test_concurrent_bump_stat_totalsmiss == 3200test_snapshot_stats_consistent_under_concurrent_resettest_health_snapshot_consistent_under_concurrent_publishprobe_success_rate == _compute_probe_success_rate(probe_stats)Test plan
uv run pytest tests/test_thread_safety.py -vuv run pytest tests/ --cov=paperscout --cov-fail-under=90Related issues
Summary by CodeRabbit