fix(sampling): migrate budget forcing to mot.generation.usage; drop _meta["usage"] writes #1288

Merged
ajbozarth merged 1 commit into generative-computing:main from ajbozarth:fix/1218-backends-unify-mot-generation on Jun 17, 2026

@ajbozarth

Contributor

Pull Request

Issue

Fixes #1218

Description

  • Migrates think_budget_forcing to read mot.generation.usage exclusively. Raises a named RuntimeError when usage is None, replacing the previous opaque TypeError from arithmetic on None (Ollama's eval_count can be None).
  • Drops _meta["usage"] writes from all five backends' generate_from_raw paths now that mot.generation.usage is the unified field. The additive half (writing the field from raw paths) landed with refactor(backends): move generate_from_raw hook firing into Backend base class #1264.
  • Replaces an implicit IndexError in the no-sub-calls edge case (think_max_tokens=0 + answer_suffix=None) with a named RuntimeError.
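The difference between the old opaque failures and the new named errors can be sketched as follows. This is an illustration only, not the code from this PR: the helper names `add_tokens_unguarded`, `add_tokens_guarded`, and `last_generation` are hypothetical, and `usage` is assumed here to be a dict with a `completion_tokens` key, which may not match the real shape of `mot.generation.usage`.

```python
from typing import Optional


def add_tokens_unguarded(total: int, completion_tokens: Optional[int]) -> int:
    # Old behaviour in spirit: arithmetic on None raises an opaque TypeError.
    return total + completion_tokens


def add_tokens_guarded(total: int, usage: Optional[dict]) -> int:
    # New behaviour in spirit: a missing usage is reported by name.
    # The "completion_tokens" key is an assumption made for this sketch.
    if usage is None:
        raise RuntimeError(
            "backend returned no token usage; budget forcing needs token counts"
        )
    return total + usage["completion_tokens"]


def last_generation(generations: list):
    # Stands in for a bare generations[-1], which raises IndexError when
    # no sub-calls ran (think_max_tokens=0 and answer_suffix=None).
    if not generations:
        raise RuntimeError(
            "no generations produced (think_max_tokens=0 and answer_suffix=None)"
        )
    return generations[-1]
```

With these helpers, `add_tokens_unguarded(3, None)` fails with a `TypeError` that says nothing about usage, while `add_tokens_guarded(3, None)` and `last_generation([])` both fail with a `RuntimeError` that names the cause.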

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation when the rest of the PR is populated)

Attribution

  • AI coding assistants used

Adding a new component, requirement, sampling strategy, or tool?

If your PR adds or modifies one of the types below, check the matching box. A checklist of type-specific review items will be posted as a comment.

  • Component
  • Requirement
  • Sampling Strategy
  • Tool

NOTE: Please ensure you have an issue that has been acknowledged by a core contributor and routed you to open a pull request against this repository. Otherwise, please open an issue before continuing with this pull request.

…meta["usage"] writes

Migrate the only in-tree consumer (`think_budget_forcing`) off
`mot._meta["usage"]` and onto `mot.generation.usage`, then drop the
now-redundant `_meta["usage"]` writes from all five backends'
`generate_from_raw` paths. The additive half (writing
`mot.generation.usage` from raw paths) landed with generative-computing#1264.

Budget forcing now raises a named `RuntimeError` when a sub-call
returns `mot.generation.usage = None` instead of letting a
confusing `TypeError` bubble out of arithmetic on `None`. Also
replaces an implicit `IndexError` in the
`think_max_tokens=0`/`answer_suffix=None` no-sub-calls edge case
with a named error.

Closes generative-computing#1218

Assisted-by: Claude Code
Signed-off-by: Alex Bozarth <ajbozart@us.ibm.com>
@ajbozarth ajbozarth requested a review from a team as a code owner June 17, 2026 21:26
@github-actions github-actions Bot added the bug Something isn't working label Jun 17, 2026
@ajbozarth ajbozarth self-assigned this Jun 17, 2026
@ajbozarth ajbozarth requested a review from avinash2692 June 17, 2026 21:34
@akihikokuroda

Member

Comment by Claude:

Optional but Recommended: Add Error Case Tests

The PR introduces two new RuntimeError paths that should ideally have test coverage:

  1. Missing token counts — When generation.usage is None
  2. No generations produced — When think_max_tokens=0 and answer_suffix=None

These are edge cases that might benefit from unit tests in test/stdlib/sampling/test_think_budget_forcing.py, but they're currently not critical since the budget forcing tests are integration/e2e level (marked with @pytest.mark.e2e, @pytest.mark.qualitative).

Recommendation

The PR can merge without test changes, but to be thorough, you could add a unit test like:
@pytest.mark.unit
def test_think_budget_forcing_no_usage_raises():
    """Verify RuntimeError when backend returns None usage."""
    # Would need to mock a backend that returns None for generation.usage

This would ensure the new error paths are exercised, but it's not blocking — existing tests will validate the happy path works correctly.

@ajbozarth

Contributor Author

> These are edge cases that might benefit from unit tests in test/stdlib/sampling/test_think_budget_forcing.py, but they're currently not critical since the budget forcing tests are integration/e2e level (marked with @pytest.mark.e2e, @pytest.mark.qualitative).

This did come up while I was writing the change, but I decided against it: there are currently no unit tests for budget forcing, so I would be introducing them. If unit tests feel like a blocker, I can add them.

@ajbozarth ajbozarth added this pull request to the merge queue Jun 17, 2026
Merged via the queue into generative-computing:main with commit 9b3a3c2 Jun 17, 2026
10 checks passed
@ajbozarth ajbozarth deleted the fix/1218-backends-unify-mot-generation branch June 17, 2026 23:07

Labels

bug Something isn't working

Development

Successfully merging this pull request may close these issues.

fix(backends): unify raw-path token usage on mot.generation.usage; guard eval_count=None

2 participants