feat: v1 semantic convention#717
Conversation
The design goals and transitioning goals for linopy's v1 arithmetic convention, under arithmetics-design/goals.md. The convention itself and the bug catalogue (meta issue #714) follow separately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Placeholder for the v1 convention document, to be written. Goals are in arithmetics-design/goals.md; the bug catalogue is the meta issue #714. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
40af9d6 to
1e336a4
Compare
Flesh out convention.md from the placeholder into the full spec — thirteen numbered sections in three groups: absence (§1–§7), coordinate alignment (§8–§11), and constraints and reductions (§12–§13). Covers the strict exact-match alignment model and the propagate-don't-fill NaN/absence convention. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The convention governs coordinate alignment, absence/NaN handling, constraints, and reductions — not just arithmetic operators — so retitle convention.md and goals.md to "The v1 convention". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduce linopy.options["semantics"] — legacy (default) or v1 — with LinopySemanticsWarning, a FutureWarning shown to users by default and exported at top level. Add the autouse `semantics` conftest fixture that runs every test under both conventions, plus legacy/v1 markers to pin a test to one. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`_align_constant` branches on `options["semantics"]`: v1 uses exact alignment via `xr.align(join="exact")`; legacy keeps the size-aware positional/left-join behaviour and emits `LinopySemanticsWarning` when v1 would diverge. `_add_constant`/`_apply_constant_op` raise on a NaN in a user-supplied constant under v1, warn under legacy. `Variable.__mul__(DataArray)` now routes through `to_linexpr() * other` so the LinearExpression checks fire; the scalar fast-path is preserved (a NaN scalar diverts to the expression path so v1 raises). Marks the bug-class test groups `TestCoordinateAlignment` (#708/#586/ #550), `TestConstraintCoordinateAlignment`, `TestNaNMasking`, `test_auto_mask_constraint_model`, and four piecewise NaN-padding tests as `@pytest.mark.legacy` — they assert the very behaviour v1 forbids. v1 coverage of those bug classes accretes via later slices. `test/test_legacy_violations.py` (new) adds 22 paired tests covering §5/§8/§9 plus the PyPSA #1683 `0*inf=NaN` case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`merge` now pre-validates that all operands agree on the labels of every shared *user* dimension before concatenating. Helper dims (`_term`, `_factor`) and the concat dim itself are excluded — those legitimately vary between operands. v1 raises on mismatch; legacy keeps current size-based override/outer behaviour and emits `LinopySemanticsWarning` when v1 would diverge. The check uses a new `_merge_shared_user_coords_differ` helper. The existing override/outer decision is unchanged for the actual `xr.concat` call — the new check only gates whether legacy/v1 accept the merge, never how the concat itself runs. Adds 8 paired tests for var+var, var-var, expr+expr, broadcast guard, and warning emission on the merge path. Reclassifies as `@pytest.mark.legacy`: `test_non_aligned_variables` (deliberately disjoint coords), `test_linear_expression_sum` / `test_linear_expression_sum_with_const` (assert `v.loc[:9]+v.loc[10:]` merges), `TestJoinParameter` cases that build `a*b` from mismatched- coord vars, and two SOS2 reformulation tests. File-level legacy mark on `test_piecewise_constraints.py` + `test_piecewise_feasibility.py` until `linopy/piecewise.py` itself is made v1-aware (tracked as Slice P). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Variable.to_linexpr() now produces a LinearExpression whose absent slots (labels == -1) carry NaN coeffs and NaN const under v1, so downstream arithmetic has something to propagate. The expression constant operators (_add_constant, _apply_constant_op) no longer fillna(0) self.const / self.coeffs under v1 — NaN flows through. `merge` sums const along _term with skipna=False under v1, so a slot that's absent in any operand stays absent in the result. Legacy paths keep the silent-fill behaviour verbatim. LinearExpression.isnull() now returns `const.isnull()` under v1: a slot is absent iff its const is NaN. ``vars == -1`` is a dead-term signal (the slot can still be a present constant after fillna), not a slot-level absence marker. Legacy keeps the historical ``(vars == -1).all() & const.isnull()`` formula for byte-for-byte compatibility. Variable.fillna(numeric) now returns a LinearExpression (a constant isn't a variable). Variable.fillna(Variable) stays Variable, as before. Adds 11 tests for §6 propagation (mul/add/sub/div preserve absence, absent-vs-zero distinguishable, present + absent propagates) and §7 resolution (fillna numeric on expr / Variable, present-zero revival). Reclassifies test_masked_variable_model as @pytest.mark.legacy — its assertion "x bound to 10 at masked-y slots" only holds because legacy collapses absent y to 0. The v1 way is x + y.fillna(0) >= 10; a counterpart test in test_legacy_violations.py pins this. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The convention spec names ``reindex`` and ``reindex_like`` among the absence-creating mechanisms (alongside ``mask=``, ``.where()``, ``.shift()``, and ``.unstack()``), but master only had them on ``LinearExpression``. Add them on ``Variable``, with the sentinel fill values (``labels=-1``, ``lower=upper=NaN``) so new positions slot cleanly into §6 propagation. The methods work the same way under both semantics — under legacy the sentinels exist but downstream arithmetic still collapses them back to 0 (the #712 bug), so the user-visible effect of reindex-as- absence only really lands under v1. Adds 5 tests: extend with absent, subset drops, reindex_like with another Variable, and the §4 + §6 hand-off (a reindex-introduced absent flows through ``* 3`` and is visible via ``isnull()``). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Slice C propagated NaN const cleanly but left the storage half-absent after a merge: `(1*x) + xs` at the absent slot kept the `1*x` term's valid coefficient and label even though `const` was NaN there. The §1/§2 promise "absence is one concept, whatever the dtype" only holds if `const.isnull()` at a slot ⇒ every term at that slot has `coeffs = NaN`, `vars = -1`. Add `_absorb_absence(ds)` and call it at the end of `merge` under v1. The constant-operand paths (`_add_constant`, `_apply_constant_op`) don't need explicit absorption — their NaN-propagation naturally preserves the invariant when the input is already v1-compliant (NaN * anything = NaN; dead terms stay dead). Only `merge` opens the gap by concatenating one operand's live term with another operand's absent slot along `_term`. `convention.md` §2 now states the invariant explicitly and introduces the *dead term* terminology, so `fillna(value)` reviving a slot while leaving the sentinel term in place reads as a feature, not a glitch. Adds `test_outer_fillna_then_add_collapses_to_just_added` pinning `(x + y.shift()).fillna(0) + x` — at the previously-absent slot the result has exactly one live term (`1·x[0]`) with `const = 0`, algebraically equal to `x[0]`. At present slots all three terms stay live (`2·x[i] + y[i-1]`), so fillna placement is load-bearing — moving it inside (`x + y.shift().fillna(0) + x`) would double-count `x` at the absent slot. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`.add/.sub/.mul/.div/.le/.ge/.eq` already accepted a `join=` argument; this slice's job is just §12's RHS handling under v1. `to_constraint` branches on `options["semantics"]`. Under v1 it skips the legacy `reindex_like(self.const, fill_value=NaN)` step that silently padded a subset RHS, so a coord mismatch with the LHS now flows through `self.sub(rhs)` and gets caught by §8's exact alignment. A NaN in a user-supplied constant RHS raises at construction (§5) — including the PyPSA #1683 case of `min_pu * nominal_fix` with `p_nom=inf` and `p_min_pu=0`. An absent slot in the LHS (propagated from §6) still produces a NaN RHS at that row; downstream auto-mask drops the constraint there, which is exactly §12's "absent slot yields no row." Legacy keeps the old auto-mask path verbatim and adds a `LinopySemanticsWarning` whenever a NaN RHS is observed, so users get the rollout signal without behaviour change. Adds 11 paired tests: TestNamedMethodJoin (inner/outer/left across .add/.mul/.le, plus a "bare op still raises" guard) and TestConstraintRHS (subset RHS raises, NaN RHS raises, PyPSA #1683 on the constraint side, §6→§12 hand-off where the absent LHS slot yields NaN RHS, plus the paired legacy auto-mask documentation and warning-emission tests). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Coming next, in order:
Then a final pass on docs (user-facing migration / rollout — deferred so far). |
Three internal patterns were violating §8 / §11:
1. ``_add_incremental`` in ``linopy/piecewise.py`` builds
``delta_hi <= delta_lo`` from two ``.isel(piece_dim=slice)`` slices
of the same variable. ``drop=True`` is a no-op for slice indexers
so ``piece_dim`` stays on both with *different* labels (first n-1
vs last n-1 of piece_index) — v1 §8 rejects. Relabel the high
slice onto the low slice's labels so the comparison aligns by
label (the explicit-positional path of §10). Same fix for
``binary_hi <= delta_lo``.
2. ``_incremental_weighted`` computes ``bp0 = bp.isel({dim: 0})``
without ``drop=True``, leaving the breakpoint dim as a scalar
coord on the resulting expression. When that expression appears
as the RHS of ``links.eq_expr == ...`` it conflicts with the LHS,
which has no such coord — §11 aux-coord conflict. Add ``drop=True``.
3. ``reformulate_sos2`` builds its first/last constraints from
scalar isels at different positions on ``sos_dim`` (``x``/``M`` at
``n-1`` paired with ``z`` at ``n-2``, etc.). All without
``drop=True``, so the scalar ``sos_dim`` coord differs across
operands — §11 aux-coord conflict. Add ``drop=True`` to all three
sites.
Removes the module-level ``pytestmark = pytest.mark.legacy`` from
``test_piecewise_constraints.py`` and ``test_piecewise_feasibility.py``
and the method-level marks from the two SOS2 multidim tests. Suite is
+598 tests under v1 vs Slice E (legacy → v1 broadened coverage),
0 failures under either semantics.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
§13 falls out of xarray's ``skipna=True`` default; no code changes needed. Adds 4 tests so future drift is caught: sum over a dim, sum without a dim, sum of all-absent (the zero expression), and groupby.sum across heterogeneously-present groups. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `_conflicting_aux_coord(datasets)` and wires it into both `merge` and `_align_constant`. When two operands carry an aux coord of the same name with disagreeing values, v1 raises with a pointer to the explicit resolutions (``.drop_vars(...)`` or ``.assign_coords(...)``). xarray silently drops the conflict — the #295 bug — and legacy keeps that behaviour but now emits a `LinopySemanticsWarning`. The helper guards against string-dtype coord values (no `equal_nan=True` there) so the multiindex case keeps working. `_merge_shared_user_coords_differ` refactored to compare bare ``d.indexes[k]`` instead of ``d.coords[k]``: aux coords no longer leak into the §8 check, so §11 owns aux-coord conflicts cleanly and §8 owns dim-coord mismatches with a separate message. Convention §11 expanded from one paragraph: aux coords are validated and propagated but never computed with — they describe the data, they don't enter the math. Goal #4 in `goals.md` picks this up: user-attached auxiliary coordinates are the user's, linopy never silently rewrites them. `test_linear_expression.py::test_merge` adds ``drop=True`` to its ``.sel`` setup — the test was leaving a leftover scalar coord that v1 now correctly catches as a §11 conflict; the fix preserves the test's intent of exercising merge with differing term counts. Conflict-raising tests (TestAuxCoordConflict) cover expr+const, var+var, scalar-isel-without-drop, the ``drop=True`` escape hatch, plus the paired legacy left-wins documentation and warning-emission tests. Propagation guarantees land in a follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Regression coverage on the half of §11 that wasn't tested before: non-conflicting aux coords carry through every binary operator and into constraints. xarray already preserves them; the tests guard against future drift (e.g. a reduction or helper accidentally dropping a non-dim coord). TestAuxCoordPropagation covers ``3*v``, ``v+5`` (single-operand, fast paths), ``v+v`` with matching aux (the merge path), ``v<=10`` (the constraint path), ``x*a`` / ``x+a`` / ``x/a`` / ``x<=a`` where only the constant DataArray carries the coord (the ``_align_constant`` path), and the var+var case where only one side has the coord. Together: every operator times every "one side / both sides" arrangement, since only conflicts on both sides raise. Runs under both semantics — the legacy behaviour matches the v1 behaviour for the non-conflict cases. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… solve Fills the convention-coverage gaps surfaced by review of the branch: - §1/§2 dead-term storage invariant: pin that after a merge with an absent slot, coeffs=NaN AND vars=-1, not just const=NaN. The existing propagation tests read through isnull() which only checks const, so a regression in _absorb_absence would have passed them. Multi-operand variant catches binary-only-absorption regressions. - §12 equality: mirror the existing <=/>= TestConstraintRHS coverage for ==. Subset RHS raises, NaN RHS raises, absence in LHS drops the row. - §11 extra operators: add mul-constant and == constraint cases to the existing TestAuxCoordConflict. The class already covered +-constant and var+var; these extend coverage to the other call-site shapes. - §13 scope note: mean/resample/coarsen aren't yet on LinearExpression (tracked in #703); the spec text is the rule those will follow when implemented. Docstring note in TestReductionsSkipAbsent makes this explicit so the gap doesn't read as missing coverage. - End-to-end v1 solve: test_masked_variable_model_v1_drops_constraint pins the v1 outcome at the solver layer — con0 masked at absent slots (solver-independent) and x bound to 0 where the constraint still binds. _v1_fillna_binds confirms the §7 escape hatch recovers the legacy outcome. Catches the regression where v1 silently produces wrong solutions instead of raising. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pulls the seven v1-specific helpers and the user-NaN message out of ``expressions.py`` and into a dedicated ``linopy/semantics.py`` module — a single home for "what v1 means" that imports cleanly from ``config`` and ``constants`` only. Adds a tiny ``is_v1()`` predicate so the 16 scattered ``options["semantics"] == V1_SEMANTICS`` checks collapse to a one-line call. Helpers (renamed to drop the leading underscore now that they're a real module API): ``check_user_nan_scalar``, ``check_user_nan_array``, ``dim_coords_differ`` (was ``_shared_coords_differ`` — clearer name, matches ``merge_shared_user_coords_differ``), ``merge_shared_user_coords_differ``, ``conflicting_aux_coord``, ``absorb_absence``, plus ``is_v1``. No behaviour change — same checks, same warnings, same raises. The diff is mechanical: imports flipped, two local ``is_v1 = options[...]`` bindings replaced by the imported predicate, one missed ``_USER_NAN_MESSAGE`` reference in ``to_constraint`` routed through ``check_user_nan_array`` for consistency. ``expressions.py`` shrinks by ~105 lines. Future v1-only API surface (e.g. exposing ``is_v1()`` as ``linopy.is_v1()`` for downstream code) and the eventual legacy removal at 1.0 both reduce to deletions of ``semantics.py`` and its import sites. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three test clusters in ``test_legacy_violations.py`` had near-identical
``test_add_X``, ``test_mul_X``, ``test_div_X`` triples that varied only
by which binary operator they exercised. Collapse each into a single
``@pytest.mark.parametrize("op", ...)`` test:
- TestExactAlignmentConstant: same-size-different-labels and
subset-constant raises, parameterized over add/sub/mul/div.
- TestUserNaNRaises: NaN-DataArray raises over add/sub/mul/div, NaN
scalar over add/sub/mul (div scalar shares the same ``_apply_constant_op``
code path as mul, but ``x / nan`` trips ``__div__``'s unary-negate
TypeError before our check fires; the dispatch needs a separate
fix that's not worth pulling into this refactor).
- TestAbsencePropagation: ``shifted OP scalar`` preserves absence,
parameterized over add/sub/mul/div. Adds a per-op present-slot
value check so the parameterization broadens rather than narrows
the assertion.
Adds a module-level ``_OPS`` dict mapping name → ``operator``
callable so the parameter is the readable name (``"add"``,
``"div"``) while the test still calls the actual operator.
Cuts ~50 lines off ``test_legacy_violations.py`` and makes adding a
new operator a one-line change. Test IDs become e.g.
``test_same_size_different_labels_raises[v1-add]`` — slightly less
self-describing than the explicit-method names but cheap to read.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both methods had v1 and legacy logic interleaved via a ``fillna0`` closure that was identity under v1 and ``da.fillna(0)`` under legacy. Pull them apart into: - ``_add_constant`` / ``_apply_constant_op`` — two-line dispatchers. - ``*_v1`` — v1's implementation, reads as a single coherent story. - ``*_legacy`` — legacy's implementation, ``# LEGACY: remove at 1.0`` marker on each. At 1.0 the removal is mechanical: delete the ``_legacy`` methods and inline the ``_v1`` body into the dispatcher (or rename it back to the public name). Future readers don't have to mentally subtract the legacy branches to understand what v1 does. Add ``LEGACY: remove at 1.0`` marker comments at the other mixed sites in ``expressions.py`` so ``grep`` finds every place that needs touching: ``_align_constant``'s size-aware default fallback, ``to_constraint``'s auto-mask fallthrough, ``LinearExpression.isnull``'s historical AND, and the two warn-on-divergence sites in ``merge``. New ``arithmetics-design/legacy-removal.md`` is the master checklist for the 1.0 cut: every file, function, test, doc edit, and the safe order to do them in. The intent is that the eventual legacy removal takes an afternoon, not a week of grep-archaeology. No behaviour change — same checks, same warns, same raises. Suite is 7282 passed, 0 failures under both semantics. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two distinct CI failures both rooted in the v1 harness commit: 1. **Test collection crash on every linopy/*.py module.** ``test/conftest.py`` imported ``linopy.config`` at module top, which loaded linopy from site-packages before pytest's ``--doctest-modules`` collection walked the source tree. The resulting __file__ mismatch broke all 22 module collections. ``pyproject.toml`` already documents this exact failure mode in the ``filterwarnings`` block. Fix: keep the constant *values* (``"legacy"`` / ``"v1"``) inline in conftest as ``_LEGACY_SEMANTICS`` etc. so the parametrize decorator doesn't force an import, and defer the ``LinopySemanticsWarning`` / ``options`` import into the fixture body. The original import comment in pyproject is now mirrored at the top of conftest. 2. **mypy: 72 "no-untyped-def" errors in test_legacy_violations.py.** The new tests were missing parameter type annotations on the fixture-injected params (``x``, ``xs``, ``op``, ``unsilenced``, ``subset``, ``A``, ``da_aux_B``, ...). ``disallow_untyped_defs`` is set globally, so test files need them too. Filled in the types (``Variable``, ``str``, ``None``, ``xr.DataArray``, ``pd.Index``), added an ``isinstance(result, LinearExpression)`` narrowing in ``test_variable_fillna_zero_revives_slot_as_present_zero`` so mypy can pick the right branch of ``fillna``'s return union. Local: 7282 passed, 0 failures under both semantics; ``mypy .`` Success. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three v1 raises were under-informative — naming the rule violated but not the operand, dim, or values involved. Make each message carry the information the helper already has: - **§5 user-NaN**: the old message conflated the two intents the user might have had — *data error* (fix with ``.fillna(value)``) vs *intended absence* (mark on the variable with ``mask=`` / ``.where`` / ``.reindex`` / ``.shift``). The new message separates them and points each to its own remedy. - **§8 merge mismatch**: rename ``merge_shared_user_coords_differ`` (bool) to ``merge_shared_user_coord_mismatch`` (tuple ``(dim, left, right) | None``). Raise text now includes the offending dim name and both sides' labels (truncated), plus the full set of resolution paths from §10: ``.sel`` / ``.reindex`` / ``.assign_coords`` / ``linopy.align`` / ``join=`` on ``.add`` / ``.sub`` / ``.mul`` / ``.div`` / ``.le`` / ``.ge`` / ``.eq``. - **§11 aux-coord conflict**: ``conflicting_aux_coord`` returns ``(name, left_vals, right_vals) | None``. Raise text includes the coord name, both value snippets, and all three resolution paths (``.drop_vars`` / ``.assign_coords`` / ``isel(drop=True)`` — ``.assign_coords`` was previously omitted). The text is now centralized in ``semantics.py`` so the two raise sites in ``expressions.py`` (``_align_constant`` and ``merge``) share one voice instead of paraphrasing each other. New ``TestErrorMessageContent`` pins the rich content in three tests — that the §5 message names both intents, that the §8 message names the dim and both label lists, and that the §11 message names the coord, both value lists, and lists all three §11 fixes (the ``.assign_coords`` omission would have slipped through ``match= "Auxiliary coordinate"`` substrings). Section references (``§5``, ``§8``, ``§11``) deliberately omitted from user-visible text — spec jargon, not a navigation aid for downstream callers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the small-but-real holes in the §1–§13 coverage map. New tests
mostly, plus one code fix that the test surfaced.
§4 — absence creation
- test_where_creates_absence: §4 names ``.where(cond)`` but only
``mask=`` / ``.reindex`` were tested.
- test_unstack_creates_absence_at_missing_combinations: the
non-rectangular MultiIndex case (``stack`` preserves, ``unstack``
fills) is the asymmetry that earns its own test. Hit a real bug
on the way — ``Variable.unstack`` was producing float NaN in the
integer ``labels`` field instead of the ``FILL_VALUE`` sentinel
(-1), violating §2. Fixed by passing ``fill_value=_fill_value``
to the underlying ``Dataset.unstack`` (same pattern as ``shift``).
Audited the rest of the varwrap calls — only ``shift`` and
``unstack`` introduce new positions; the others either preserve
shape (``assign_*``, ``rename``, ``swap_dims``, ``set_index``,
``roll``, ``stack``), select existing positions (``sel`` /
``isel`` / ``drop_*``), or broadcast existing data without fill
(``broadcast_like``, ``expand_dims``).
- test_data_preserving_methods_do_not_create_absence: parameterized
over ``.roll`` / ``.sel`` / ``.isel``, regression-guards §4's
explicit contrast against the creators.
§10 — named-method join= argument
- test_add_join_override_aligns_positionally: positional-mode is the
surprising one in the join= set; pin it explicitly.
- test_reindex_like_resolves_mismatch_before_bare_op and
test_assign_coords_resolves_mismatch_before_bare_op: §10 names
these as the canonical user fixes; pin that the post-fix bare
operator actually accepts the once-mismatched operand.
§11 — auxiliary-coordinate conflicts
- test_assign_coords_resolves_conflict: §11 lists three escape
hatches; only ``.drop_vars`` / ``isel(drop=True)`` were tested.
- test_multi_operand_merge_aux_conflict_raises: the merge-path
check inspects all operands; a 3-way ``v + w + u`` with the
third disagreeing exercises that.
§12 — constraints follow the same rules
- Parameterize the existing subset / NaN / absence-propagation
tests in ``TestConstraintRHS`` over the three signs (``le`` /
``ge`` / ``eq``) via a new module-level ``_SIGNS`` dispatch.
Folds the previous ``<=`` and ``==`` duplicates together and
fills in ``>=`` for each rule (which was the explicit gap).
The PyPSA #1683 test stays separate — it's tied to ``>=`` by
the real-world case it documents.
Suite: 7303 passed, 515 skipped, 0 failures under both semantics.
``mypy .`` clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ts (#742) * refactor: move conversion/broadcasting/alignment code to linopy/alignment.py Pure move, no behavior change. The new module owns the seam between user input and linopy's labelled arrays: - coords parsing: _coords_to_dict, _as_index, _as_multiindex - conversion: get_from_iterable, pandas_to_dataarray, numpy_to_dataarray, _named_pandas_to_dataarray, fill_missing_coords, as_dataarray - MultiIndex projection: _LevelProjection, _project_onto_multiindex_levels, _warn_implicit_projections - broadcasting: _broadcast_to_coords, broadcast_to_coords - validation: validate_alignment - symmetric alignment: align common.py keeps the general utilities (formatting, label indexes, polars helpers, decorators). Importers (model, expressions, variables, __init__, tests) updated; no re-exports. Follow-up requested by @FabianHofmann in #732. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test: restructure alignment tests into test_alignment.py One class per concept in linopy.alignment, mirroring the module's public surface: - TestAsDataarrayFrom{Pandas,Numpy,Scalar,DataArray} + MultiIndexCoords - TestCoordsToDict (the coords-entry naming rules) - TestBroadcastToCoords (strict=False mechanics) - TestMultiIndexProjection (projection values, deprecation warnings, coverage gaps — the legacy/v1 fork point for #717) - TestStrictMode (strict=True contract) - TestValidateAlignment - TestAlign Shared fixtures (mi_index / mi_coords / by_level1) replace the repeated MultiIndex setup; the pandas dims-naming and numpy labeling tests are consolidated into parametrized tables. test_common.py keeps the utility tests. Full suite count unchanged (3202) — no coverage lost. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test: close coverage gaps in alignment.py and the MI coords serialization alignment.py: 97% -> 99%. New edge-case tests: bare-string dims, 0-d arrays, fill_missing_coords type check, partially-named MI levels, gap detection with extra dims, gap-error truncation (>5 missing combinations). The two remaining uncovered lines are defensive branches for inputs outside the DimsLike contract (non-iterable dims). common.py: 88% -> 90%. The MultiIndex round-trip through coords_to_dataset_vars / coords_from_dataset (used by CSRConstraint) had zero coverage; now pinned. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * Update test/test_alignment.py Co-authored-by: Fabian Hofmann <fab.hof@gmx.de> * Update test/test_alignment.py Co-authored-by: Fabian Hofmann <fab.hof@gmx.de> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: Fabian Hofmann <fab.hof@gmx.de>
Brings in the coords-as-truth stack (#732), the alignment.py module split (#742), has_terms (#743), and the MatrixAccessor caching (#716). Conflict resolutions (consistent rule: keep this branch's v1/legacy dispatch structure, use master's conversion calls inside it): - expressions.py: _add_constant, _apply_constant_op_{v1,legacy}, and to_constraint use broadcast_to_coords(..., strict=False) instead of as_dataarray; SUPPORTED_CONSTANT_TYPES -> CONSTANT_TYPES. - variables.py: to_linexpr converts via broadcast_to_coords(strict=False), then applies the v1/legacy absence handling. - __init__.py: align from linopy.alignment + LinopySemanticsWarning export. Test adaptations for post-#732 APIs and semantics: - test_legacy_violations: name the MultiIndex coords entry (required since #732). - test_linear_expression: the masked-addend tails of test_nterm and test_variable_names pin legacy absence behavior; split into @pytest.mark.legacy / @pytest.mark.v1 pairs (section 6 divergence). Full suite under both semantics: 6446 passed, 514 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…cenario B v1) Closes the two integration points between the alignment layer (#732/#742) and the v1 semantics infrastructure: - _warn_implicit_projections -> _enforce_implicit_projections: under legacy semantics, the implicit MultiIndex-level projection deprecation now goes through warn_legacy() / LinopySemanticsWarning (#738, replacing EvolvingAPIWarning, which stays piecewise-only); under the v1 convention it raises ValueError (sections 8 and 11) — the projection must be written explicitly (scenario B of the #732/#737 discussion). - as_expression no longer swallows the underlying conversion error: "Cannot convert to LinearExpression: <original message>" so the v1 guidance reaches the user. Tests: the MI-projection deprecation tests in test_alignment, test_variable, test_constraint, and test_linear_expression are marked @pytest.mark.legacy and assert LinopySemanticsWarning; each gains a @pytest.mark.v1 counterpart asserting the v1 raise. Full suite under both semantics: 6446 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Note Status update (Claude Code, prompted by @FBumann) — the branch is now current with master and the alignment layer is integrated with the semantics system. Pushed in two commits:
The conflicts were small because this branch never touched
Verification: full suite under both semantics — 6446 passed, 532 skipped, mypy + pre-commit clean. Remaining for this PR (as far as we can tell): the docs/migration-guide follow-up you listed as deferred, and your review of the merge resolutions in |
The convention was silent on inputs indexed by levels of a stacked MultiIndex dimension — the question resolved as scenario B in the #732/#737 discussion. Now written into section 11: - level coords are auxiliary coordinates, so a level-named operand dim is a section-11 conflict: it raises, with the explicit .sel() projection as the documented recipe; - a full reconstruction of the MultiIndex is not a conflict (same coordinate spelled differently, aligns under section 8); - legacy projects implicitly and warns; the projection is removed at 1.0 (added to legacy-removal.md). Also: the #736 TODO no longer claims #732 is unmerged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Claude Code: One scope question surfaced while writing the §11 stacked-MultiIndex spec: whether v1 should carry first-class MultiIndex support at all, or store MI input as flat dims + auxiliary level coords (which §11 already governs, with none of the projection machinery). Filed as #744 with the complexity inventory, a verified flat+aux demo against this branch, and three options. It's your call — but the decision window is before this PR merges, so flagging it now rather than after. |
…ts operand The master merge updated _add_constant_legacy to broadcast_to_coords(strict=False) (it was in the conflict region) but _add_constant_v1 was added by this branch in a region master never touched, so it kept calling as_dataarray — which since #737 is convert-only and no longer broadcasts. A v1 addition of an array operand needing dim expansion failed with "conflicting sizes". Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Closes #736. Stacked on #717 (`feat/arithmetic-convention`). > **Draft:** review-ready and design-final, but the diff is against #717, so it can't merge to master until #717 does. (Not gated by the #744 MultiIndex decision — this PR's size-pairing never touches MultiIndex level coords; its `HELPER_DIMS` usage is `_term` / `_factor`, orthogonal to how MultiIndexes are stored.) Implements the last unimplemented rule of the v1 convention's coordinate-alignment intro: **unlabeled operands pair with the linopy operand's dimensions by size.** ## The rule numpy arrays, lists, and polars Series carry no dimension labels, so their axes adopt the operand's dims **by size** — the same rule for arithmetic operands *and* for `add_variables` / `add_constraints` bounds and masks (no positional carve-out for construction): ```python x = m.add_variables(coords=[a4, time5]) # dims (a: 4, time: 5) (1 * x) + np.arange(5) # length-5 → pairs with `time` (1 * x) @ np.arange(5) # matmul contracts `time`, result keeps `a` ``` Ambiguity raises (v1), pointing at the explicit fix: ```python y = m.add_variables(coords=[p4, q4]) # both dims size 4 (1 * y) + np.arange(4) # ValueError: ... could pair with any of ['p', 'q'] — sizes alone cannot decide. # Wrap the array in an xarray.DataArray with explicit dims. ``` | Case | v1 | legacy | |---|---|---| | unique size match | pair by size | pair positionally (agrees → silent) | | size matches a non-leading dim | pair by size | pair positionally, **warn** (v1 differs) | | ambiguous (shared size / square) | **raise** | pair positionally, **warn** | | no size match | **raise** | pair positionally, **warn** | The rule is uniform: `add_variables(coords=[a:4, time:5], lower=np.arange(5))` now resolves `lower` to `time` (positional pairing errored on `a`); a square/ambiguous bare-numpy bound raises asking for a `DataArray` wrap, instead of silently guessing. ## Implementation (`linopy/alignment.py`) - `_pair_axes_by_size` + `_dims_for_unlabeled_operand` — the size-pairing with the legacy/v1 fork. - **`as_constant`** — normalizes degenerate operands *on entry*: a Python `list` → numpy array (lists have no numeric operators), a 0-d array → Python scalar (takes the scalar fast-path, never pairs). `ConstantLike` stays numeric-only. - `_broadcast_to_coords` gains `unlabeled_pairing="semantic"` for the arithmetic path; explicit-coords callers (`add_variables`) stay positional. - **Two conversion fixes** the seam exposed: `as_dataarray`'s scalar branch and the positional fallback now exclude `HELPER_DIMS`, so a scalar never broadcasts over `_term` / `_factor`. (`HELPER_DIMS` was already the global registry; `_group` is transient and correctly excluded.) ## Why `dims=self.coord_dims` was dropped from the operator calls The arithmetic operators (`expressions.py` `_add_constant`, `_apply_constant_op`, `to_constraint`; `variables.py` `to_linexpr`; both `__matmul__`) previously passed `dims=self.coord_dims` into `broadcast_to_coords`. That argument **pinned an unlabeled array's axes to those dims by position** — which is exactly the behavior #736 replaces. While it's passed, size-pairing can never run (an explicit `dims` means "the caller named the axes"). So every arithmetic call site drops it. This is safe because `coords=self.coords` already carries the same dim information, and `dims` only ever affected *unlabeled* inputs: - **DataArray / named-pandas operands** ignore `dims` entirely (they carry their own labels) — unchanged. - **Unlabeled operands** (numpy / list / polars) now reach `_dims_for_unlabeled_operand` and pair by size — the point of the PR. `dims=self.coord_dims` did do one extra thing worth calling out: it excluded helper dims (`_term`, `_factor`) from the positional labeling. Dropping it surfaced three latent `HELPER_DIMS` leaks (a scalar gaining `_term`, etc.), now fixed in the conversion layer (`as_dataarray`'s scalar branch and the positional fallback both exclude `HELPER_DIMS`). `as_constant` on entry and `UNLABELED_TYPES` as the single dispatch source complete the picture; Variable operators inherit it all via delegation. ## Tests - `TestUnlabeledPairing` (test_legacy_violations) — parametrized over numpy / list / polars, with `@pytest.mark.legacy` / `@pytest.mark.v1` pairs covering pair-by-size, order-independence, ambiguity, no-match, the DataArray escape hatch, and matmul. - The two pre-existing unlabeled-rhs constraint tests forked legacy/v1. - `benchmark_model` example named its rhs axis (it used an ambiguous square-dim rhs — now demonstrates the convention). ## Verification Full suite under both semantics: **6458 passed, 548 skipped**. mypy + pre-commit clean. convention.md §736 TODO resolved; legacy-removal.md updated. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…vention # Conflicts: # linopy/expressions.py
|
Note Generated with Claude Code. v1 convention — what's left. The arithmetic rules are all specified and implemented (#736 was the last). Remaining: one decision, the rollout, the 1.0 cleanup.
§13 reductions: Test cleanup (follow-up):
|
* fix(alignment): align reordered shared-dim coords by label in merge (#550) §8 aligns by label, not position, so the same labels in a different order are the same coordinate. The constant path already reindexed a pure reorder, but the expression-merge path raised under v1 and silently misaligned under legacy. merge() now conforms shared user dims to the first operand's order before the §8/§11 checks, so a reorder reindexes (correct under both semantics) while a differing label set still raises. Aux coords ride along the reindex, so §11 conflicts are preserved. Spec: convention.md §8 now states order-independence explicitly and is retitled "Shared dimensions must carry the same labels" (was framed as xarray's order-sensitive `exact`). Supersedes #550. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * perf(merge): fold reorder-conform and §8 mismatch detection into one pass The reorder fix added a second walk over the shared user dims (one to reindex reordered coords, one to detect a genuine mismatch), duplicating the per-dim .equals() work on every join=None merge — the hot path during model building. conform_merge_dims does both in a single pass and replaces merge_shared_user_coord_mismatch + the separate conform helper. Behaviour is unchanged (full suite 6476 passed under both semantics). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(merge): pin reorder behaviour on multi-operand, quadratic, and MultiIndex paths Regression guards for the §8 by-label alignment beyond the 2-operand case: multi-operand merge([a,b,c]) pairs by label, quadratic merge aligns reordered dims, and a reordered stacked MultiIndex raises (xarray cannot reindex it by tuple — left to §11; tied to the #744 MultiIndex-storage decision). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(merge): align reordered stacked MultiIndex by tuple (resolve §8/§11 gap) A reordered full MultiIndex is "the same coordinate spelled differently" and per §8/§11 should align — but reindex cannot reorder a stacked MI by tuple, so it previously fell through to a confusing §11 aux-coord raise. conform_merge_dims now permutes via positional isel using get_indexer, which works uniformly for a plain index and a MultiIndex's tuples (and get_indexer doubles as the same-set test, replacing the set() comparison — cheaper). A genuinely different label set still raises the §8 mismatch. convention.md §11 now states order-independence for the full-MI case explicitly. Only the MultiIndex case was affected; a plain dim with aux coords already reordered correctly (the aux coord rides along the permute). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(merge): raise dim mismatch before the aux-coord check A shared-dim label mismatch is the root cause, so report it as a dim conflict rather than letting the §11 aux check fire first — a different stacked MultiIndex otherwise surfaced as its level coords conflicting (the wrong message). Aux conflicts still raise once the dims agree. Adds a routing test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(merge): assert via public .indexes, not .coeffs.coords The reorder tests reached through the internal term storage (.coeffs.coords) for coordinates that the expression exposes directly via .indexes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(merge): make reorder-align v1-only; legacy keeps positional + warns Per the transitioning contract, legacy must not change: reordering coords on a shared dim was always positional in expression merges (the constant path, by contrast, has always aligned labelled operands by label — that asymmetry is genuine legacy and is what v1 unifies). conform_merge_dims now permutes only under v1; under legacy it leaves the operands positional and the caller warns with a reorder-specific message (v1 would align by label, a different result). Tests: the align cases are now @pytest.mark.v1; added legacy guards for the positional result and the full warning text. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
TestReductionsSkipAbsent only asserted the v1 side (sum/groupby.sum skip absent slots). Add the matching @pytest.mark.legacy tests so the silent result divergence is pinned on both sides: legacy fills the absent slot with 0, so the extra term is counted (sum → 25, group 0 → 10) where v1 skips it (sum → 20, group 0 → 5). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extends the §13 work to the other classes where legacy silently computes a different result than v1 (not just raises/warns): - TestAbsencePropagation: legacy fills absent slots with 0 / keeps live terms (to_linexpr, scalar ops, var+var & multi-operand merge, quadratic). - TestFillnaResolves: fillna is a no-op under legacy (slot already 0); outer fillna-then-add double-counts. - TestVariableReindex: reindex-introduced absence collapses to 0. - TestNamedMethodJoin: .le(join="inner") keeps all coords (NaN at gaps). - TestConstraintRHS: absent LHS keeps RHS, constraint not dropped. Verified the all-absent sum is *equal* under both modes (sum-of-none vs sum-of-zeros both → 0), so it carries no legacy twin. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
21 tests were marked @pytest.mark.v1 but assert behaviour that is identical under v1 and legacy. Verified against master (6a5d748) that legacy reproduces master for every one — no accidental behaviour change. Removing the marker lets the autouse `semantics` fixture exercise them under both modes. Covers TestObjectScope (arithmetic identities), TestExactAlignmentMerge (var+var merge already aligns by label on master; only the constant path diverges and stays v1-marked), TestNamedMethodJoin (explicit join=), TestAuxCoordConflict escape hatches, TestFillnaResolves shared cases, and the dataarray-wrapping / fillna-binds tests (dropping their _v1 suffixes). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Correction to the previous unmark commit. test_var_plus_var_reordered_ labels_align and test_quadratic_merge_reordered_aligns only asserted .indexes, which is identical under both semantics — so they looked like equal-behaviour tests, but assert_linequal/assert_quadequal show the *variable pairing* genuinely diverges: v1 pairs by label, legacy pairs positionally (preserving master's #550 behaviour). Re-mark both v1, strengthen them to assert the by-label pairing (so they actually verify the #550 fix, not just the index order), and add legacy twins pinning the positional pairing. Found by re-verifying the unmark set with linopy.testing's strict structural comparison instead of trusting each test's own assertions. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The three fillna(0)-on-masked tests produce an identical solver model under both semantics, but differ in dead _term padding (v1 leaves the resolved slot's sentinel term with coeff NaN; legacy with 1.0), which assert_linequal/assert_conequal treat as unequal. Adopting strict structural equality as the "equal behaviour" bar, mark them v1: - test_variable_fillna_zero_revives_slot_as_present_zero - test_masked_variable_constraint_via_fillna - test_masked_variable_model_v1_fillna_binds (restored _v1 suffix) Restores their v1-specific docstrings (they document the §7 resolve mechanism, not shared behaviour). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A per-op test that under-asserts (e.g. checks only .indexes) can pass under both semantics while the result silently diverges — that is how the reordered-merge mispairing slipped through. Add a parametrized guard that builds each mode-invariant operation under BOTH semantics and compares with linopy.testing's strict structural helpers (assert_linequal / assert_quadequal / assert_conequal). A regression that makes one of these paths semantics-dependent now fails loudly. Verified it catches the reordered-merge divergence as a negative control. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Note Generated with Claude Code. Quick triage from the PyPSA-Eur side. CI red ( On the remaining checklist:
Test cleanup. The legacy-side pinning point matters most: divergent results (not just raises/warns) are the silent-failure class. Reductions over absent is the obvious one; worth grepping for other §13 / §8 sites where both code paths run clean but disagree. Will rebase the PyPSA-Eur warmstart branch on top once #744 is resolved so we can give the convention real-model exposure before rc. |
…vention # Conflicts: # linopy/alignment.py # test/test_constraint.py
Merging this PR will degrade performance by 14.36%
Warning Please fix the performance issues or acknowledge them on CodSpeed. Performance Changes
Tip Investigate this regression by commenting Comparing Footnotes
|
|
Performance regressions are expected. But we might want to check if we regress even after we strip legacy. |
…pers After merging master, the Constraint.rhs setter accepts a DataArray, so the `# type: ignore` on the MI-level rhs tests is unused (mypy --warn-unused-ignores). Also cast the legacy-violation `_op_*` helpers to their documented runtime type, which the paired assert_linequal/assert_quadequal already verify, so `mypy .` is clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The strict v1 semantic convention for linopy — predictable coordinate alignment and NaN handling.
This is the master PR for the new semantic convention in linopy. It starts with our
Design & transitioning goals, which is carried out in ourNew Semantics spec. Both files are tracked in this branch. WHat you read is the current state.Both might change until this PR is merged
Scope
The convention ships behind
linopy.options["semantics"]—v1opt-in,legacythe default. This PR carries the design, spec, tests and implementation; documentation notebooks follow separately.Testing
All tests in linopy will be executed for both semantics.
Differing behaviour will be tested using
pytest.markers.This will increase ci time temporarily until v1 is released.
Defered to a follow up PR
Docs: Documentation, Migration guide, Release notes