Adding __reduce__ method to QuadPrecision for supporting pickle #105

Open
SwayamInSync wants to merge 6 commits into numpy:main from SwayamInSync:fix-pickle-scalar-99

@SwayamInSync SwayamInSync commented May 19, 2026

Member

closes #99

Also sets exp_digits=4 at every Dragon4 call site to keep the codebase consistent (the largest float128 decimal exponent is 4932, which needs four digits).
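As a quick sanity check of the four-digit figure, the decimal exponent range of IEEE binary128 can be derived from its format parameters. This is a standalone sketch in plain Python, not code from the PR, and the constant names are illustrative.

```python
import math

# IEEE 754 binary128 parameters (illustrative names, not taken from the PR).
MANTISSA_BITS = 112                          # explicitly stored fraction bits
MAX_BINARY_EXP = 16383                       # binary exponent of the largest finite value
MIN_SUBNORMAL_EXP = -16382 - MANTISSA_BITS   # smallest subnormal is 2**-16494

# Largest finite value, (2 - 2**-112) * 2**16383, built as an exact integer.
max_value = (2 ** (MANTISSA_BITS + 1) - 1) * 2 ** (MAX_BINARY_EXP - MANTISSA_BITS)

# math.log10 accepts arbitrarily large ints, so nothing overflows here.
max_exp10 = math.floor(math.log10(max_value))
min_exp10 = math.floor(MIN_SUBNORMAL_EXP * math.log10(2))

# Number of decimal digits needed to print either exponent.
exp_digits = max(len(str(abs(max_exp10))), len(str(abs(min_exp10))))
print(max_exp10, min_exp10, exp_digits)
```

The largest exponent comes out as 4932 and the smallest subnormal exponent as -4966, so four exponent digits cover the whole range.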

@SwayamInSync SwayamInSync requested a review from ngoldbaum June 5, 2026 06:46

@ngoldbaum ngoldbaum left a comment

Member

I asked Claude to review this and it spotted some issues (unverified by me):

  • The parsing relies on Sleef_strtoq, which Claude identified as not being lossless for the decimal case: sleef/src/quad/sleefsimdqp.c:3504-3546. Only hex-encoded values are losslessly converted.
  • You chose to serialize to the shortest representable form, but that leaves no slack for rounding error.

Elsewhere in the codebase you use quad_to_string_same_value_check to guard against this possibility but didn't use it in the new __reduce__ codepath, so it's possible to write values to disk that can't be round-tripped.
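The idea behind a same-value guard can be sketched with ordinary binary64 floats: format, parse the string back, and only accept the string if it reproduces the original value. This is an analogy in plain Python, not the quad_to_string_same_value_check implementation; `to_string_checked` is a hypothetical helper, and the hex fallback is an assumption about what a safe last resort could look like.

```python
def to_string_checked(x: float) -> str:
    """Format x compactly, accepting only strings that parse back to x."""
    for precision in (15, 16, 17):
        text = format(x, f".{precision}g")
        if float(text) == x:      # same-value check: did the string round-trip?
            return text
    # Reached only when no decimal string compared equal (for example NaN):
    # fall back to the hex form, which encodes the bits exactly.
    return x.hex()


value = 0.1 + 0.2
print(to_string_checked(value))   # needs 17 significant digits to round-trip
print(to_string_checked(0.5))     # a short string is already exact
```

The check costs one extra parse per value, which is usually cheap compared with writing a value that cannot be read back.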

The new tests don't check places where the rounding error would be particularly bad: subnormals like 6.5e-4966 and values near the maximum representable value.

It also found a pre-existing issue in the dragon4 implementation: the hasUnequalMargins flag is computed after the implicit mantissa bit is OR'd in, so it's always false. It suggests this instead:

diff --git a/src/csrc/dragon4.c b/src/csrc/dragon4.c
index f86f269..405d751 100644
--- a/src/csrc/dragon4.c
+++ b/src/csrc/dragon4.c
@@ -1925,7 +1925,7 @@ Dragon4_PrintFloat_Sleef_quad(Sleef_quad *value, Dragon4_>
     /* factor the value into its parts */
     if (floatExponent != 0) {
         /* normal */
-        mantissa_hi = (1ull << 48) | mantissa_hi;
+        mantissa_hi = (1ull << 48) && mantissa_lo == 0;
         /* mantissa_lo is unchanged */
         exponent = floatExponent - 16383 - 112;
         mantissaBit = 112;

That's a pre-existing issue though and I haven't confirmed it.
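The ordering problem described above can be illustrated in isolation. The sketch below is Python rather than the C from dragon4.c, and it assumes the flag is defined as "exponent field is not 1 and the stored mantissa is zero", which is how Dragon4-style printers commonly detect a power of two; both function names are hypothetical.

```python
IMPLICIT_BIT = 1 << 48  # leading 1 of a normal binary128 mantissa, in the high word


def unequal_margins_after_or(float_exponent, mantissa_hi, mantissa_lo):
    """Ordering described as buggy: the flag is computed after the OR."""
    mantissa_hi = IMPLICIT_BIT | mantissa_hi
    # mantissa_hi now always has bit 48 set, so this can never be true.
    return float_exponent != 1 and mantissa_hi == 0 and mantissa_lo == 0


def unequal_margins_before_or(float_exponent, mantissa_hi, mantissa_lo):
    """Ordering that inspects the stored mantissa first."""
    flag = float_exponent != 1 and mantissa_hi == 0 and mantissa_lo == 0
    mantissa_hi = IMPLICIT_BIT | mantissa_hi  # the implicit bit is added afterwards
    return flag


# Exponent field 16383 with a zero mantissa encodes 1.0, a power of two.
print(unequal_margins_after_or(16383, 0, 0))   # False
print(unequal_margins_before_or(16383, 0, 0))  # True
```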

It also generated this script which finds values that don't round-trip using several different methods: https://gist.github.com/ngoldbaum/0bbe4ed5d2479a76885b3f3c2ed7233a#file-rt_hunt-py

@SwayamInSync

Member Author

The dragon4 round-trip issue turned out to be real (which means NumPy upstream may need the same fix).

It went unnoticed despite the existing round-trip test cases because none of them specifically targets exact powers of two. We only checked subnormals, the maximum value, and values in between.
This was a really good find.
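Powers of two are special because the gap to the next smaller float is half the gap to the next larger one, which is the "unequal margins" case a shortest-digits printer has to handle. The sketch below shows this with binary64 floats and `math.nextafter` (Python 3.9+); it is an analogy, not quad-precision code, and `margins` is a hypothetical helper.

```python
import math


def margins(x: float):
    """Return (distance to the next lower float, distance to the next higher float)."""
    lower = x - math.nextafter(x, -math.inf)
    upper = math.nextafter(x, math.inf) - x
    return lower, upper


lower, upper = margins(1.0)    # 1.0 is a power of two
print(upper == 2 * lower)      # True: the lower margin is half the upper one

lower, upper = margins(1.5)    # not a power of two
print(upper == lower)          # True: margins are symmetric
```

A round-trip test suite that skips exact powers of two never exercises the asymmetric case.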

@SwayamInSync

Member Author

Let me merge #104 since it is approved and update here

@SwayamInSync

Member Author

Hey @ngoldbaum, when you get some time, could you take another look at this? I have resolved nearly all of the review comments and fixed the bug that was discovered.

@ngoldbaum

Member

I think there's still one issue that isn't resolved from my last round of review:

Elsewhere in the codebase you use quad_to_string_same_value_check to guard against this possibility but didn't use it in the new __reduce__ codepath, so it's possible to write values to disk that can't be round-tripped.

IMO it's worth adding this since the other paths have the same check.

Other than that, LGTM.

@SwayamInSync

SwayamInSync commented Jun 8, 2026

Member Author

Oh yes, I missed that. I was focused on the cases where Sleef_strtoq fails and forgot to add the guard. Hitting the error is rare, but I was able to trigger it with a targeted fuzz.
Sleef_strtoq's broken decimal path is fixable, but the fix introduces a relatively slower path. The impact should be small, since that path only executes in rare cases.

So I am considering two options:

  1. Add the fix to the numpy-quaddtype codebase, in NumPyOS_ascii_strtoq.
  2. Maintain a fork of SLEEF for quaddtype. Upstream SLEEF rarely accepts updates: the fix for Sleef_strtoq setting endptr incorrectly has not landed yet, so we work around it here, along with the FMA path dispatch. A large number of custom meson patches were also needed in "BUILD: Add Pyodide CI and build recipes" (#66) to build for Pyodide.

@ngoldbaum

Member

I think avoiding maintaining a bunch of SLEEF patches is probably better long-term. Up to you though.

That said, another thing that should work is parsing hex-encoded strings. That should be exact and round-trippable. At least that is what Claude reported when I first reviewed your PR; I didn't double-check that claim.
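Why hex strings are exact can be seen with Python's built-in binary64 floats: the hex form spells out the mantissa bits and the binary exponent directly, so no decimal rounding is involved in either direction. This is only an analogy for the quad case and says nothing about how Sleef_strtoq itself behaves.

```python
import sys

edge_cases = [
    0.1,                   # not exactly representable in binary
    1.0,                   # exact power of two
    2.0 ** -1022,          # smallest normal binary64
    5e-324,                # smallest subnormal binary64
    sys.float_info.max,    # largest finite binary64
]

for value in edge_cases:
    text = value.hex()                 # e.g. '0x1.999999999999ap-4' for 0.1
    restored = float.fromhex(text)
    print(text, restored == value)

hex_round_trips = all(float.fromhex(v.hex()) == v for v in edge_cases)
```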

Also it occurs to me that at least for pickling and other binary serialization, you could also simply dump the 128-bit float as e.g. a char[16] buffer and then read that back in. You'd have to worry about endianness but I think that's it?
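A minimal sketch of the raw-buffer idea, using a Python int as a stand-in for the 128-bit pattern. The byte order is pinned to little-endian on both the write and the read side, which is one way to handle the endianness concern; `dump_le` and `load_le` are hypothetical helpers, not functions from the PR.

```python
# Bit pattern of 1.0 in IEEE binary128: sign 0, biased exponent 16383, fraction 0.
ONE_BITS = 16383 << 112


def dump_le(bits: int) -> bytes:
    """Serialize a 128-bit pattern as 16 bytes in a fixed little-endian order."""
    return bits.to_bytes(16, "little")


def load_le(buf: bytes) -> int:
    """Read back a pattern written by dump_le, rejecting wrong-sized buffers."""
    if len(buf) != 16:
        raise ValueError("expected exactly 16 bytes")
    return int.from_bytes(buf, "little")


buf = dump_le(ONE_BITS)
print(load_le(buf) == ONE_BITS)                  # True: exact round trip
print(int.from_bytes(buf, "big") == ONE_BITS)    # False: wrong byte order on read
```

Because no parsing or formatting happens, every bit pattern survives, including NaN payloads and subnormals.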

Of course round-tripping through strings is something you want for other reasons and it would be nice to fix that.

@SwayamInSync

SwayamInSync commented Jun 9, 2026

Member Author

Good call. NumPy also pickles raw bytes plus the dtype, so I should take the same path for the __reduce__ implementation (we don't currently support constructing a quad from bytes, so I will add that here as well). This should always round-trip exactly.
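The shape of a bytes-based `__reduce__` can be sketched with a stand-in class. `Quad128` and `_rebuild_quad` are hypothetical names that hold the value as a 128-bit integer pattern; the real QuadPrecision scalar is implemented in C and may differ in every detail.

```python
def _rebuild_quad(buf: bytes) -> "Quad128":
    """Reconstructor that pickle would call on load with the stored bytes."""
    return Quad128.from_bytes(buf)


class Quad128:
    """Hypothetical stand-in that stores a binary128 bit pattern as an int."""

    def __init__(self, bits: int):
        self.bits = bits

    @classmethod
    def from_bytes(cls, buf: bytes) -> "Quad128":
        if len(buf) != 16:
            raise ValueError("expected exactly 16 bytes")
        return cls(int.from_bytes(buf, "little"))

    def to_bytes(self) -> bytes:
        return self.bits.to_bytes(16, "little")

    def __eq__(self, other):
        return isinstance(other, Quad128) and self.bits == other.bits

    def __reduce__(self):
        # pickle records (callable, args) and calls callable(*args) on load.
        return (_rebuild_quad, (self.to_bytes(),))


original = Quad128(16383 << 112)        # bit pattern of 1.0
rebuild, args = original.__reduce__()   # what pickle.dumps would record
restored = rebuild(*args)               # what pickle.loads would execute
print(restored == original)
```

When the class lives in an importable module, `pickle.dumps` and `pickle.loads` perform these two steps automatically, and no string formatting or parsing is involved.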

Will start a new thread for the proper string round-tripping.

Development

Successfully merging this pull request may close these issues.

[BUG] Pickle of scalar QuadPrecision fails on loads
