Add optional allowlist to pickle serde deserializer #818
Introduces a configurable allowlist of (module, qualname) pairs for the pickle-based deserializer registered by register_type_to_pickle(). When an allowlist is configured, deserialization uses a restricted unpickler that refuses to import classes outside the allowlist and raises pickle.UnpicklingError instead. When no allowlist is configured, behavior remains backward-compatible and a runtime warning is emitted once per call site to encourage adoption. The allowlist can be supplied (a) per-call at deserialize time, (b) per-registration via register_type_to_pickle(cls, allowlist=...), or (c) process-wide via set_pickle_serde_allowlist([...]).
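The mechanism described above can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: `RestrictedUnpickler`, `resolve_allowlist`, and `restricted_loads` are hypothetical names, and the only real API relied on is `pickle.Unpickler.find_class`, which CPython calls for every global a pickle stream references.

```python
import io
import pickle
from typing import Iterable, Optional, Set, Tuple

AllowlistEntry = Tuple[str, str]  # (module, qualname)


class RestrictedUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that only resolves globals on an allowlist."""

    def __init__(self, file, allowlist: Iterable[AllowlistEntry]):
        super().__init__(file)
        self._allowed: Set[AllowlistEntry] = set(allowlist)

    def find_class(self, module: str, name: str):
        # Invoked for every global referenced by the pickle stream.
        if (module, name) not in self._allowed:
            raise pickle.UnpicklingError(f"{module}.{name} is not on the allowlist")
        return super().find_class(module, name)


def resolve_allowlist(
    per_call: Optional[Iterable[AllowlistEntry]] = None,
    per_registration: Optional[Iterable[AllowlistEntry]] = None,
    process_wide: Optional[Iterable[AllowlistEntry]] = None,
) -> Optional[Iterable[AllowlistEntry]]:
    """Return the first configured allowlist, highest priority first."""
    for candidate in (per_call, per_registration, process_wide):
        # `is not None` keeps an empty list meaningful: it denies everything.
        if candidate is not None:
            return candidate
    return None


def restricted_loads(data: bytes, allowlist: Iterable[AllowlistEntry]):
    """Deserialize `data`, refusing any global outside `allowlist`."""
    return RestrictedUnpickler(io.BytesIO(data), allowlist).load()
```

Under these assumptions, `restricted_loads(data, [("collections", "OrderedDict")])` would load a pickled `OrderedDict` of plain values, while the same call with `[]` would raise `pickle.UnpicklingError`.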
andreahlert left a comment
Thanks for this — solid direction (find_class is the right call, back-compat's clean). Just a few notes: only the first one's a blocker (the reg-time allowlist is fail-open), the rest are nits.
| _warned_call_sites: set = set() |
| def register_type_to_pickle(cls, allowlist: Optional[Iterable[AllowlistEntry]] = None): |
this is fail-open. key's static "pickle", so registering another class w/o a list later wipes this one's allowlist. tested it. lvl-2 should just go — per-call + global is enough (#794 already does it that way).
| _global_allowlist: Optional[List[AllowlistEntry]] = None |
| def set_pickle_serde_allowlist(allowlist: Optional[Iterable[AllowlistEntry]]) -> None: |
two setters + two SecurityWarnings now (this and pydantic's). worth merging into one burr.serde someday?
| deserialization. If the persistence backend (SQLite file, Redis, S3, the |
| local filesystem, etc.) can be written to by an untrusted party, a tampered | ||
| payload can trigger remote code execution when burr restores application | ||
| state. |
restricting globals isn't a full fix (cpython docs say so) — softening this + mentioning signed payloads:
| state. | |
| Note: an allowlist restricts *which* globals can be imported — necessary | |
| but not sufficient (see CPython "Restricting Globals"). For fully untrusted | |
| backends, prefer not unpickling, or sign payloads (HMAC) as defense in depth. |
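For the signed-payload idea in the suggestion above, a minimal sketch using the standard library's `hmac` module could look like this. The `sign` and `verify` helpers and the tag-prefix layout are assumptions made for illustration; nothing here is part of Burr's API, and key management is out of scope.

```python
import hashlib
import hmac
import pickle

_DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign(payload: bytes, key: bytes) -> bytes:
    """Prefix the payload with an HMAC-SHA256 tag computed over it."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def verify(blob: bytes, key: bytes) -> bytes:
    """Return the payload if the tag matches, otherwise raise ValueError."""
    tag, payload = blob[:_DIGEST_SIZE], blob[_DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return payload
```

A caller would write `sign(pickle.dumps(state), key)` to the backend and only call `pickle.loads(verify(blob, key))` on read, so a tampered blob is rejected before the unpickler ever sees it. This complements an allowlist rather than replacing it.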
| assert _is_allowed("m", "C", []) is False |
| def test_malicious_reduce_payload_blocked_with_allowlist(): |
only catches a top-level builtins.int. one where an allowed obj nests a blocked class would be a stronger test.
Summary
Adds an optional allowlist to the pickle deserializer used by Burr's serde system (`burr/integrations/serde/pickle.py`), mirroring the shape of #794's pydantic allowlist work.

- Without an allowlist: backward-compatible. `deserialize_pickle` continues to call `pickle.loads(...)` and emits a one-time `SecurityWarning` per registration site so users see the noise.
- With an allowlist: a `_RestrictedUnpickler(pickle.Unpickler)` subclass overrides `find_class(module, name)` to validate against the allowlist and raises `pickle.UnpicklingError` for anything not on it.

API
Three ways to set the allowlist, in resolution order (highest priority first):

1. `deserialize_pickle(value, allowlist=[...])`
2. `register_type_to_pickle(cls, allowlist=[...])`
3. `set_pickle_serde_allowlist([...])`

If none of these is set, deserialization takes the backward-compatible path and emits the `SecurityWarning`.

Allowlist entries are `(module, qualname)` tuples, e.g. `("myapp.models", "User")`, rather than prefix strings. Pickle attacks routinely reach for specific symbols within otherwise-trusted modules (e.g. `builtins.eval`, `os.system`, `subprocess.Popen`), so per-class granularity matters more than for pydantic, where the registered name is already a class.

Trust model (now in the docstring)
Pickle deserialization is unsafe when the bytes come from a source the application doesn't fully control, including persistence backends that have a separate access model (SQLite files, Redis, S3, filesystems). An attacker with write access to the backend can plant a malicious `__reduce__` payload that triggers code execution when state is loaded. The allowlist closes that primitive.

Tests
8 tests in `tests/integrations/serde/test_pickle.py` (was 1):

- `_is_allowed` truth table for allowlist matching
- `__reduce__` payload (calling a sentinel function via `__reduce__`) is blocked when an allowlist is set; sentinel never runs
- Payload loads when its `(module, qualname)` is on the allowlist
- `set_pickle_serde_allowlist` applies to fresh registrations
- Per-registration allowlist (`register_type_to_pickle`) overrides module default
- No allowlist: `SecurityWarning` exactly once per registration site

Wider `tests/integrations/serde/` + `tests/core/test_serde.py` runs 24/24 green.