Skip to content

Add optional allowlist to pickle serde deserializer#818

Open
elijahbenizzy wants to merge 2 commits into
mainfrom
improve/pickle-serde-allowlist
Open

Add optional allowlist to pickle serde deserializer#818
elijahbenizzy wants to merge 2 commits into
mainfrom
improve/pickle-serde-allowlist

Conversation

@elijahbenizzy

Copy link
Copy Markdown
Contributor

Summary

Adds an optional allowlist to the pickle deserializer used by Burr's serde system (burr/integrations/serde/pickle.py), mirroring the shape of #794's pydantic allowlist work.

Without an allowlist: backward-compatible — deserialize_pickle continues to call pickle.loads(...) and emits a one-time SecurityWarning per registration site so users see the noise.

With an allowlist: a _RestrictedUnpickler(pickle.Unpickler) subclass overrides find_class(module, name) to validate against the allowlist and raises pickle.UnpicklingError for anything not on it.

API

Three ways to set the allowlist, in resolution order (highest priority first):

  1. Per-call kwarg: deserialize_pickle(value, allowlist=[...])
  2. At registration time: register_type_to_pickle(cls, allowlist=[...])
  3. Process-wide: set_pickle_serde_allowlist([...])
  4. Otherwise: legacy unsafe path + SecurityWarning

Allowlist entries are (module, qualname) tuples — e.g. ("myapp.models", "User") — rather than prefix strings. Pickle attacks routinely reach for specific symbols within otherwise-trusted modules (e.g. builtins.eval, os.system, subprocess.Popen), so per-class granularity matters more than for pydantic where the registered name is already a class.

Trust model (now in the docstring)

Pickle deserialization is unsafe when the bytes come from a source the application doesn't fully control — including persistence backends that have a separate access model (SQLite files, Redis, S3, filesystems). An attacker with write access to the backend can plant a malicious __reduce__ payload that triggers code execution when state is loaded. The allowlist closes that primitive.

Tests

8 tests in tests/integrations/serde/test_pickle.py (was 1):

  • _is_allowed truth table for allowlist matching
  • Malicious __reduce__ payload (calling a sentinel function via __reduce__) is blocked when an allowlist is set; sentinel never runs
  • Legitimate object roundtrips when its (module, qualname) is on the allowlist
  • Module-level set_pickle_serde_allowlist applies to fresh registrations
  • Instance-level allowlist (at register_type_to_pickle) overrides module default
  • Per-call allowlist kwarg overrides both
  • Legacy path (no allowlist) still works and emits the SecurityWarning exactly once per registration site

Wider tests/integrations/serde/ + tests/core/test_serde.py runs 24/24 green.

Introduces a configurable allowlist of (module, qualname) pairs for the
pickle-based deserializer registered by register_type_to_pickle().  When
an allowlist is configured, deserialization uses a restricted unpickler
that refuses to import classes outside the allowlist and raises
pickle.UnpicklingError instead.  When no allowlist is configured,
behavior remains backward-compatible and a runtime warning is emitted
once per call site to encourage adoption.

The allowlist can be supplied (a) per-call at deserialize time, (b)
per-registration via register_type_to_pickle(cls, allowlist=...), or (c)
process-wide via set_pickle_serde_allowlist([...]).
@github-actions github-actions Bot added the area/integrations External integrations (LLMs, frameworks) label Jun 22, 2026

@andreahlert andreahlert left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this — solid direction (find_class is the right call, back-compat's clean). Just a few notes: only the first one's a blocker (the reg-time allowlist is fail-open), the rest are nits.

_warned_call_sites: set = set()


def register_type_to_pickle(cls, allowlist: Optional[Iterable[AllowlistEntry]] = None):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fail-open. key's static "pickle", so registering another class w/o a list later wipes this one's allowlist. tested it. lvl-2 should just go — per-call + global is enough (#794 already does it that way).

_global_allowlist: Optional[List[AllowlistEntry]] = None


def set_pickle_serde_allowlist(allowlist: Optional[Iterable[AllowlistEntry]]) -> None:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two setters + two SecurityWarnings now (this and pydantic's). worth merging into one burr.serde someday?

deserialization. If the persistence backend (SQLite file, Redis, S3, the
local filesystem, etc.) can be written to by an untrusted party, a tampered
payload can trigger remote code execution when burr restores application
state.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

restricting globals isn't a full fix (cpython docs say so) — softening this + mentioning signed payloads:

Suggested change
state.
state.
Note: an allowlist restricts *which* globals can be importednecessary
but not sufficient (see CPython "Restricting Globals"). For fully untrusted
backends, prefer not unpickling, or sign payloads (HMAC) as defense in depth.

assert _is_allowed("m", "C", []) is False


def test_malicious_reduce_payload_blocked_with_allowlist():

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only catches a top-level builtins.int. one where an allowed obj nests a blocked class would be a stronger test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/integrations External integrations (LLMs, frameworks)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants