
Data race between lock-free dict.get (_Py_dict_lookup_threadsafe) and dict.clear #151228

@Naserume

Bug report

Bug description:

dict.get reads dict entries through _Py_dict_lookup_threadsafe without taking the dict lock,

cpython/Objects/dictobject.c

Lines 1577 to 1579 in ce916dc

    value = _Py_TryXGetRef(&values->values[ix]);
    if (value == NULL)
        goto read_failed;

so dict.clear can overwrite and release those entries, using plain non-atomic stores, while the lookup is in progress.

cpython/Objects/dictobject.c

Lines 3035 to 3043 in ce916dc

static void
clear_embedded_values(PyDictValues *values, Py_ssize_t nentries)
{
    PyObject *refs[SHARED_KEYS_MAX_SIZE];
    assert(nentries <= SHARED_KEYS_MAX_SIZE);
    for (Py_ssize_t i = 0; i < nentries; i++) {
        refs[i] = values->values[i];
        values->values[i] = NULL;
    }

Reproducer:

from threading import Thread

class C:
    pass

obj = C()

def thread1():
    for _ in range(20000):
        obj.x = 1
        obj.y = 2
        obj.z = 3
        obj.__dict__.clear()

def thread2():
    for _ in range(20000):
        obj.__dict__.get('x')

threads = [Thread(target=thread1) for _ in range(1)]
threads += [Thread(target=thread2) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
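
For comparison, here is a minimal sketch of the same workload with every access to obj.__dict__ serialized by an application-level lock. The names run_guarded, guard, and seen are hypothetical and only for illustration. This is not a fix for the race in dictobject.c and has not been checked under TSAN. It only illustrates that the report depends on dict.get and dict.clear overlapping in time.

```python
from threading import Lock, Thread


class C:
    pass


def run_guarded(iterations=2000, readers=4):
    """Run the reproducer's workload with all dict access under one lock."""
    obj = C()
    guard = Lock()  # hypothetical application-level lock, not part of CPython
    seen = set()    # values returned by get('x'); only mutated while holding guard

    def writer():
        for _ in range(iterations):
            # Populate and clear inside one critical section, so readers
            # can never run while the entries are being cleared.
            with guard:
                obj.x = 1
                obj.y = 2
                obj.z = 3
                obj.__dict__.clear()

    def reader():
        for _ in range(iterations):
            with guard:
                seen.add(obj.__dict__.get('x'))

    threads = [Thread(target=writer)]
    threads += [Thread(target=reader) for _ in range(readers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

Because the writer sets and clears the attributes within a single critical section, readers only ever observe an empty dict, so run_guarded() returns {None}. The lock removes the concurrency the lock-free lookup path is meant to allow, so this is a diagnostic aid rather than a recommended workaround.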

TSAN Report:

==================
WARNING: ThreadSanitizer: data race (pid=1539989)
  Write of size 8 at 0x7efa805a19b8 by thread T1:
    #0 clear_embedded_values /cpython/Objects/dictobject.c:3042:27 
    #1 clear_lock_held /cpython/Objects/dictobject.c:3079:9 
    #2 PyDict_Clear /cpython/Objects/dictobject.c:3105:5 
    #3 dict_clear_impl /cpython/Objects/dictobject.c:4833:5 
    #4 dict_clear /cpython/Objects/clinic/dictobject.c.h:170:12 
    #5 _PyEval_EvalFrameDefault /cpython/Python/generated_cases.c.h:4142:35
...

  Previous atomic read of size 8 at 0x7efa805a19b8 by thread T2:
    #0 _Py_atomic_load_ptr /cpython/./Include/cpython/pyatomic_gcc.h:300:18 
    #1 _Py_TryXGetRef /cpython/./Include/internal/pycore_object.h:610:23 
    #2 _Py_dict_lookup_threadsafe /cpython/Objects/dictobject.c:1577:25 
    #3 dict_get_impl /cpython/Objects/dictobject.c:4667:10 
    #4 dict_get /cpython/Objects/clinic/dictobject.c.h:110:20
    #5 method_vectorcall_FASTCALL /cpython/Objects/descrobject.c:402:24 
...

SUMMARY: ThreadSanitizer: data race /cpython/Objects/dictobject.c:3042:27 in clear_embedded_values
==================

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux
