Fix socket/fd leak in streaming endpoints (#2766) #3412
Pushed an update: added |
The streaming generators returned by APIClient (logs, events, attach, stats and raw exec streams) read from a raw socket but never closed the underlying response if the consumer broke out of the loop early, raised, or dropped the generator. Because _get_raw_response_socket keeps a strong reference to the response on the socket, the socket/fd was leaked, producing 'ResourceWarning: unclosed <socket.socket>' and the fd exhaustion reported in docker#2766.
Wrap the yield loops of _multiplexed_response_stream_helper and _stream_raw_result, and the generator returned by _read_from_socket (stream=True), in try/finally so the response is closed on early break, exception, .close(), or garbage collection. Python raises GeneratorExit into a suspended generator on close/GC, so finally is the correct hook.
Add transport-agnostic regression tests.
Closes docker#2766
Signed-off-by: ykstorm <balveer767@gmail.com>
Document that streaming iterators now close their socket/fd on early break, exception, .close(), or GC.
Add benchmarks/stream_leak.py, a daemon-free reproduction of docker#2766: opening many streams and stopping each after one chunk leaks every socket on the pre-fix generators and none on the current code.
Signed-off-by: ykstorm <balveer767@gmail.com>
7533234 to
3e5cfca
Scoped this PR down to just the leak fix. I removed the optional stream-collector that was bundled in earlier — it added a background thread and a new public API that turn a surgical fix into a design discussion, and the |
Fixes #2766.
The bug
container.logs(stream=True), events(), attach(), stats() and the raw exec streams
all hand back a generator that reads from a long-lived socket. If you stopped
reading before the stream ended (broke out of the loop, hit an exception, or
just dropped the generator), the underlying response was never closed, so the
socket and its fd leaked. _get_raw_response_socket() even pins the response onto
the socket (sock._response = response), so the garbage collector can't always
clean it up either. _read_from_socket(stream=True) was explicit about it in a
comment: "the caller is responsible for closing the response."
In a long-running process that tails logs and bails out early, the fds pile up
until you hit the open-file limit. That is where the
'ResourceWarning: unclosed <socket.socket>' reports on that issue come from.
The fix
Wrap the yield loop of each streaming generator in
try/finally: response.close(). When you stop iterating, Python raises
GeneratorExit into the suspended generator (on .close() and on GC), so the
finally is the right place to release the socket. Socket setup stays lazy (it
still happens on the first read, exactly as before), so nothing about the call
semantics changes.
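As a rough illustration of the mechanism (not the actual client code), the sketch below contrasts a generator without cleanup against one wrapped in try/finally. FakeResponse, leaky_stream, closing_stream and read_one_then_stop are hypothetical names invented for this example; the real generators read from a socket rather than a list.

```python
class FakeResponse:
    """Hypothetical stand-in for the HTTP response; it only tracks close()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def leaky_stream(response, chunks):
    # Pre-fix shape: no cleanup runs if the consumer stops early.
    for chunk in chunks:
        yield chunk


def closing_stream(response, chunks):
    # Post-fix shape: the finally block runs when the loop is exhausted,
    # when an exception propagates through the generator, and when
    # GeneratorExit is raised at the suspended yield by .close() or GC.
    try:
        for chunk in chunks:
            yield chunk
    finally:
        response.close()


def read_one_then_stop(stream_factory):
    response = FakeResponse()
    gen = stream_factory(response, [b"a", b"b", b"c"])
    for _ in gen:
        break  # stop after the first chunk
    # Closed explicitly here so the result does not depend on GC timing.
    gen.close()
    return response.closed


print(read_one_then_stop(leaky_stream))    # False
print(read_one_then_stop(closing_stream))  # True
```

Note that break alone does not close a generator; the close happens when .close() is called or the generator object is collected, which is why the finally block is where the release has to live.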
Touched:
_stream_raw_result, _multiplexed_response_stream_helper, _read_from_socket in docker/api/client.py.
Showing it actually leaks (and stops leaking)
benchmarks/stream_leak.py runs without a daemon: it serves an endless chunked
response locally, opens N streams, reads one chunk from each and stops early,
then counts how many sockets are still open (client-side, cross-checked with
psutil). Same generator with and without the try/finally:
Tests / docs
tests/unit/stream_leak_test.py (no requests internals): early break, exception, and explicit .close() all
close the response.
ruff is clean and the unit suite passes locally. I don't have a daemon on
this machine, so the integration suite is down to CI.
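For a sense of what the three cases look like, here is a sketch of such tests. It is not the contents of tests/unit/stream_leak_test.py: stream_chunks, make_response, iter_chunks and StreamLeakSketch are hypothetical names, and the response is a plain unittest.mock.Mock rather than anything from requests.

```python
import gc
import unittest
from unittest import mock


def stream_chunks(response):
    # Hypothetical stand-in for one of the patched generators.
    try:
        for chunk in response.iter_chunks():
            yield chunk
    finally:
        response.close()


def make_response():
    # iter_chunks is an invented method name used only by this sketch.
    response = mock.Mock()
    response.iter_chunks.return_value = iter([b"one", b"two", b"three"])
    return response


class StreamLeakSketch(unittest.TestCase):
    def test_early_break_closes_response(self):
        response = make_response()
        gen = stream_chunks(response)
        for _ in gen:
            break
        del gen       # drop the last reference to the generator
        gc.collect()  # do not rely on refcounting alone
        response.close.assert_called_once_with()

    def test_consumer_exception_closes_response(self):
        response = make_response()
        gen = stream_chunks(response)
        with self.assertRaises(RuntimeError):
            for _ in gen:
                raise RuntimeError("consumer failed")
        del gen
        gc.collect()
        response.close.assert_called_once_with()

    def test_explicit_close_closes_response(self):
        response = make_response()
        gen = stream_chunks(response)
        next(gen)
        gen.close()
        response.close.assert_called_once_with()
```

Saved to a file, these run under python -m unittest or pytest without a daemon or network access.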
No new dependencies. Default behaviour is unchanged apart from streams now being
closed when you stop reading them.