Fix issue with double-opening temp corpus files#5342
Conversation
4881b96 to
d16eca9
Compare
| storage.download_signed_url_to_file(corpus.backup_url, | ||
| temp_zipfile.name) | ||
| storage.download_signed_url_to_file(corpus.backup_url, temp_zipfile) | ||
| with archive.open(temp_zipfile.name) as reader: |
There was a problem hiding this comment.
Not sure, but since temp_zipfile is still held open, wouldn't this line try to reopen it by name and potentially face the same error?
There was a problem hiding this comment.
No because I have changed storage.download_signed_url_to_file() to take a file object and simply write to it.
There was a problem hiding this comment.
I see, but I'm wondering about the very next line: with archive.open(temp_zipfile.name). Won't this try to open the file by name again?
There was a problem hiding this comment.
Ah sorry, I read through your comment in my email, missed which line you were talking about and completely misunderstood!
You are right that trying to read the zip would throw a similar error so I have refactored that bit as well. There wasn't a super clean way to fix that but I'm hoping my solution is the least error-prone way to handle it.
d16eca9 to
eb06917
Compare
It is fine for Linux and Mac bots to create multiple file objects to the same on-disk file but Windows does not allow this and throws a PermissionError. This slightly refactors storage.py to allow passing a filepath or file object when downloading from signed URLs.
eb06917 to
c82c657
Compare
PauloVLB
left a comment
There was a problem hiding this comment.
LGTM
This will be deployed in today’s release, so let’s keep track of the results.
It is fine for Linux and Mac bots to create multiple file objects to the same on-disk file but Windows does not allow this and throws a PermissionError for any attempts to open a file that is already opened.
This slightly refactors
src/clusterfuzz/_internal/google_cloud_utils/storage.py(and its callers) to allow passing a filepath or file object when downloading from signed URLs and changessrc/clusterfuzz/_internal/fuzzing/corpus_manager.pyto use the file-based method.Bug: crbug.com/529346593