HDDS-14661. Handle split table writes for Multipart Upload#10588
HDDS-14661. Handle split table writes for Multipart Upload#10588spacemonkd wants to merge 1 commit into
Conversation
|
@errose28 @ivandika3 @kerneltime could you please take a look at this PR? Having both of these ready to merge so we can do it in quick succession would be great! |
ivandika3
left a comment
There was a problem hiding this comment.
Thanks @spacemonkd for the patch. Overall looks good. Left some comments.
Let's get this in asap, this feature has been outstanding quite a while.
| /** | ||
| * Cache-aware scanner for multipart parts table rows. | ||
| */ | ||
| public final class MultipartPartScanUtil { |
There was a problem hiding this comment.
Nit: I'd suggest to move this to OMMultipartUploadUtils so we only need a single util class for MPU.
| try (TableIterator<OmMultipartPartKey, | ||
| ? extends Table.KeyValue<OmMultipartPartKey, OmMultipartPartInfo>> | ||
| iterator = omMetadataManager.getMultipartPartsTable().iterator(prefix)) { | ||
| while (iterator.hasNext()) { | ||
| Table.KeyValue<OmMultipartPartKey, OmMultipartPartInfo> kv = | ||
| iterator.next(); | ||
| if (kv == null) { | ||
| continue; | ||
| } | ||
| OmMultipartPartKey key = kv.getKey(); | ||
| if (!uploadId.equals(key.getUploadId())) { | ||
| break; | ||
| } | ||
| if (key.hasPartNumber()) { | ||
| parts.put(key.getPartNumber(), kv.getValue()); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| Iterator<Map.Entry<CacheKey<OmMultipartPartKey>, | ||
| CacheValue<OmMultipartPartInfo>>> cacheIterator = | ||
| omMetadataManager.getMultipartPartsTable().cacheIterator(); | ||
| while (cacheIterator.hasNext()) { | ||
| Map.Entry<CacheKey<OmMultipartPartKey>, CacheValue<OmMultipartPartInfo>> | ||
| cacheEntry = cacheIterator.next(); | ||
| OmMultipartPartKey key = cacheEntry.getKey().getCacheKey(); | ||
| if (!uploadId.equals(key.getUploadId()) || !key.hasPartNumber()) { | ||
| continue; | ||
| } | ||
| OmMultipartPartInfo value = cacheEntry.getValue().getCacheValue(); | ||
| if (value == null) { | ||
| parts.remove(key.getPartNumber()); | ||
| } else { | ||
| parts.put(key.getPartNumber(), value); | ||
| } | ||
| } | ||
|
|
||
| return parts; |
There was a problem hiding this comment.
Usually, in OmMetadataReader listKeys and getMultipartUploadKeys, we will iterate the cache first before the DB. Any reason here is reversed? Not saying it's wrong, just want to standardize.
| if (multipartKeyInfo.getSchemaVersion() | ||
| == OmMultipartKeyInfo.LEGACY_SCHEMA_VERSION | ||
| && null != oldPartKeyInfo) { |
There was a problem hiding this comment.
Nit: A lot of unnecessary line wraps here which AI agent usually does. We can try to fix unnecessary wraps introduced in this patch as much as possible (within reason).
| private OmMultipartPartKey getMultipartPartKey(String uploadId, | ||
| int partNumber) { | ||
| return OmMultipartPartKey.of(uploadId, partNumber); | ||
| } |
There was a problem hiding this comment.
Nit: Let's make this inline instead of introducing another method.
| private void validateSplitPartInfo(OmKeyInfo omKeyInfo, int partNumber) | ||
| throws OMException { | ||
| if (StringUtils.isBlank(omKeyInfo.getMetadata().get(OzoneConsts.ETAG))) { | ||
| throw new OMException("Missing ETag for multipart upload part " | ||
| + partNumber, OMException.ResultCodes.INVALID_REQUEST); | ||
| } | ||
| if (omKeyInfo.getKeyLocationVersions() == null | ||
| || omKeyInfo.getKeyLocationVersions().isEmpty() | ||
| || omKeyInfo.getLatestVersionLocations().getLocationList().isEmpty()) { | ||
| throw new OMException("Missing block locations for multipart upload part " | ||
| + partNumber, OMException.ResultCodes.INVALID_REQUEST); | ||
| } | ||
| } |
There was a problem hiding this comment.
Do we have this logic in the schema version 0? OM might allow uploading an empty part (although S3G might fail it early).
| if (eTagBasedValidationAvailable) { | ||
| requestPartId = part.getETag(); | ||
| omPartId = multipartPartInfo.getETag(); | ||
| } else { | ||
| requestPartId = part.getPartName(); | ||
| omPartId = multipartPartInfo.getPartName(); | ||
| } | ||
| requestPart = new MultipartCommitRequestPart( | ||
| requestPartId, omPartId, StringUtils.equals(requestPartId, omPartId)); |
There was a problem hiding this comment.
Are we going to call the ETag validation here?
| for (OmKeyInfo currentKeyPartInfo : | ||
| abortInfo.getPartsKeyInfoToDelete()) { | ||
| addPartToDeletedTable(omMetadataManager, batchOperation, | ||
| omBucketInfo, abortInfo, currentKeyPartInfo, | ||
| omMultipartKeyInfo.getUpdateID()); | ||
| } | ||
|
|
||
| // multi-part key format is volumeName/bucketName/keyName/uploadId | ||
| String deleteKey = omMetadataManager.getOzoneDeletePathKey( | ||
| currentKeyPartInfo.getObjectID(), abortInfo.getMultipartKey()); | ||
|
|
||
| omMetadataManager.getDeletedTable().putWithBatch(batchOperation, | ||
| deleteKey, repeatedOmKeyInfo); | ||
| for (OmMultipartPartKey partKey : | ||
| abortInfo.getPartsTableKeysToDelete()) { | ||
| omMetadataManager.getMultipartPartsTable().deleteWithBatch( | ||
| batchOperation, partKey); | ||
| } |
There was a problem hiding this comment.
Just a thought, but if it's possible we can use deleteRange in the future to reduce the tombstones. However, deleteRange need to be done carefully since it can invalidate valid DB entries.
| this.multipartPartKey = multipartPartKey; | ||
| this.openKey = openKey; | ||
| this.omMultipartKeyInfo = omMultipartKeyInfo; | ||
| this.omMultipartPartInfo = omMultipartPartInfo; |
There was a problem hiding this comment.
We might want to add a precondition here that if mutlipartPartKey and omMultipartPartInfo need to either be both null or both not null.
What changes were proposed in this pull request?
HDDS-14661. Handle split table writes for Multipart Upload
Please describe your PR in detail:
Implements the runtime support for schemaVersion 1 multipart uploads using the split
multipartPartsTable.This patch covers the MPU lifecycle after an upload is already marked as schemaVersion 1:
multipartPartsTableinstead of growingMultipartInfoTable.multipartPartsTable, assembles the final key, and deletes all consumed part rows.multipartPartsTable.This patch intentionally does not decide when new uploads become schemaVersion 1.
PR #10062 owns the upgrade/finalization and initiate-MPU schema selection. Once that branch starts creating schemaVersion 1 MPUs, this patch provides the runtime read/write/cleanup behavior needed to handle them end to end.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-14661
How was this patch tested?
Patch was tested using unit tests