- Notifications
You must be signed in to change notification settings - Fork9k
HADOOP-19522: ABFS: [FnsOverBlob] Rename Recovery Should Succeed When Marker File Exists with Destination Directory#7559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
============================================================ |
} catch(AbfsRestOperationException e) { | ||
if (e.getStatusCode() == HttpURLConnection.HTTP_CONFLICT) { | ||
// If the destination path already exists, then delete the source path. | ||
getAbfsClient().deleteBlobPath(src, null, tracingContext); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Should we validate that destination has the same Etag as the source before assuming that if it already exists, it is safe to delete the source ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
During recovery, if a marker is present in the destination directory, the destination etag is not linked to the source directory's etag (as it may or may not be created with source to dest copy operation). Therefore, comparing the source etag and destination etag may return false. Since we are not losing any data here, so I think it is safe to follow this process here.
* Simulates an error when copying on the 6th call. | ||
*/ | ||
@Test | ||
public void testRenameCopyFailureInBetween() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
This function and the one below it, has a lot of common code, can be refactored into a common function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Code refactored
...zure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRenameRecovery.java OutdatedShow resolvedHide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Production code looks good.
Added some test suggestions and queries
...zure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRenameRecovery.java OutdatedShow resolvedHide resolved
...zure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRenameRecovery.java OutdatedShow resolvedHide resolved
...zure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRenameRecovery.java OutdatedShow resolvedHide resolved
...zure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRenameRecovery.java OutdatedShow resolvedHide resolved
...zure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRenameRecovery.java OutdatedShow resolvedHide resolved
...zure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRenameRecovery.javaShow resolvedHide resolved
hadoop-yetus commentedApr 17, 2025
🎊+1 overall
This message was automatically generated. |
hadoop-yetus commentedApr 17, 2025
🎊+1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Thanks for taking comments.
Added some final set of thoughts
...zure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRenameRecovery.javaShow resolvedHide resolved
...zure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRenameRecovery.java OutdatedShow resolvedHide resolved
...zure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRenameRecovery.java OutdatedShow resolvedHide resolved
...zure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemRenameRecovery.javaShow resolvedHide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
+1
LGTM
Thanks for taking suggestions
hadoop-yetus commentedApr 17, 2025
🎊+1 overall
This message was automatically generated. |
============================================================ |
… Marker File Exists with Destination Directory (apache#7559)Contributed by Manish BhattReviewed by Anmol Asrani, Manika Joshi, Anuj ModiSigned off by Anuj Modi<anujmodi@apache.org>
JIRA:https://issues.apache.org/jira/browse/HADOOP-19522
On the blob endpoint, since rename is not a direct operation but a combination of two operations—copy and delete—in the case of directory rename, we first rename all the blobs that have the source prefix and, at the end, rename the source to the destination.
In the normal rename flow, renaming is not allowed if the destination already exists. However, in the case of recovery, there is a possibility that some files have already been renamed from the source to the destination. With the recent change (HADOOP-19474 ABFS: [FnsOverBlob] Listing Optimizations to avoid multiple iteration over list response. - ASF JIRA), where we create a marker if the path is implicit, rename recovery will fail at the end when it tries to rename the source to the destination after renaming all the files.
To fix this, while renaming the source to the destination, if we encounter an error indicating that the path already exists, we will suppress the error and mark the rename recovery as successful.