HDDS-15492. Support OM follower read for gRPC client #10591

Open: echonesis wants to merge 2 commits into apache:master from echonesis:HDDS-15492

@echonesis echonesis commented Jun 23, 2026

Contributor

What changes were proposed in this pull request?

This PR adds OM follower read support for the gRPC OM transport.

Before this change, OM follower read was available only through the Hadoop RPC transport path; gRPC clients still used the leader-oriented path for every OM request. This PR extends GrpcOmTransport so that follower-read-eligible OM requests can be sent to follower OM nodes when follower read is enabled.

The change includes:

  • Applying the configured default read consistency hint for gRPC OM requests.
  • Routing follower-read eligible read requests to non-leader OM nodes.
  • Falling back to the leader path when follower read cannot be served by a follower.
  • Reusing OM failover exception handling for follower read failures.
  • Configuring per-OM gRPC ports in the HA mini cluster.
  • Adding unit and integration coverage for gRPC follower read behavior.
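The routing and fallback bullets above can be pictured with a small sketch. This is not the GrpcOmTransport code from the PR: the class `FollowerReadRouter`, the exception `ReadNotServedException`, and the `sender` callback are all hypothetical names, and the sketch assumes that a follower signals "cannot serve this read" by throwing. It follows the description as written, trying non-leader nodes first and using the leader as the fallback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Hypothetical sketch; not the GrpcOmTransport code from this PR. */
final class FollowerReadRouter {

  /** Hypothetical signal that a node could not serve the read. */
  static final class ReadNotServedException extends RuntimeException {
    ReadNotServedException(String message) {
      super(message);
    }
  }

  private final List<String> omNodeIds;
  private final String leaderNodeId;
  private final boolean followerReadEnabled;

  FollowerReadRouter(List<String> omNodeIds, String leaderNodeId,
      boolean followerReadEnabled) {
    this.omNodeIds = new ArrayList<>(omNodeIds);
    this.leaderNodeId = leaderNodeId;
    this.followerReadEnabled = followerReadEnabled;
  }

  /**
   * Sends an eligible read to the first follower that can serve it.
   * Writes, and reads that no follower can serve, go to the leader.
   * The sender callback (nodeId, request) -> response is an assumption.
   */
  String submit(String request, boolean readOnly,
      BiFunction<String, String, String> sender) {
    if (followerReadEnabled && readOnly) {
      for (String nodeId : omNodeIds) {
        if (nodeId.equals(leaderNodeId)) {
          continue; // the leader is kept as the fallback target
        }
        try {
          return sender.apply(nodeId, request);
        } catch (ReadNotServedException e) {
          // This follower cannot serve the read; try the next one.
        }
      }
    }
    return sender.apply(leaderNodeId, request);
  }
}
```

With nodes `om1` (leader), `om2`, and `om3`, a read that `om2` rejects is served by `om3`, while a write goes straight to `om1`.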

Generated-by: OpenAI Codex

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-15492

How was this patch tested?

Tested with:

mvn clean test -pl :ozone-common,:ozone-integration-test -am \
  -Dtest=TestS3GrpcOmTransport,TestOzoneManagerHAWithAllRunning,TestOzoneManagerHAWithStoppedNodes,TestOzoneManagerHAFollowerReadWithAllRunning,TestOzoneManagerHAFollowerReadWithStoppedNodes,TestOzoneManagerPrepare \
  -DskipShade -DskipRecon -DskipDocs

GitHub Actions CI: https://github.com/echonesis/ozone/actions/runs/28155889076

@ivandika3 ivandika3 left a comment

Contributor

Thanks @echonesis, I only took a brief look and left one comment about the test. I will review more later.

@echonesis echonesis requested a review from ivandika3 June 25, 2026 09:40

@ivandika3 ivandika3 left a comment

Contributor

Thanks @echonesis, overall LGTM. I just left a few comments.

It would be good if you could test this on your cluster after this is merged.

Comment on lines +224 to +227
if (isCurrentLeaderNode(nodeId)) {
  changeFollowerReadProxy(nodeId);
  continue;
}

Contributor

Currently, the Hadoop RPC follower read path does not have this logic, because we want to allow reads on the leader. For parity, let's remove it here and add it to both transports together if we later want to introduce a follower-only strategy.

inetAddress.getHostName())
.run(() -> resp.set(clients.get(host.get())
.submitRequest(payload)));
resp.set(submitRequestToHost(payload, host.get()));

Contributor

An AtomicReference is set twice: once here and once inside submitRequestToHost. Can we use the return value of submitRequestToHost directly?
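A minimal sketch of what that suggestion could look like. The names `DirectReturnSketch`, `submitToHost`, and `submit` are hypothetical, and the sketch assumes the helper needs its own holder only because it runs the call inside a lambda; the real submitRequestToHost may be structured differently.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Hypothetical sketch; names do not come from the PR. */
final class DirectReturnSketch {

  /** Stand-in for a helper that must run the call inside a lambda. */
  static String submitToHost(String payload, UnaryOperator<String> client) {
    // The holder lives only here, because a lambda cannot assign a local.
    AtomicReference<String> holder = new AtomicReference<>();
    Runnable wrapped = () -> holder.set(client.apply(payload));
    wrapped.run();
    return holder.get();
  }

  /** The caller uses the return value; no second holder is needed. */
  static String submit(String payload, UnaryOperator<String> client) {
    return submitToHost(payload, client);
  }
}
```

Keeping the holder inside the helper means the caller has a single code path for the response and cannot accidentally read a stale value from an outer reference.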

}

@Test
public void testFollowerReadSkipsKnownLeader() throws Exception {

Contributor

This test can then be changed to verify that no failover is triggered when the initial OM is a follower.
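The behavior such a test would check might be sketched as follows, in plain Java rather than JUnit. `StickyReadProxy`, its `canServe` predicate, and the failover counter are hypothetical and are not taken from TestS3GrpcOmTransport; the sketch assumes the transport stays on its current OM for reads, leader or follower, and only moves on when that node cannot serve the request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of a read proxy that fails over only on failure. */
final class StickyReadProxy {
  private final List<String> nodeIds;
  private int current;
  private int failoverCount;

  StickyReadProxy(List<String> nodeIds, int initialIndex) {
    this.nodeIds = new ArrayList<>(nodeIds);
    this.current = initialIndex;
  }

  /** Returns the id of the node that served the read. */
  String read(Predicate<String> canServe) {
    for (int attempt = 0; attempt < nodeIds.size(); attempt++) {
      String nodeId = nodeIds.get(current);
      if (canServe.test(nodeId)) {
        return nodeId; // served without moving to another OM
      }
      // Only a failed attempt advances the proxy and counts as a failover.
      current = (current + 1) % nodeIds.size();
      failoverCount++;
    }
    throw new IllegalStateException("no OM could serve the read");
  }

  int failoverCount() {
    return failoverCount;
  }
}
```

A test in this shape would start the proxy on a follower, issue a read that the follower can serve, and assert that `failoverCount()` is still zero afterwards.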
