-
Notifications
You must be signed in to change notification settings - Fork 15.3k
KAFKA-20514: kraft observers should use previous fetch response to decide where to send the next fetch #22111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Changes from all commits
8d99a03
a4ee904
48bb14b
325e11b
9d0e220
0bce334
d20007b
c5319ef
2bc3f54
7ab818a
493e2e0
4c72c98
4e91f4a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -38,6 +38,7 @@ | |
|
|
||
| import java.nio.ByteBuffer; | ||
| import java.util.List; | ||
| import java.util.Map; | ||
| import java.util.Optional; | ||
| import java.util.OptionalInt; | ||
| import java.util.OptionalLong; | ||
|
|
@@ -765,4 +766,122 @@ void testUpdatedHighWatermarkCompleted() throws Exception { | |
| assertEquals(localLogEndOffset, partitionResponse.highWatermark()); | ||
| } | ||
| } | ||
|
|
||
| @Test | ||
| void testObserverFetchesBetweenLeaderAndBootstrapServers() throws Exception { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does this fail without your change? If so, can you tell me exactly what fails?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes, Looking at this again after a while, the for loop actually made this harder for me to read. I'm going to remove it.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I ran this test against this PR and I got this trace: Can we get a TRACE of the actual issue to make sure we are solving the correct problem? I am having a hard time understanding the actual problem so I am sure that this change solves that problem.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here is an updated trace from my most recent local changes: The scenario is: after the local node becomes |
||
| final var epoch = 2; | ||
| final var local = KafkaRaftClientTest.replicaKey( | ||
| KafkaRaftClientTest.randomReplicaId(), | ||
| true | ||
| ); | ||
| final var leader = KafkaRaftClientTest.replicaKey(local.id() + 1, true); | ||
| final var bootstrapVoter = KafkaRaftClientTest.replicaKey(local.id() + 2, true); | ||
| final var voters = VoterSet.fromMap( | ||
| Map.of( | ||
| leader.id(), VoterSetTest.voterNode(leader), | ||
| bootstrapVoter.id(), VoterSetTest.voterNode(bootstrapVoter) | ||
| ) | ||
| ); | ||
|
|
||
| final var context = new RaftClientTestContext.Builder( | ||
| local.id(), | ||
| local.directoryId().get() | ||
| ) | ||
| .withStartingVoters(voters, KRaftVersion.KRAFT_VERSION_1) | ||
| // configure the bootstrap servers to only include the bootstrap voter | ||
| // to reliably check the destination of the observer's fetch requests | ||
| // alternates between the leader and the bootstrap voter | ||
| .withBootstrapServers( | ||
| Optional.of(List.of(RaftClientTestContext.mockAddress(bootstrapVoter.id()))) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's document that you are doing this to reliably check fetches to the leader (known node) vs fetches to the bootstrap server (unknown nodes). Another way to check this is that "bootstrap nodes" have an unknown id. We represent this in the network client by giving those nodes an id less than -1. RPCs to known kafka nodes have an id greater than or equal to 0. |
||
| ) | ||
| .withRaftProtocol(RaftClientTestContext.RaftProtocol.KIP_1186_PROTOCOL) | ||
| .build(); | ||
|
|
||
| // The observer initially fetches from the bootstrap servers, | ||
| // where it will discover the leader's endpoints. | ||
| final var bootstrapFetch = pollAndCheckObserverFetchRequest( | ||
| context, | ||
| true, | ||
| bootstrapVoter.id() | ||
| ); | ||
| context.deliverResponse( | ||
| bootstrapFetch.correlationId(), | ||
| bootstrapFetch.destination(), | ||
| context.fetchResponse( | ||
| epoch, | ||
| leader.id(), | ||
| MemoryRecords.EMPTY, | ||
| 0L, | ||
| Errors.NOT_LEADER_OR_FOLLOWER | ||
| ) | ||
| ); | ||
|
|
||
| // Subsequent fetch from the observer is sent to the leader | ||
| // Return a BROKER_NOT_AVAILABLE error, handle that response, and then | ||
| // advance time past the fetch timeout. | ||
| // This is to simulate the leader endpoints being unreachable, which will | ||
| // cause the observer to fetch from the bootstrap servers after the fetch timeout expires. | ||
| final var leaderFetch = pollAndCheckObserverFetchRequest( | ||
| context, | ||
| false, | ||
| leader.id() | ||
| ); | ||
| context.deliverResponse( | ||
| leaderFetch.correlationId(), | ||
| leaderFetch.destination(), | ||
| RaftUtil.errorResponse( | ||
| ApiKeys.FETCH, | ||
| Errors.BROKER_NOT_AVAILABLE | ||
| ) | ||
| ); | ||
|
Comment on lines
+829
to
+836
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. According to the trace I pasted, this response is not deliver before the fetch timeout. This is misleading when reading the test. |
||
| context.client.poll(); | ||
|
|
||
| // The fetch timeout is much greater than the request manager's configured backoff, so the | ||
| // current unreachable connection will no longer be backing off when the next fetch is sent. | ||
| // Expire the fetch timeout and check that the next fetch is sent to the bootstrap server again. | ||
| context.time.sleep(context.fetchTimeoutMs + 1); | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I made this changes and the test pass. It looks like the issue is that the leader is in the backoff state because kraft got an error from the leader: diff --git a/raft/src/test/java/org/apache/kafka/raft/KafkaRaftClientFetchTest.java b/raft/src/test/java/org/apache/kafka/raft/KafkaRaftClientFetchTest.java
index 8d762d6c96..3d873add30 100644
--- a/raft/src/test/java/org/apache/kafka/raft/KafkaRaftClientFetchTest.java
+++ b/raft/src/test/java/org/apache/kafka/raft/KafkaRaftClientFetchTest.java
@@ -836,7 +836,6 @@ public final class KafkaRaftClientFetchTest {
// The fetch timeout is much greater than the request manager's configured backoff, so the
// current unreachable connection will no longer be backing off when the next fetch is sent.
// Expire the fetch timeout and check that the next fetch is sent to the bootstrap server again.
- context.time.sleep(context.fetchTimeoutMs + 1);
final var nextBootstrapFetch = pollAndCheckObserverFetchRequest(
context,
true,
@@ -854,6 +853,8 @@ public final class KafkaRaftClientFetchTest {
)
);
+ context.time.sleep(context.retryBackoffMs);
+
// Discovering the leader from a bootstrap fetch means the observer resumes fetching from the leader
pollAndCheckObserverFetchRequest(
context,
@@ -871,10 +872,8 @@ public final class KafkaRaftClientFetchTest {
context.pollUntilRequest();
RaftRequest.Outbound fetchRequest = context.assertSentFetchRequest();
if (isBootstrapFetch) {
- assertTrue(context.client.quorum().isUnattached());
assertTrue(fetchRequest.destination().id() < -1);
} else {
- assertTrue(context.client.quorum().isFollower());
assertEquals(expectedDestinationId, fetchRequest.destination().id());
}
// only need to check port since the host is always "localhost" for the mock addresses
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My test is written incorrectly. I agree with your above comments that the response from the leader fetch is not delivered until after the fetch timeout expires. I need to handle the |
||
| final var nextBootstrapFetch = pollAndCheckObserverFetchRequest( | ||
| context, | ||
| true, | ||
| bootstrapVoter.id() | ||
| ); | ||
| context.deliverResponse( | ||
| nextBootstrapFetch.correlationId(), | ||
| nextBootstrapFetch.destination(), | ||
| context.fetchResponse( | ||
| epoch, | ||
| leader.id(), | ||
| MemoryRecords.EMPTY, | ||
| 0L, | ||
| Errors.NOT_LEADER_OR_FOLLOWER | ||
| ) | ||
| ); | ||
|
|
||
| // Discovering the leader from a bootstrap fetch means the observer resumes fetching from the leader | ||
| pollAndCheckObserverFetchRequest( | ||
| context, | ||
| false, | ||
| leader.id() | ||
| ); | ||
| } | ||
|
|
||
| private RaftRequest.Outbound pollAndCheckObserverFetchRequest( | ||
| RaftClientTestContext context, | ||
| boolean isBootstrapFetch, | ||
| int expectedDestinationId | ||
| ) throws Exception { | ||
| context.pollUntilRequest(); | ||
| RaftRequest.Outbound fetchRequest = context.assertSentFetchRequest(); | ||
| if (isBootstrapFetch) { | ||
| assertTrue(fetchRequest.destination().id() < -1); | ||
| } else { | ||
| assertEquals(expectedDestinationId, fetchRequest.destination().id()); | ||
| } | ||
| // only need to check port since the host is always "localhost" for the mock addresses | ||
| assertEquals( | ||
| RaftClientTestContext.mockAddress(expectedDestinationId).getPort(), | ||
| fetchRequest.destination().port() | ||
| ); | ||
| return fetchRequest; | ||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2010,7 +2010,8 @@ public void testFollowerAsObserverDoesNotBecomeProspectiveAfterFetchTimeout(bool | |
|
|
||
| context.time.sleep(context.fetchTimeoutMs); | ||
| context.pollUntilRequest(); | ||
| assertTrue(context.client.quorum().isFollower()); | ||
| assertFalse(context.client.quorum().isProspective()); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: do we with to add an assert for whether it becomes
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added. |
||
| assertTrue(context.client.quorum().isUnattached()); | ||
|
|
||
| // transitions to unattached | ||
| context.deliverRequest(context.voteRequest(epoch + 1, replicaKey(otherNodeId, withKip853Rpc), epoch, 1)); | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add a test which explicitly checks that the observer can transition to
unattachedif there is a timeout.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this to the existing unit test.