Commit 90a4c93
MINOR: Fix testRackAwareAssignment flake (#22154)
The last part of testRackAwareAssignment was found to be flaky. This
part moves all topic partitions to different racks and waits for
consumer assignments to settle. Each of the three consumers is expected
to revoke all its partitions and be assigned partitions previously held
by another within a 15 second timeout.
This timeout is not always sufficient. The consumer heartbeat interval
is left at the default of 5,000 ms and each consumer polls every
3,000 ms. In the worst case, it takes a consumer around 7,000 ms to
reconcile an assignment change. An additional 3,000 ms round of polling
may be required when a consumer needs to auto-commit offsets. Two rounds
of reconciliation must happen within 15,000 ms.
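The budget described above can be checked with a little arithmetic. The sketch below uses only the figures quoted in the message (7,000 ms per reconciliation, 3,000 ms for an extra poll, a 15,000 ms timeout). The class and method names are hypothetical, and it assumes that only one of the two rounds (the one that revokes partitions) needs the extra auto-commit poll.

```java
public class ReconciliationBudget {
    // Figures quoted in the commit message, in milliseconds.
    static final long RECONCILE_WORST_CASE_MS = 7_000;
    static final long AUTO_COMMIT_EXTRA_POLL_MS = 3_000;
    static final long TEST_TIMEOUT_MS = 15_000;

    /**
     * Worst-case time for `rounds` reconciliations, of which
     * `autoCommitRounds` need one extra round of polling.
     */
    static long worstCaseMs(int rounds, int autoCommitRounds) {
        return rounds * RECONCILE_WORST_CASE_MS
                + autoCommitRounds * AUTO_COMMIT_EXTRA_POLL_MS;
    }

    public static void main(String[] args) {
        long plain = worstCaseMs(2, 0);
        // Assumption: only the revoking round triggers an auto-commit.
        long withAutoCommit = worstCaseMs(2, 1);
        System.out.printf("two rounds, no auto-commit: %d ms (slack %d ms)%n",
                plain, TEST_TIMEOUT_MS - plain);
        System.out.printf("two rounds, one auto-commit: %d ms (over by %d ms)%n",
                withAutoCommit, withAutoCommit - TEST_TIMEOUT_MS);
    }
}
```

Under these assumptions, two plain rounds leave only about one second of slack, and a single auto-commit round is enough to exceed the timeout.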
The timeline of an example failing run looks like:
-02.956 Group coordinator computes target assignment at epoch=6
          consumer0=[0]
          consumer1=[1, 2]
          consumer2=[3, 4, 5]
+00.000 15 second timeout starts
+03.179 consumer0 heartbeats
          This is the first heartbeat since the rack reassignments.
+03.179 Group coordinator computes target assignment at epoch=7
          consumer0=[5]
          consumer1=[3, 4]
          consumer2=[0, 1, 2]
+03.186 consumer0 heartbeat receives assignment []
+04.151 consumer1 starts poll()
+04.877 consumer1 heartbeats
+04.878 consumer1 heartbeat receives assignment []
+05.155 consumer1 ends poll()
+07.259 consumer1 starts poll()
+07.259 consumer1 sends auto-commit with offsets for [1, 2]
+07.288 consumer1 receives auto-commit response
+08.263 consumer1 ends poll()
+10.379 consumer1 starts poll()
+10.379 consumer1 calls onPartitionsRevoked with [1, 2]
+10.379 consumer1 calls onPartitionsAssigned with []
+10.382 consumer1 heartbeats with owned partitions []
+10.387 consumer1 heartbeat receives assignment [3, 4]
+10.483 consumer1 calls onPartitionsAssigned [3, 4]
+10.483 consumer1 heartbeats with owned partitions [3, 4]
+11.384 consumer1 ends poll()
+15.000 15 second timeout elapses and the test fails
+15.300 consumer2 heartbeat receives assignment [0, 1, 2]
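To see where the time goes in this run, the gaps between a few of the timestamps above can be computed directly. This is a small illustrative sketch: the class and the `millis` helper are hypothetical, and the parser assumes the `±SS.mmm` stamp format used in the timeline.

```java
public class TimelineGaps {
    /** Converts a stamp such as "+10.379" or "-02.956" to signed milliseconds. */
    static long millis(String stamp) {
        int sign = stamp.startsWith("-") ? -1 : 1;
        String[] parts = stamp.substring(1).split("\\.");
        // Assumes exactly three fractional digits, as in the timeline.
        return sign * (Long.parseLong(parts[0]) * 1000 + Long.parseLong(parts[1]));
    }

    /** Elapsed milliseconds between two stamps. */
    static long gapMs(String from, String to) {
        return millis(to) - millis(from);
    }

    public static void main(String[] args) {
        // consumer1 learns of the empty assignment, then revokes [1, 2].
        System.out.println("empty assignment -> revocation: "
                + gapMs("+04.878", "+10.379") + " ms");
        // consumer2 receives its new partitions just after the deadline.
        System.out.println("deadline -> consumer2 assignment: "
                + gapMs("+15.000", "+15.300") + " ms");
    }
}
```

In this run, consumer1 takes about 5.5 seconds to move from receiving the empty assignment to revoking its partitions, with one full poll spent on the auto-commit, and consumer2 misses the deadline by 300 ms.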
To fix the test we:
* Make config changes to reduce the reconciliation time. This also
  reduces the test duration from 60 seconds to 20 seconds.
  * Disable auto-commit, since the consumers do not consume any records.
  * Reduce the heartbeat interval to 1,000 ms.
  * Reduce the poll timeouts to 100 ms, so that polls happen every
    300 ms.
* Raise the final timeout to 30 seconds, since under heavy CI load, the
  reduced intervals above aren't effective.
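The changes listed above could look roughly like the sketch below. It is not the code from this commit: the class and method names are hypothetical. The only real setting used is the consumer property `enable.auto.commit`. The poll-cycle formula assumes the three consumers are polled one after another, which is an inference from "100 ms" timeouts producing a poll "every 300 ms". The heartbeat interval is left out because this page does not show where the test sets it.

```java
import java.util.Properties;

public class FastReconcileSettings {
    /** Consumer overrides; the consumers read no records, so nothing needs committing. */
    static Properties consumerOverrides() {
        Properties props = new Properties();
        props.put("enable.auto.commit", "false");
        return props;
    }

    /**
     * Time between successive polls of the same consumer, assuming the
     * consumers are polled sequentially and each poll blocks for its
     * full timeout.
     */
    static long pollCycleMs(int consumers, long pollTimeoutMs) {
        return consumers * pollTimeoutMs;
    }

    public static void main(String[] args) {
        System.out.println("overrides: " + consumerOverrides());
        // A 1,000 ms poll timeout before the fix is an inference from the
        // 3,000 ms cycle quoted in the commit message.
        System.out.println("before: " + pollCycleMs(3, 1_000) + " ms per cycle");
        System.out.println("after:  " + pollCycleMs(3, 100) + " ms per cycle");
    }
}
```

With a 300 ms cycle, a consumer notices a new assignment on its next poll almost immediately, so the heartbeat interval becomes the dominant delay in each reconciliation round.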
Reviewers: Lianet Magrans <lmagrans@confluent.io>, David Jacot
<djacot@confluent.io>
1 parent 44bafc6 commit 90a4c93
1 file changed
Lines changed: 20 additions & 13 deletions
File tree
- clients/clients-integration-tests/src/test/java/org/apache/kafka/clients/consumer