[FLINK-39962] [runtime] Fix flaky DeclarativeSlotPoolBridgeTest #28535

Open
qiuyanjun888 wants to merge 1 commit into apache:master from qiuyanjun888:fix/flink-39962-20260625-084608-a1
@qiuyanjun888

What is the purpose of the change

This pull request fixes the flaky DeclarativeSlotPoolBridgeTest.testAcceptingOfferedSlotsWithoutResourceManagerConnected reported in FLINK-39962.

The Jira ticket describes a test-only threading race: the test used the regular forMainThread() test executor, whose scheduled timeout callbacks can run concurrently with the test thread while close() copies pendingRequests, intermittently producing a NegativeArraySizeException.

Brief change log

  • Use a ManuallyTriggeredScheduledExecutorService for the affected test.
  • Manually complete the deferred slot-request declaration task when slotRequestMaxInterval is positive.
  • Keep production slot-pool code unchanged and avoid triggering unrelated request/idle/batch timeout tasks in this test.
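
The idea behind the change log above can be illustrated with a minimal, standalone sketch. It does not use Flink's ManuallyTriggeredScheduledExecutorService or the real slot-pool classes; ManualExecutor, PoolSketch, and all of their methods are hypothetical stand-ins invented for this illustration. The point it tries to show is that when scheduled tasks are only queued and the test decides which one runs, the deferred declaration can be completed on the test thread while the timeout task stays untriggered, so nothing can mutate pendingRequests while close() copies it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Standalone sketch; NOT Flink's ManuallyTriggeredScheduledExecutorService. */
public class ManualExecutorSketch {

    /** Hypothetical stand-in: queues tasks instead of running them on another thread. */
    static final class ManualExecutor {
        private final Queue<Runnable> scheduled = new ArrayDeque<>();

        /** Records the task; the delay is ignored because the test decides when it runs. */
        void schedule(Runnable task, long delayMillis) {
            scheduled.add(task);
        }

        /** Runs the oldest queued task on the caller's thread; false if none is queued. */
        boolean triggerNext() {
            Runnable task = scheduled.poll();
            if (task == null) {
                return false;
            }
            task.run();
            return true;
        }

        int pendingTasks() {
            return scheduled.size();
        }
    }

    /** Hypothetical pool: defers declaring a request and schedules a timeout for it. */
    static final class PoolSketch {
        private final ManualExecutor executor;
        private final List<String> pendingRequests = new ArrayList<>();
        private final List<String> declared = new ArrayList<>();

        PoolSketch(ManualExecutor executor) {
            this.executor = executor;
        }

        void request(String id) {
            pendingRequests.add(id);
            // Deferred declaration (assumed to be scheduled first).
            executor.schedule(() -> declared.add(id), 50L);
            // Request timeout; in a threaded executor this could race with close().
            executor.schedule(() -> pendingRequests.remove(id), 5_000L);
        }

        /** Copies and clears the pending requests, as a close() method might. */
        List<String> close() {
            List<String> copy = new ArrayList<>(pendingRequests);
            pendingRequests.clear();
            return copy;
        }

        List<String> declared() {
            return declared;
        }
    }

    public static void main(String[] args) {
        ManualExecutor executor = new ManualExecutor();
        PoolSketch pool = new PoolSketch(executor);

        pool.request("req-1");
        executor.triggerNext(); // run only the deferred declaration

        System.out.println(pool.declared());         // [req-1]
        System.out.println(executor.pendingTasks()); // 1 (timeout task never triggered)
        System.out.println(pool.close());            // [req-1]
    }
}
```

Because every task in this sketch runs on the calling thread, the copy in close() cannot interleave with a timeout callback. Whether the real test triggers tasks in this exact order is an assumption of the sketch, not something the PR description states.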

Verifying this change

This change is covered by the existing affected test and the containing test class:

  • ./mvnw -pl flink-runtime -Dtest=DeclarativeSlotPoolBridgeTest#testAcceptingOfferedSlotsWithoutResourceManagerConnected -DfailIfNoTests=false -DskipITs -Dfast -Drat.skip=true -Dcheckstyle.skip=true -Dspotless.check.skip=true test
  • ./mvnw -pl flink-runtime -Dtest=DeclarativeSlotPoolBridgeTest -DfailIfNoTests=false -DskipITs -Dfast -Drat.skip=true -Dcheckstyle.skip=true -Dspotless.check.skip=true test
  • ./mvnw -pl flink-runtime -DskipTests -DskipITs -Drat.skip=true spotless:check
  • ./mvnw -pl flink-runtime -DskipTests -DskipITs -Drat.skip=true checkstyle:check

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

Generated-by: Hermes Agent (OpenAI GPT-5.5)

@flinkbot

flinkbot commented Jun 25, 2026

Collaborator

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

@spuru9 spuru9 left a comment

Contributor

LGTM.
Can you work on making the build green?

github-actions bot added the community-reviewed label (PR has been reviewed by the community.) on Jun 26, 2026
@qiuyanjun888

qiuyanjun888 commented Jun 26, 2026

Author

@flinkbot run azure

@qiuyanjun888

Author

This PR focuses on a flaky test in the runtime slot pool.
It has already received community review and comments, but it still needs a maintainer review.

@RocMarshal @1996fanrui could you please take a look when you have time and advise whether this approach is acceptable for this area and can move forward?


@TestTemplate
void testAcceptingOfferedSlotsWithoutResourceManagerConnected() throws Exception {
final ManuallyTriggeredScheduledExecutorService scheduledExecutor =

Contributor

I am curious: can this race occur in the main code?

Labels

community-reviewed PR has been reviewed by the community.


4 participants