Add JobStuckHandler callback #1291

Merged
brandur merged 1 commit into master from brandur-stuck-job-handler on Jun 29, 2026

@brandur brandur commented Jun 20, 2026

Here, add a JobStuckHandler callback that's invoked when a producer
considers a job to be "stuck", i.e. it has passed its timeout and
cancellation was attempted, but the job didn't respond to cancellation.

The callback includes some basic information about the job that became
stuck, along with the total number of stuck jobs. Its result can
optionally open a new executor slot to replace the one taken up by the
stuck job so that a producer that continues to run doesn't get
completely starved by stuck jobs. The idea here is that clients can
configure themselves to open new slots up to a certain point, but then
may want to restart themselves if there are enough stuck jobs that they
could become a memory liability.
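
The policy described above (add replacement slots up to a limit, then restart) might look roughly like the sketch below. It is self-contained and does not use River's real types: `StuckJobInfo`, `StuckJobResult`, `NumStuckJobs`, and `newStuckJobPolicy` are hypothetical names made up for illustration, and the callback signature is an assumption. Only the `AddWorkerSlot` field name comes from this PR's review thread.

```go
package main

import "fmt"

// StuckJobInfo is a hypothetical stand-in for what the callback receives:
// basic information about the stuck job plus the total number of stuck jobs.
type StuckJobInfo struct {
	JobID        int64
	Kind         string
	NumStuckJobs int
}

// StuckJobResult is a hypothetical stand-in for the callback's result.
// AddWorkerSlot is the field name settled on in review; the struct itself
// is assumed.
type StuckJobResult struct {
	AddWorkerSlot bool
}

// newStuckJobPolicy returns a handler that asks for a replacement slot while
// the number of stuck jobs is at most maxExtraSlots, and otherwise calls
// restart without asking for another slot.
func newStuckJobPolicy(maxExtraSlots int, restart func()) func(StuckJobInfo) StuckJobResult {
	return func(info StuckJobInfo) StuckJobResult {
		if info.NumStuckJobs <= maxExtraSlots {
			return StuckJobResult{AddWorkerSlot: true}
		}
		// Too many leaked goroutines: hand off to the application's own
		// restart logic (for example, a graceful shutdown).
		restart()
		return StuckJobResult{}
	}
}

func main() {
	restarts := 0
	handle := newStuckJobPolicy(2, func() { restarts++ })

	// Simulate three jobs becoming stuck one after another.
	for n := 1; n <= 3; n++ {
		res := handle(StuckJobInfo{JobID: int64(n), Kind: "example", NumStuckJobs: n})
		fmt.Printf("stuck=%d addWorkerSlot=%t restarts=%d\n", n, res.AddWorkerSlot, restarts)
	}
}
```

The handler here is called sequentially; if the real callback can be invoked from multiple goroutines, the restart hook would need its own synchronization.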

brandur commented Jun 20, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Breezy!

Reviewed commit: 098d829c68

@brandur brandur force-pushed the brandur-stuck-job-handler branch from 098d829 to 93b3691 Compare June 20, 2026 03:41
@brandur brandur requested a review from bgentry June 20, 2026 03:46

@bgentry bgentry left a comment

Needs a changelog entry and one suggestion on naming, otherwise LGTM!

Comment thread stuck_job.go Outdated
Comment on lines +27 to +31
// OpenWorkerSlot instructs River to treat the stuck job as no longer
// occupying a worker slot so another job can begin executing. This can be
// dangerous because the stuck job's goroutine is still running, so the queue
// may temporarily have more active job goroutines than MaxWorkers.
OpenWorkerSlot bool

I wonder if ReleaseWorkerSlot might be slightly clearer?

brandur (author) replied:

Hmm, I think "open" is a little more correct here — "release" seems to imply that an existing slot is being released, which is not what's happening (unfortunately, we can't release the existing slot). "Open" is better IMO because it says you're "opening a new slot". That said, maybe AddWorkerSlot is better than either as it's even clearer. Changed to that.
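
The distinction drawn in this thread (the stuck job's slot cannot be released, so capacity is added instead) can be modeled with a small counter. This is only a toy model, not River's implementation: `slotPool` and its fields and methods are hypothetical names, and what happens when a stuck goroutine eventually exits is deliberately left out because the thread does not say.

```go
package main

import "fmt"

// slotPool is a hypothetical model of a producer's worker-slot accounting.
type slotPool struct {
	maxWorkers int // configured limit on concurrently executing jobs
	extraSlots int // slots added to compensate for stuck jobs
	active     int // job goroutines still running, including stuck ones
}

// start claims a slot for a new job if one is available.
func (p *slotPool) start() bool {
	if p.active >= p.maxWorkers+p.extraSlots {
		return false
	}
	p.active++
	return true
}

// addSlot models the AddWorkerSlot result: the stuck goroutine keeps
// occupying its slot (it is still running), and capacity grows by one so
// another job can begin executing.
func (p *slotPool) addSlot() {
	p.extraSlots++
}

func main() {
	p := &slotPool{maxWorkers: 2}

	p.start() // job 1
	p.start() // job 2, which we pretend becomes stuck
	fmt.Println("can start third job before adding a slot:", p.start())

	p.addSlot()
	fmt.Println("can start third job after adding a slot:", p.start())

	// The active goroutine count now exceeds maxWorkers, which is the
	// risk called out in the field's doc comment.
	fmt.Println("active goroutines:", p.active, "maxWorkers:", p.maxWorkers)
}
```

In this model nothing is ever "released" for the stuck job, which matches the reasoning given for preferring "add" over "release" in the field name.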

@brandur brandur force-pushed the brandur-stuck-job-handler branch 2 times, most recently from b5106b0 to e767dfe Compare June 29, 2026 23:14
@brandur brandur force-pushed the brandur-stuck-job-handler branch from e767dfe to 42e5151 Compare June 29, 2026 23:34
@brandur brandur merged commit 78ae535 into master Jun 29, 2026
15 checks passed
@brandur brandur deleted the brandur-stuck-job-handler branch June 29, 2026 23:39