Fix cron jobs stuck in "expired" when a run is interrupted mid-flight#48502
Fix cron jobs stuck in "expired" when a run is interrupted mid-flight#48502juan-fdz-hawa wants to merge 1 commit into
Conversation
|
@coderabbitai full review |
✅ Action performedFull review finished. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (2)
WalkthroughA new 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
7f81a30 to
a761297
Compare
There was a problem hiding this comment.
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.
Tip: disable this comment in your organization's Code Review settings.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@server/service/schedule/schedule_test.go`:
- Line 1062: The current test assertion around runWithStats only verifies that
expected persisted error keys are present, but it does not fail when extra
errors are stored. Tighten the contract in the schedule tests by asserting an
exact match for the persisted error key set instead of using subset/presence
checks, so the interrupted-run case catches unintended entries like additional
errors from runAllJobs. Update the assertions in the affected schedule test
cases that use wantErrorKeys to compare the full persisted error collection
exactly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 2102eaa6-8afd-425a-93db-55ac53c49dae
📒 Files selected for processing (3)
changes/48497-cron-interrupted-run-statusserver/service/schedule/schedule.goserver/service/schedule/schedule_test.go
✅ Files skipped from review due to trivial changes (1)
- changes/48497-cron-interrupted-run-status
🚧 Files skipped from review as they are similar to previous changes (1)
- server/service/schedule/schedule.go
Fixes #48497 When a cron run's context was cancelled mid-flight (e.g. the instance received SIGTERM during graceful shutdown), the stats row was left "pending" because the terminal-status write failed on the cancelled context. CleanupCronStats would later reap it to "expired", hiding the fact that the run was interrupted and discarding the captured job errors. Record the terminal status on a context detached from cancellation (context.WithoutCancel with a bounded timeout) so an interrupted run persists its outcome. The run is marked "canceled" only when the context was cancelled AND a job actually reported an error, so a run whose jobs all finished cleanly is still "completed" even if cancellation merely raced the end of the run.
a761297 to
8cfd1f0
Compare
CI Feedback 🧐A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #48502 +/- ##
==========================================
+ Coverage 67.40% 67.90% +0.50%
==========================================
Files 3677 3677
Lines 233664 233670 +6
Branches 12447 12447
==========================================
+ Hits 157492 158678 +1186
+ Misses 62057 60724 -1333
- Partials 14115 14268 +153
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Related issue: Fixes #48497
When a cron run's context was cancelled mid-flight (e.g. the instance received SIGTERM during graceful shutdown), the stats row was left "pending" because the terminal-status write failed on the cancelled context. CleanupCronStats would later reap it to "expired", hiding the fact that the run was interrupted and discarding the captured job errors.
Record the terminal status on a context detached from cancellation (context.WithoutCancel with a bounded timeout) so an interrupted run persists its outcome. The run is marked "canceled" only when the context was cancelled AND a job actually reported an error, so a run whose jobs all finished cleanly is still "completed" even if cancellation merely raced the end of the run.
Checklist for submitter
If some of the following don't apply, delete the relevant line.
changes/,orbit/changes/oree/fleetd-chrome/changes.See Changes files for more information.
Testing
Summary by CodeRabbit