fix(benchmark): raise RLIMIT_NOFILE in benchmark_serving for high concurrency #1394

Merged: valarLip merged 1 commit into main from zlr/benchmark-set-ulimit, Jun 29, 2026
@ZhangLirong-amd ZhangLirong-amd commented Jun 29, 2026

Summary

At high --max-concurrency, each in-flight request holds a socket fd. The default soft RLIMIT_NOFILE (~1024) is exhausted client-side (EMFILE on socket()), so most requests fail before ever reaching the server. The benchmark then reports only ~one concurrency-wave of successes (e.g. ~919/10240 at --max-concurrency=1024) while the server logs 200 OK for every request it actually receives.
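The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not code from this PR: it temporarily lowers the soft limit (the value 64 and the helper name `sockets_until_emfile` are illustrative assumptions), opens sockets until `socket()` fails with EMFILE, then restores the original limit. The count comes out below the limit because the process already holds some descriptors, which is consistent with seeing fewer successes than the nominal limit.

```python
import errno
import resource
import socket


def sockets_until_emfile(soft_limit: int = 64) -> int:
    """Hypothetical helper: count sockets opened before EMFILE is raised."""
    orig_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Never ask for more than the hard limit (RLIM_INFINITY means "no cap").
    if hard != resource.RLIM_INFINITY:
        soft_limit = min(soft_limit, hard)
    socks = []
    # Lowering the soft limit is always permitted for an unprivileged process.
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))
    try:
        while True:
            try:
                socks.append(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
            except OSError as exc:
                if exc.errno != errno.EMFILE:
                    raise
                # Descriptor table is full: this is the client-side failure.
                return len(socks)
    finally:
        for s in socks:
            s.close()
        # Restore the limit the process started with.
        resource.setrlimit(resource.RLIMIT_NOFILE, (orig_soft, hard))


if __name__ == "__main__":
    print("sockets opened before EMFILE:", sockets_until_emfile())
```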

The server already calls set_ulimit() at startup; this makes the benchmark client do the same (raise soft toward 65535, capped at the hard limit) before opening any connections.

This is purely a client-side fd-limit fix — no change to request logic.
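For readers unfamiliar with the mechanism, here is a rough sketch of what "raise soft toward 65535, capped at the hard limit" could look like using Python's standard `resource` module. The actual body of `atom.utils.set_ulimit` is not shown in this PR, so the function name `raise_nofile_soft_limit` and its error handling are assumptions made for illustration only.

```python
import resource


def raise_nofile_soft_limit(target: int = 65535) -> tuple:
    """Hypothetical stand-in for set_ulimit(): returns the (soft, hard) pair in effect."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Cap the request at the hard limit; RLIM_INFINITY means there is no cap.
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    # Only ever raise the soft limit, never lower an already generous one.
    if soft != resource.RLIM_INFINITY and soft < new_soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
        except (ValueError, OSError):
            # Some platforms reject the request; keep the existing limit.
            return soft, hard
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("RLIMIT_NOFILE (soft, hard):", raise_nofile_soft_limit())
```

An unprivileged process can raise its soft limit only up to the hard limit, which is why the cap matters: a container started with a low hard limit would still need `--ulimit nofile` at launch.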

Root cause (for context)

Containers launched without --ulimit nofile inherit containerd's soft limit (1024 on hosts where systemd DefaultLimitNOFILESoft=1024). After a containerd 1.7→2.x upgrade, plainly-launched containers stopped getting a high default, surfacing this on the benchmark client.

Test plan

  • python -m atom.benchmarks.benchmark_serving --dataset-name random --random-input-len 1024 --random-output-len 1024 --num-prompts 10240 --max-concurrency 1024 --request-rate inf --ignore-eos -> all 10240 succeed (previously ~919).
  • Confirm set_ulimit() runs before the first request.
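One way to support the second check is to compare the soft limit against the requested concurrency before sending anything. This is a sketch under stated assumptions, not part of the PR: `fd_budget_ok` is a hypothetical helper, the headroom of 128 descriptors for stdio, logs, and libraries is an arbitrary illustrative value, and it assumes one socket per in-flight request as described in the summary.

```python
import resource
from typing import Optional


def fd_budget_ok(
    max_concurrency: int,
    headroom: int = 128,
    soft_limit: Optional[int] = None,
) -> bool:
    """Hypothetical check: is the soft RLIMIT_NOFILE large enough for this run?"""
    if soft_limit is None:
        # Read the limit currently in effect for this process.
        soft_limit, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        return True
    # Assumes one fd per in-flight request plus some spare descriptors.
    return soft_limit >= max_concurrency + headroom


if __name__ == "__main__":
    print("enough descriptors for 1024 concurrent requests:", fd_budget_ok(1024))
```

With the default soft limit of 1024 this returns False for `--max-concurrency 1024`, and True once the limit has been raised to 65535.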

Copilot AI review requested due to automatic review settings June 29, 2026 06:00

Copilot AI left a comment

Pull request overview

This PR aims to make the benchmark client resilient at high --max-concurrency by raising the process RLIMIT_NOFILE soft limit before opening many simultaneous sockets, preventing client-side EMFILE failures that drop requests.

Changes:

  • Raise RLIMIT_NOFILE at the start of benchmark_serving.py:main() via set_ulimit().
  • Add explanatory comments describing why the benchmark client needs the ulimit bump (mirroring server behavior).

Comment on lines +689 to +696
# Raise the open-file soft limit before opening any connections. At high
# --max-concurrency each in-flight request is a socket (fd); the default
# RLIMIT_NOFILE soft (~1024) is exhausted client-side (EMFILE on socket()),
# silently dropping requests so most never reach the server. The server
# already calls set_ulimit() at startup; the client must too.
from atom.utils import set_ulimit

set_ulimit()
…currency

At high --max-concurrency each in-flight request holds a socket fd. The
default soft RLIMIT_NOFILE (~1024) is exhausted client-side (EMFILE on
socket()), so most requests fail before reaching the server and the run
reports only ~one concurrency-wave of successes (e.g. ~919/10240 at
conc=1024) while the server logs 200 OK for every request it actually
receives. The server already calls set_ulimit() at startup; call it in the
benchmark client too (soft is raised toward 65535, capped at the hard limit).
@ZhangLirong-amd ZhangLirong-amd force-pushed the zlr/benchmark-set-ulimit branch from ae94fc8 to 8ec57c2 Compare June 29, 2026 06:02
@valarLip valarLip merged commit f797dd5 into main Jun 29, 2026
27 of 35 checks passed
@valarLip valarLip deleted the zlr/benchmark-set-ulimit branch June 29, 2026 10:22

3 participants