-
Notifications
You must be signed in to change notification settings - Fork 340
feat(load/peak_ewma): expose round-trip time estimate #871
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
b2c3330
fa0b122
3015e8c
708248a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -78,8 +78,8 @@ pub struct Handle { | |
| } | ||
|
|
||
| /// Holds the current RTT estimate and the last time this value was updated. | ||
| #[derive(Debug)] | ||
| struct RttEstimate { | ||
| #[derive(Clone, Debug)] | ||
| pub struct RttEstimate { | ||
| update_at: Instant, | ||
| rtt_ns: f64, | ||
| } | ||
|
|
@@ -107,6 +107,18 @@ impl<S, C> PeakEwma<S, C> { | |
| rtt_estimate: self.rtt_estimate.clone(), | ||
| } | ||
| } | ||
|
|
||
| /// Returns the current [`RttEstimate`] of the service. | ||
| /// | ||
| /// # Panics | ||
| /// | ||
| /// This value is stored in a mutex. If the mutex has become poisoned, this will panic. | ||
| pub fn rtt_estimate(&self) -> RttEstimate { | ||
| self.rtt_estimate | ||
| .lock() | ||
| .expect("mutex should not be poisoned") | ||
| .clone() | ||
| } | ||
| } | ||
|
|
||
| impl<S, C, Request> Service<Request> for PeakEwma<S, C> | ||
|
|
@@ -216,6 +228,16 @@ where | |
| // ===== impl RttEstimate ===== | ||
|
|
||
| impl RttEstimate { | ||
| /// Returns the [`Instant`] that this estimate was last updated. | ||
| pub fn updated_at(&self) -> Instant { | ||
| self.update_at | ||
| } | ||
|
|
||
| /// Returns the round-trip time estimate, in nanoseconds. | ||
| pub fn rtt_ns(&self) -> f64 { | ||
| self.rtt_ns | ||
| } | ||
|
Comment on lines
+231
to
+239
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. another way to do this would be making the inner i went this route to provide a stable public interface.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hm, I don't even remember, why is nanoseconds stored in f64 instead of a u64 or u128? Is it to make use of the exponent to represent even bigger numbers in 64 bits? Would it be worth exposing the value publicly as an unsigned int?
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Or, a different crazy idea:
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
in our case, the number does matter. you're right that we'd like to be able to take this RTT estimate, while also taking into account other properties of the traffic like response status codes.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
my understanding is that this is stored as a floating-point number because of the math in the // When an RTT is observed that is less than the estimated RTT, we decay the
// prior estimate according to how much time has elapsed since the last
// update. The inverse of the decay is used to scale the estimate towards the
// observed RTT value.
let elapsed = nanos(now.saturating_duration_since(self.update_at));
let decay = (-elapsed / decay_ns).exp();
let recency = 1.0 - decay;
let next_estimate = (self.rtt_ns * decay) + (rtt * recency);
trace!(
"update rtt={:03.0}ms decay={:06.0}ns; next={:03.0}ms",
rtt / NANOS_PER_MILLI,
self.rtt_ns - next_estimate,
next_estimate / NANOS_PER_MILLI,
);because of this, when we observe requests taking less time than the original peak, we decay the estimate exponentially, which can lead to fractional estimates. rounding to the nearest nanosecond, i.e. writing an inverse of the what do you think, @seanmonstar?
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, gotcha. So, we don't have to do RTT math with floating points. The Linux kernel does TCP RTT/EWMA with integers. 🤷 But, it also could be just whatever, I guess if we change the types internally, the numbers will never be so large that a cast will lose anything.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. that's a great point. i can refactor this middleware to switch away from floating point numbers to integers, which would make exposing the current estimate more natural. i'll do that as a separate pull request, to keep each change easily reviewable.
Comment on lines
+236
to
+239
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i toyed briefly with finding a way to convert this into a |
||
|
|
||
| fn new(rtt_ns: f64) -> Self { | ||
| debug_assert!(0.0 < rtt_ns, "rtt must be positive"); | ||
| Self { | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.