OCPP: per-connection BootNotification handshake on reconnect#31192
OCPP: per-connection BootNotification handshake on reconnect#31192premultiply wants to merge 5 commits into
Conversation
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The use of the *time.Timer pointer as a connection identifier in onTransportConnect/onBootTimeout is subtle; consider introducing an explicit connection generation token (e.g. an incrementing int) to make the staleness check easier to reason about and less tightly coupled to the timer implementation.
- Access to bootTriggered is now spread across multiple call sites with manual lock handling; it might be clearer and less error-prone to encapsulate this flag behind small helper methods that consistently enforce locking and lifecycle (set on trigger, clear on consumption/timeout).
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The use of the *time.Timer pointer as a connection identifier in onTransportConnect/onBootTimeout is subtle; consider introducing an explicit connection generation token (e.g. an incrementing int) to make the staleness check easier to reason about and less tightly coupled to the timer implementation.
- Access to bootTriggered is now spread across multiple call sites with manual lock handling; it might be clearer and less error-prone to encapsulate this flag behind small helper methods that consistently enforce locking and lifecycle (set on trigger, clear on consumption/timeout).
## Individual Comments
### Comment 1
<location path="charger/ocpp/cp_core_test.go" line_range="250-256" />
<code_context>
"setup should be called exactly once")
}
+
+func TestTriggeredBootNotificationDoesNotNotifyRebootMonitor(t *testing.T) {
+ log := util.NewLogger("test")
+ cp := NewChargePoint(log, "test-cp")
+ // metadata from a previous connection plus a solicited (triggered) BootNotification
+ cp.BootNotificationResult = &core.BootNotificationRequest{ChargePointModel: "Cached"}
+ cp.bootTriggered = true
+
+ _, err := cp.OnBootNotification(&core.BootNotificationRequest{
+ ChargePointModel: "Triggered",
+ ChargePointVendor: "TestVendor",
+ })
+ require.NoError(t, err)
+ assert.True(t, cp.Connected())
+ assert.False(t, cp.bootTriggered, "flag should be consumed")
+
+ select {
+ case req := <-cp.bootNotificationRequestC:
+ t.Fatalf("triggered BootNotification should not notify reboot monitor, got %s", req.ChargePointModel)
</code_context>
<issue_to_address>
**suggestion (testing):** Add an assertion that BootNotificationResult is updated by the triggered BootNotification
This test verifies that triggered BootNotifications don’t notify the reboot monitor and that `bootTriggered` is cleared, but it never checks that `BootNotificationResult` is refreshed with the new request. Please add an assertion like `assert.Equal(t, "Triggered", cp.BootNotificationResult.ChargePointModel)` so the test also validates the cached metadata is updated for solicited BootNotifications.
```suggestion
_, err := cp.OnBootNotification(&core.BootNotificationRequest{
ChargePointModel: "Triggered",
ChargePointVendor: "TestVendor",
})
require.NoError(t, err)
assert.True(t, cp.Connected())
assert.False(t, cp.bootTriggered, "flag should be consumed")
assert.NotNil(t, cp.BootNotificationResult, "BootNotificationResult should be set")
assert.Equal(t, "Triggered", cp.BootNotificationResult.ChargePointModel, "triggered BootNotification should refresh cached metadata")
```
</issue_to_address>
### Comment 2
<location path="charger/ocpp/cp_core_test.go" line_range="265-261" />
<code_context>
+ }
+}
+
+func TestSpontaneousBootNotificationNotifiesRebootMonitor(t *testing.T) {
+ log := util.NewLogger("test")
+ cp := NewChargePoint(log, "test-cp")
+
+ _, err := cp.OnBootNotification(&core.BootNotificationRequest{
+ ChargePointModel: "Rebooted",
+ ChargePointVendor: "TestVendor",
+ })
+ require.NoError(t, err)
+ assert.True(t, cp.Connected())
+
+ select {
+ case req := <-cp.bootNotificationRequestC:
+ assert.Equal(t, "Rebooted", req.ChargePointModel)
+ default:
+ t.Fatal("spontaneous BootNotification should notify reboot monitor")
+ }
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test for a solicited BootNotification on a previous connection followed by a spontaneous BootNotification on a new connection
Current tests cover (a) solicited BootNotification not notifying the reboot monitor and (b) purely spontaneous BootNotification doing so. Please also cover the case where a solicited BootNotification occurs on an earlier connection (so `bootTriggered` is set and consumed), then the charge point reconnects and sends a new spontaneous BootNotification that must notify the reboot monitor. This ensures `bootTriggered` does not leak across connections. You can simulate this by: first marking `bootTriggered` and calling `OnBootNotification` with no reboot notification, then simulating a new connection (resetting state as `onTransportConnect` would) and calling `OnBootNotification` again, asserting the reboot monitor is notified.
Suggested implementation:
```golang
func TestSpontaneousBootNotificationNotifiesRebootMonitor(t *testing.T) {
log := util.NewLogger("test")
cp := NewChargePoint(log, "test-cp")
_, err := cp.OnBootNotification(&core.BootNotificationRequest{
ChargePointModel: "Rebooted",
ChargePointVendor: "TestVendor",
})
require.NoError(t, err)
assert.True(t, cp.Connected())
select {
case req := <-cp.bootNotificationRequestC:
assert.Equal(t, "Rebooted", req.ChargePointModel)
default:
t.Fatal("spontaneous BootNotification should notify reboot monitor")
}
}
func TestSolicitedBootOnPreviousConnectionThenSpontaneousBootOnNewConnectionNotifiesRebootMonitor(t *testing.T) {
log := util.NewLogger("test")
cp := NewChargePoint(log, "test-cp")
// First connection: simulate a solicited BootNotification
cp.bootTriggered = true
_, err := cp.OnBootNotification(&core.BootNotificationRequest{
ChargePointModel: "InitialSolicited",
ChargePointVendor: "TestVendor",
})
require.NoError(t, err)
assert.True(t, cp.Connected())
assert.False(t, cp.bootTriggered, "bootTriggered flag should be consumed after solicited BootNotification")
select {
case req := <-cp.bootNotificationRequestC:
t.Fatalf("solicited BootNotification should not notify reboot monitor, got %s", req.ChargePointModel)
default:
}
// Simulate a new transport connection, which should reset connection-scoped state
cp.onTransportConnect()
// New connection: spontaneous BootNotification must notify reboot monitor
_, err = cp.OnBootNotification(&core.BootNotificationRequest{
ChargePointModel: "RebootedSpontaneous",
ChargePointVendor: "TestVendor",
})
```
```golang
require.NoError(t, err)
assert.True(t, cp.Connected())
select {
case req := <-cp.bootNotificationRequestC:
assert.Equal(t, "RebootedSpontaneous", req.ChargePointModel)
default:
t.Fatal("spontaneous BootNotification on new connection should notify reboot monitor")
}
}
```
If the method to reset state on a new connection is not named `onTransportConnect` or is not directly callable from this test, adjust the call `cp.onTransportConnect()` to whatever helper you use to simulate a new transport connection (or expose a small test helper that wraps the real connection-initialization logic). The key requirement is that this call resets `bootTriggered` and any other connection-scoped state exactly as a real new connection would.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The reboot-suppression logic in OnBootNotification relies on
*prev == *requestforcore.BootNotificationRequest; consider extracting this into a dedicated comparison helper that clearly documents which fields are treated as identity and avoids unintended coupling to future struct field additions. - The
bootTriggeredflag is now central to differentiating handshake vs. reboot; it may be worth adding a short comment near its field definition summarizing its lifecycle (set only by the trigger, cleared on boot/timeout/new connection) to make the concurrency assumptions and non-leakage across connections more apparent to future maintainers.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The reboot-suppression logic in OnBootNotification relies on `*prev == *request` for `core.BootNotificationRequest`; consider extracting this into a dedicated comparison helper that clearly documents which fields are treated as identity and avoids unintended coupling to future struct field additions.
- The `bootTriggered` flag is now central to differentiating handshake vs. reboot; it may be worth adding a short comment near its field definition summarizing its lifecycle (set only by the trigger, cleared on boot/timeout/new connection) to make the concurrency assumptions and non-leakage across connections more apparent to future maintainers.
## Individual Comments
### Comment 1
<location path="charger/ocpp/cp_core_test.go" line_range="273" />
<code_context>
+// TestTriggeredBootNotificationWithChangedIdentityNotifiesRebootMonitor covers a
+// charger that actually restarted (e.g. firmware update) but whose boot arrives as
+// a solicited one: the changed content must still trigger a setup re-initialization.
+func TestTriggeredBootNotificationWithChangedIdentityNotifiesRebootMonitor(t *testing.T) {
+ log := util.NewLogger("test")
+ cp := NewChargePoint(log, "test-cp")
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test for the case where a solicited BootNotification is the first-ever boot (no previous metadata)
The updated logic only suppresses reboot notifications when `bootTriggered && prev != nil && *prev == *request`. Current tests cover solicited boots with unchanged vs changed cached metadata. Please add a test for a first connection where `bootTriggered == true` and `BootNotificationResult` is `nil`, and assert that the reboot-monitor channel receives the request. This will lock in the behavior that first boots always notify the reboot monitor, distinguishing them from reconnects.
Suggested implementation:
```golang
func TestTriggeredBootNotificationFirstSolicitedBootNotifiesRebootMonitor(t *testing.T) {
log := util.NewLogger("test")
cp := NewChargePoint(log, "test-cp")
boot := core.BootNotificationRequest{
ChargePointModel: "Model",
ChargePointVendor: "TestVendor",
FirmwareVersion: "1.0.0",
}
// first-ever connection: no previous metadata, solicited boot
cp.BootNotificationResult = nil
cp.bootTriggered = true
// act: handle solicited boot notification
cp.handleBootNotificationRequest(context.Background(), &boot)
// assert: reboot monitor is notified
select {
case req := <-cp.bootNotificationRequestC:
if req.ChargePointModel != boot.ChargePointModel ||
req.ChargePointVendor != boot.ChargePointVendor ||
req.FirmwareVersion != boot.FirmwareVersion {
t.Fatalf("unexpected boot notification in reboot monitor channel: got %+v, want %+v", req, boot)
}
case <-time.After(time.Second):
t.Fatal("expected first solicited BootNotification to notify reboot monitor")
}
}
// TestTriggeredBootNotificationWithChangedIdentityNotifiesRebootMonitor covers a
// charger that actually restarted (e.g. firmware update) but whose boot arrives as
// a solicited one: the changed content must still trigger a setup re-initialization.
```
To integrate this test with the rest of your codebase:
1. Ensure `context` and `time` are imported in `cp_core_test.go` if they are not already:
- `import "context"`
- `import "time"`
2. Replace `cp.handleBootNotificationRequest(context.Background(), &boot)` with the actual helper/function you use in other tests to feed a `core.BootNotificationRequest` through the charge point logic (e.g., `cp.OnBootNotification`, `cp.HandleBootNotification`, etc.), matching its signature and usage pattern.
3. If `bootNotificationRequestC` is a channel of `core.BootNotificationRequest` rather than pointers, adjust the select-case accordingly (remove the pointer dereference and update the comparisons).
4. If you already have a shared timeout constant or helper for channel assertions in this test file, use that instead of `time.Second` for consistency.
</issue_to_address>
### Comment 2
<location path="charger/ocpp/cp_core_test.go" line_range="302" />
<code_context>
+// TestBootTimeoutClearsTriggeredFlag ensures a solicited BootNotification that never
+// arrives does not leak the bootTriggered flag past the connection: the boot timeout
+// must clear it so the next connection's spontaneous boot is treated as a reboot.
+func TestBootTimeoutClearsTriggeredFlag(t *testing.T) {
+ log := util.NewLogger("test")
+ cp := NewChargePoint(log, "test-cp")
</code_context>
<issue_to_address>
**nitpick (testing):** Avoid using a 1-hour timer in the test to reduce flakiness and stray goroutines
In `TestBootTimeoutClearsTriggeredFlag`, `time.AfterFunc(time.Hour, func() {})` is only used to obtain a `*time.Timer`, but it leaves a long-lived timer and goroutine that will fire later in the background. To avoid extra goroutines and potential flakiness, create a short-lived timer and stop it immediately (e.g. `timer := time.NewTimer(time.Second); timer.Stop()`), or use `time.AfterFunc(0, ...)` and stop it right away. The delay doesn’t matter here—only the timer’s identity does—so prefer a timer that doesn’t linger.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| // TestTriggeredBootNotificationWithChangedIdentityNotifiesRebootMonitor covers a | ||
| // charger that actually restarted (e.g. firmware update) but whose boot arrives as | ||
| // a solicited one: the changed content must still trigger a setup re-initialization. | ||
| func TestTriggeredBootNotificationWithChangedIdentityNotifiesRebootMonitor(t *testing.T) { |
There was a problem hiding this comment.
suggestion (testing): Add a test for the case where a solicited BootNotification is the first-ever boot (no previous metadata)
The updated logic only suppresses reboot notifications when bootTriggered && prev != nil && *prev == *request. Current tests cover solicited boots with unchanged vs changed cached metadata. Please add a test for a first connection where bootTriggered == true and BootNotificationResult is nil, and assert that the reboot-monitor channel receives the request. This will lock in the behavior that first boots always notify the reboot monitor, distinguishing them from reconnects.
Suggested implementation:
func TestTriggeredBootNotificationFirstSolicitedBootNotifiesRebootMonitor(t *testing.T) {
log := util.NewLogger("test")
cp := NewChargePoint(log, "test-cp")
boot := core.BootNotificationRequest{
ChargePointModel: "Model",
ChargePointVendor: "TestVendor",
FirmwareVersion: "1.0.0",
}
// first-ever connection: no previous metadata, solicited boot
cp.BootNotificationResult = nil
cp.bootTriggered = true
// act: handle solicited boot notification
cp.handleBootNotificationRequest(context.Background(), &boot)
// assert: reboot monitor is notified
select {
case req := <-cp.bootNotificationRequestC:
if req.ChargePointModel != boot.ChargePointModel ||
req.ChargePointVendor != boot.ChargePointVendor ||
req.FirmwareVersion != boot.FirmwareVersion {
t.Fatalf("unexpected boot notification in reboot monitor channel: got %+v, want %+v", req, boot)
}
case <-time.After(time.Second):
t.Fatal("expected first solicited BootNotification to notify reboot monitor")
}
}
// TestTriggeredBootNotificationWithChangedIdentityNotifiesRebootMonitor covers a
// charger that actually restarted (e.g. firmware update) but whose boot arrives as
// a solicited one: the changed content must still trigger a setup re-initialization.To integrate this test with the rest of your codebase:
- Ensure
contextandtimeare imported incp_core_test.goif they are not already:import "context"import "time"
- Replace
cp.handleBootNotificationRequest(context.Background(), &boot)with the actual helper/function you use in other tests to feed acore.BootNotificationRequestthrough the charge point logic (e.g.,cp.OnBootNotification,cp.HandleBootNotification, etc.), matching its signature and usage pattern. - If
bootNotificationRequestCis a channel ofcore.BootNotificationRequestrather than pointers, adjust the select-case accordingly (remove the pointer dereference and update the comparisons). - If you already have a shared timeout constant or helper for channel assertions in this test file, use that instead of
time.Secondfor consistency.
| // TestBootTimeoutClearsTriggeredFlag ensures a solicited BootNotification that never | ||
| // arrives does not leak the bootTriggered flag past the connection: the boot timeout | ||
| // must clear it so the next connection's spontaneous boot is treated as a reboot. | ||
| func TestBootTimeoutClearsTriggeredFlag(t *testing.T) { |
There was a problem hiding this comment.
nitpick (testing): Avoid using a 1-hour timer in the test to reduce flakiness and stray goroutines
In TestBootTimeoutClearsTriggeredFlag, time.AfterFunc(time.Hour, func() {}) is only used to obtain a *time.Timer, but it leaves a long-lived timer and goroutine that will fire later in the background. To avoid extra goroutines and potential flakiness, create a short-lived timer and stop it immediately (e.g. timer := time.NewTimer(time.Second); timer.Stop()), or use time.AfterFunc(0, ...) and stop it right away. The delay doesn’t matter here—only the timer’s identity does—so prefer a timer that doesn’t linger.
Problem
evcc waits for a BootNotification before treating a websocket session as ready and proactively triggers one (via
TriggerMessage, added in #28540) if the charger stays silent. The proactive trigger was skipped whenever BootNotification metadata from an earlier session was already cached (BootNotificationResult != nil).Some chargers (e.g. Webasto/Ampure NEXT) do not re-send BootNotification on a new websocket once they consider themselves accepted by the backend. With cached metadata present the trigger never fired, so the session stayed stuck until the full timeout fallback.
Fix
BootNotification readiness is tracked per websocket connection instead of using cached metadata as a proxy:
BootNotificationResult, so a reconnect is handled even when earlier metadata exists.Implementation
*time.Timerpointer itself identifies the connection. Delayed callbacks (trigger / timeout) capture their own timer and bail out oncecp.bootTimerno longer equals it (replaced bystopBootTimeron disconnect or by a neweronTransportConnect).bootTriggeredbool, set just before theTriggerMessageand consumed inOnBootNotification. Reset on every new connection and on boot timeout, so it cannot leak across connections.Reboot detection survives the suppression (content comparison)
Suppressing full setup run for solicited boots could otherwise hide a genuine restart whose BootNotification happens to arrive as the solicited one (e.g. a firmware update whose post-reboot boot lands after the proactive 5s trigger). To close this, a solicited BootNotification is only suppressed when its content is unchanged. If any identity field differs (firmware version, model, serial, …) the charger really restarted, so it is forwarded to the reboot monitor and setup re-runs — independent of timing.
Removed the redundant Setup BootNotification fallback
Setupcontained a second BootNotification trigger+wait that only ran whenBootNotificationResult == nil. That field is set on every BootNotification and never reset, so it can only be nil on the very firstSetupand only if the per-connection handshake above already timed out without any BootNotification — every reconnect and reboot-driven re-run sees it non-nil and skipped the block anyway. The handshake now reliably obtains the BootNotification (spontaneous or proactively triggered) beforeSetupruns, so this fallback covered no real gap. Removed; the only observable effect of a nil result is the diagnostics dump omitting vendor/model on that rare first-connect timeout path.Fixes part of #31158