Skip to content

OCPP: per-connection BootNotification handshake on reconnect#31192

Open
premultiply wants to merge 5 commits into
masterfrom
ocpp-bootnotification-reconnect
Open

OCPP: per-connection BootNotification handshake on reconnect#31192
premultiply wants to merge 5 commits into
masterfrom
ocpp-bootnotification-reconnect

Conversation

@premultiply

@premultiply premultiply commented Jun 24, 2026

Copy link
Copy Markdown
Member

Problem

evcc waits for a BootNotification before treating a websocket session as ready and proactively triggers one (via TriggerMessage, added in #28540) if the charger stays silent. The proactive trigger was skipped whenever BootNotification metadata from an earlier session was already cached (BootNotificationResult != nil).

Some chargers (e.g. Webasto/Ampure NEXT) do not re-send BootNotification on a new websocket once they consider themselves accepted by the backend. With cached metadata present the trigger never fired, so the session stayed stuck until the full timeout fallback.

Fix

BootNotification readiness is tracked per websocket connection instead of using cached metadata as a proxy:

  • The proactive BootNotification trigger now fires based on the current connection's boot timer, independent of any cached BootNotificationResult, so a reconnect is handled even when earlier metadata exists.
  • A BootNotification that evcc solicited this way is only a connection handshake, not a real charger reboot, and is no longer forwarded to the reboot monitor — so it no longer causes unnecessary setup reinitialization on every reconnect.

Implementation

  • The *time.Timer pointer itself identifies the connection. Delayed callbacks (trigger / timeout) capture their own timer and bail out once cp.bootTimer no longer equals it (replaced by stopBootTimer on disconnect or by a newer onTransportConnect).
  • A single bootTriggered bool, set just before the TriggerMessage and consumed in OnBootNotification. Reset on every new connection and on boot timeout, so it cannot leak across connections.

Reboot detection survives the suppression (content comparison)

Suppressing full setup run for solicited boots could otherwise hide a genuine restart whose BootNotification happens to arrive as the solicited one (e.g. a firmware update whose post-reboot boot lands after the proactive 5s trigger). To close this, a solicited BootNotification is only suppressed when its content is unchanged. If any identity field differs (firmware version, model, serial, …) the charger really restarted, so it is forwarded to the reboot monitor and setup re-runs — independent of timing.

BootNotification content vs. previous result
spontaneous re-init (real reboot)
solicited unchanged suppressed (plain reconnect)
solicited changed (e.g. new firmware) re-init

Removed the redundant Setup BootNotification fallback

Setup contained a second BootNotification trigger+wait that only ran when BootNotificationResult == nil. That field is set on every BootNotification and never reset, so it can only be nil on the very first Setup and only if the per-connection handshake above already timed out without any BootNotification — every reconnect and reboot-driven re-run sees it non-nil and skipped the block anyway. The handshake now reliably obtains the BootNotification (spontaneous or proactively triggered) before Setup runs, so this fallback covered no real gap. Removed; the only observable effect of a nil result is the diagnostics dump omitting vendor/model on that rare first-connect timeout path.

Fixes part of #31158

Copilot AI review requested due to automatic review settings June 24, 2026 21:18

This comment was marked as low quality.

@premultiply premultiply marked this pull request as draft June 24, 2026 21:18

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 2 issues, and left some high level feedback:

  • The use of the *time.Timer pointer as a connection identifier in onTransportConnect/onBootTimeout is subtle; consider introducing an explicit connection generation token (e.g. an incrementing int) to make the staleness check easier to reason about and less tightly coupled to the timer implementation.
  • Access to bootTriggered is now spread across multiple call sites with manual lock handling; it might be clearer and less error-prone to encapsulate this flag behind small helper methods that consistently enforce locking and lifecycle (set on trigger, clear on consumption/timeout).
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The use of the *time.Timer pointer as a connection identifier in onTransportConnect/onBootTimeout is subtle; consider introducing an explicit connection generation token (e.g. an incrementing int) to make the staleness check easier to reason about and less tightly coupled to the timer implementation.
- Access to bootTriggered is now spread across multiple call sites with manual lock handling; it might be clearer and less error-prone to encapsulate this flag behind small helper methods that consistently enforce locking and lifecycle (set on trigger, clear on consumption/timeout).

## Individual Comments

### Comment 1
<location path="charger/ocpp/cp_core_test.go" line_range="250-256" />
<code_context>
 		"setup should be called exactly once")
 }
+
+func TestTriggeredBootNotificationDoesNotNotifyRebootMonitor(t *testing.T) {
+	log := util.NewLogger("test")
+	cp := NewChargePoint(log, "test-cp")
+	// metadata from a previous connection plus a solicited (triggered) BootNotification
+	cp.BootNotificationResult = &core.BootNotificationRequest{ChargePointModel: "Cached"}
+	cp.bootTriggered = true
+
+	_, err := cp.OnBootNotification(&core.BootNotificationRequest{
+		ChargePointModel:  "Triggered",
+		ChargePointVendor: "TestVendor",
+	})
+	require.NoError(t, err)
+	assert.True(t, cp.Connected())
+	assert.False(t, cp.bootTriggered, "flag should be consumed")
+
+	select {
+	case req := <-cp.bootNotificationRequestC:
+		t.Fatalf("triggered BootNotification should not notify reboot monitor, got %s", req.ChargePointModel)
</code_context>
<issue_to_address>
**suggestion (testing):** Add an assertion that BootNotificationResult is updated by the triggered BootNotification

This test verifies that triggered BootNotifications don’t notify the reboot monitor and that `bootTriggered` is cleared, but it never checks that `BootNotificationResult` is refreshed with the new request. Please add an assertion like `assert.Equal(t, "Triggered", cp.BootNotificationResult.ChargePointModel)` so the test also validates the cached metadata is updated for solicited BootNotifications.

```suggestion
	_, err := cp.OnBootNotification(&core.BootNotificationRequest{
		ChargePointModel:  "Triggered",
		ChargePointVendor: "TestVendor",
	})
	require.NoError(t, err)
	assert.True(t, cp.Connected())
	assert.False(t, cp.bootTriggered, "flag should be consumed")
	assert.NotNil(t, cp.BootNotificationResult, "BootNotificationResult should be set")
	assert.Equal(t, "Triggered", cp.BootNotificationResult.ChargePointModel, "triggered BootNotification should refresh cached metadata")
```
</issue_to_address>

### Comment 2
<location path="charger/ocpp/cp_core_test.go" line_range="265-261" />
<code_context>
+	}
+}
+
+func TestSpontaneousBootNotificationNotifiesRebootMonitor(t *testing.T) {
+	log := util.NewLogger("test")
+	cp := NewChargePoint(log, "test-cp")
+
+	_, err := cp.OnBootNotification(&core.BootNotificationRequest{
+		ChargePointModel:  "Rebooted",
+		ChargePointVendor: "TestVendor",
+	})
+	require.NoError(t, err)
+	assert.True(t, cp.Connected())
+
+	select {
+	case req := <-cp.bootNotificationRequestC:
+		assert.Equal(t, "Rebooted", req.ChargePointModel)
+	default:
+		t.Fatal("spontaneous BootNotification should notify reboot monitor")
+	}
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test for a solicited BootNotification on a previous connection followed by a spontaneous BootNotification on a new connection

Current tests cover (a) solicited BootNotification not notifying the reboot monitor and (b) purely spontaneous BootNotification doing so. Please also cover the case where a solicited BootNotification occurs on an earlier connection (so `bootTriggered` is set and consumed), then the charge point reconnects and sends a new spontaneous BootNotification that must notify the reboot monitor. This ensures `bootTriggered` does not leak across connections. You can simulate this by: first marking `bootTriggered` and calling `OnBootNotification` with no reboot notification, then simulating a new connection (resetting state as `onTransportConnect` would) and calling `OnBootNotification` again, asserting the reboot monitor is notified.

Suggested implementation:

```golang
func TestSpontaneousBootNotificationNotifiesRebootMonitor(t *testing.T) {
	log := util.NewLogger("test")
	cp := NewChargePoint(log, "test-cp")

	_, err := cp.OnBootNotification(&core.BootNotificationRequest{
		ChargePointModel:  "Rebooted",
		ChargePointVendor: "TestVendor",
	})
	require.NoError(t, err)
	assert.True(t, cp.Connected())

	select {
	case req := <-cp.bootNotificationRequestC:
		assert.Equal(t, "Rebooted", req.ChargePointModel)
	default:
		t.Fatal("spontaneous BootNotification should notify reboot monitor")
	}
}

func TestSolicitedBootOnPreviousConnectionThenSpontaneousBootOnNewConnectionNotifiesRebootMonitor(t *testing.T) {
	log := util.NewLogger("test")
	cp := NewChargePoint(log, "test-cp")

	// First connection: simulate a solicited BootNotification
	cp.bootTriggered = true

	_, err := cp.OnBootNotification(&core.BootNotificationRequest{
		ChargePointModel:  "InitialSolicited",
		ChargePointVendor: "TestVendor",
	})
	require.NoError(t, err)
	assert.True(t, cp.Connected())
	assert.False(t, cp.bootTriggered, "bootTriggered flag should be consumed after solicited BootNotification")

	select {
	case req := <-cp.bootNotificationRequestC:
		t.Fatalf("solicited BootNotification should not notify reboot monitor, got %s", req.ChargePointModel)
	default:
	}

	// Simulate a new transport connection, which should reset connection-scoped state
	cp.onTransportConnect()

	// New connection: spontaneous BootNotification must notify reboot monitor
	_, err = cp.OnBootNotification(&core.BootNotificationRequest{
		ChargePointModel:  "RebootedSpontaneous",
		ChargePointVendor: "TestVendor",
	})

```

```golang
	require.NoError(t, err)
	assert.True(t, cp.Connected())

	select {
	case req := <-cp.bootNotificationRequestC:
		assert.Equal(t, "RebootedSpontaneous", req.ChargePointModel)
	default:
		t.Fatal("spontaneous BootNotification on new connection should notify reboot monitor")
	}
}

```

If the method to reset state on a new connection is not named `onTransportConnect` or is not directly callable from this test, adjust the call `cp.onTransportConnect()` to whatever helper you use to simulate a new transport connection (or expose a small test helper that wraps the real connection-initialization logic). The key requirement is that this call resets `bootTriggered` and any other connection-scoped state exactly as a real new connection would.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread charger/ocpp/cp_core_test.go Outdated
Comment thread charger/ocpp/cp_core_test.go
@premultiply premultiply changed the title OCPP: fix BootNotification handshake on reconnect OCPP: per-connection BootNotification handshake on reconnect Jun 24, 2026
@premultiply premultiply added the devices Specific device support label Jun 24, 2026
@premultiply premultiply marked this pull request as ready for review June 24, 2026 22:45

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 2 issues, and left some high level feedback:

  • The reboot-suppression logic in OnBootNotification relies on *prev == *request for core.BootNotificationRequest; consider extracting this into a dedicated comparison helper that clearly documents which fields are treated as identity and avoids unintended coupling to future struct field additions.
  • The bootTriggered flag is now central to differentiating handshake vs. reboot; it may be worth adding a short comment near its field definition summarizing its lifecycle (set only by the trigger, cleared on boot/timeout/new connection) to make the concurrency assumptions and non-leakage across connections more apparent to future maintainers.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The reboot-suppression logic in OnBootNotification relies on `*prev == *request` for `core.BootNotificationRequest`; consider extracting this into a dedicated comparison helper that clearly documents which fields are treated as identity and avoids unintended coupling to future struct field additions.
- The `bootTriggered` flag is now central to differentiating handshake vs. reboot; it may be worth adding a short comment near its field definition summarizing its lifecycle (set only by the trigger, cleared on boot/timeout/new connection) to make the concurrency assumptions and non-leakage across connections more apparent to future maintainers.

## Individual Comments

### Comment 1
<location path="charger/ocpp/cp_core_test.go" line_range="273" />
<code_context>
+// TestTriggeredBootNotificationWithChangedIdentityNotifiesRebootMonitor covers a
+// charger that actually restarted (e.g. firmware update) but whose boot arrives as
+// a solicited one: the changed content must still trigger a setup re-initialization.
+func TestTriggeredBootNotificationWithChangedIdentityNotifiesRebootMonitor(t *testing.T) {
+	log := util.NewLogger("test")
+	cp := NewChargePoint(log, "test-cp")
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test for the case where a solicited BootNotification is the first-ever boot (no previous metadata)

The updated logic only suppresses reboot notifications when `bootTriggered && prev != nil && *prev == *request`. Current tests cover solicited boots with unchanged vs changed cached metadata. Please add a test for a first connection where `bootTriggered == true` and `BootNotificationResult` is `nil`, and assert that the reboot-monitor channel receives the request. This will lock in the behavior that first boots always notify the reboot monitor, distinguishing them from reconnects.

Suggested implementation:

```golang
func TestTriggeredBootNotificationFirstSolicitedBootNotifiesRebootMonitor(t *testing.T) {
	log := util.NewLogger("test")
	cp := NewChargePoint(log, "test-cp")

	boot := core.BootNotificationRequest{
		ChargePointModel:  "Model",
		ChargePointVendor: "TestVendor",
		FirmwareVersion:   "1.0.0",
	}

	// first-ever connection: no previous metadata, solicited boot
	cp.BootNotificationResult = nil
	cp.bootTriggered = true

	// act: handle solicited boot notification
	cp.handleBootNotificationRequest(context.Background(), &boot)

	// assert: reboot monitor is notified
	select {
	case req := <-cp.bootNotificationRequestC:
		if req.ChargePointModel != boot.ChargePointModel ||
			req.ChargePointVendor != boot.ChargePointVendor ||
			req.FirmwareVersion != boot.FirmwareVersion {
			t.Fatalf("unexpected boot notification in reboot monitor channel: got %+v, want %+v", req, boot)
		}
	case <-time.After(time.Second):
		t.Fatal("expected first solicited BootNotification to notify reboot monitor")
	}
}

// TestTriggeredBootNotificationWithChangedIdentityNotifiesRebootMonitor covers a
// charger that actually restarted (e.g. firmware update) but whose boot arrives as
// a solicited one: the changed content must still trigger a setup re-initialization.

```

To integrate this test with the rest of your codebase:

1. Ensure `context` and `time` are imported in `cp_core_test.go` if they are not already:
   - `import "context"`
   - `import "time"`
2. Replace `cp.handleBootNotificationRequest(context.Background(), &boot)` with the actual helper/function you use in other tests to feed a `core.BootNotificationRequest` through the charge point logic (e.g., `cp.OnBootNotification`, `cp.HandleBootNotification`, etc.), matching its signature and usage pattern.
3. If `bootNotificationRequestC` is a channel of `core.BootNotificationRequest` rather than pointers, adjust the select-case accordingly (remove the pointer dereference and update the comparisons).
4. If you already have a shared timeout constant or helper for channel assertions in this test file, use that instead of `time.Second` for consistency.
</issue_to_address>

### Comment 2
<location path="charger/ocpp/cp_core_test.go" line_range="302" />
<code_context>
+// TestBootTimeoutClearsTriggeredFlag ensures a solicited BootNotification that never
+// arrives does not leak the bootTriggered flag past the connection: the boot timeout
+// must clear it so the next connection's spontaneous boot is treated as a reboot.
+func TestBootTimeoutClearsTriggeredFlag(t *testing.T) {
+	log := util.NewLogger("test")
+	cp := NewChargePoint(log, "test-cp")
</code_context>
<issue_to_address>
**nitpick (testing):** Avoid using a 1-hour timer in the test to reduce flakiness and stray goroutines

In `TestBootTimeoutClearsTriggeredFlag`, `time.AfterFunc(time.Hour, func() {})` is only used to obtain a `*time.Timer`, but it leaves a long-lived timer and goroutine that will fire later in the background. To avoid extra goroutines and potential flakiness, create a short-lived timer and stop it immediately (e.g. `timer := time.NewTimer(time.Second); timer.Stop()`), or use `time.AfterFunc(0, ...)` and stop it right away. The delay doesn’t matter here—only the timer’s identity does—so prefer a timer that doesn’t linger.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

// TestTriggeredBootNotificationWithChangedIdentityNotifiesRebootMonitor covers a
// charger that actually restarted (e.g. firmware update) but whose boot arrives as
// a solicited one: the changed content must still trigger a setup re-initialization.
func TestTriggeredBootNotificationWithChangedIdentityNotifiesRebootMonitor(t *testing.T) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add a test for the case where a solicited BootNotification is the first-ever boot (no previous metadata)

The updated logic only suppresses reboot notifications when bootTriggered && prev != nil && *prev == *request. Current tests cover solicited boots with unchanged vs changed cached metadata. Please add a test for a first connection where bootTriggered == true and BootNotificationResult is nil, and assert that the reboot-monitor channel receives the request. This will lock in the behavior that first boots always notify the reboot monitor, distinguishing them from reconnects.

Suggested implementation:

func TestTriggeredBootNotificationFirstSolicitedBootNotifiesRebootMonitor(t *testing.T) {
	log := util.NewLogger("test")
	cp := NewChargePoint(log, "test-cp")

	boot := core.BootNotificationRequest{
		ChargePointModel:  "Model",
		ChargePointVendor: "TestVendor",
		FirmwareVersion:   "1.0.0",
	}

	// first-ever connection: no previous metadata, solicited boot
	cp.BootNotificationResult = nil
	cp.bootTriggered = true

	// act: handle solicited boot notification
	cp.handleBootNotificationRequest(context.Background(), &boot)

	// assert: reboot monitor is notified
	select {
	case req := <-cp.bootNotificationRequestC:
		if req.ChargePointModel != boot.ChargePointModel ||
			req.ChargePointVendor != boot.ChargePointVendor ||
			req.FirmwareVersion != boot.FirmwareVersion {
			t.Fatalf("unexpected boot notification in reboot monitor channel: got %+v, want %+v", req, boot)
		}
	case <-time.After(time.Second):
		t.Fatal("expected first solicited BootNotification to notify reboot monitor")
	}
}

// TestTriggeredBootNotificationWithChangedIdentityNotifiesRebootMonitor covers a
// charger that actually restarted (e.g. firmware update) but whose boot arrives as
// a solicited one: the changed content must still trigger a setup re-initialization.

To integrate this test with the rest of your codebase:

  1. Ensure context and time are imported in cp_core_test.go if they are not already:
    • import "context"
    • import "time"
  2. Replace cp.handleBootNotificationRequest(context.Background(), &boot) with the actual helper/function you use in other tests to feed a core.BootNotificationRequest through the charge point logic (e.g., cp.OnBootNotification, cp.HandleBootNotification, etc.), matching its signature and usage pattern.
  3. If bootNotificationRequestC is a channel of core.BootNotificationRequest rather than pointers, adjust the select-case accordingly (remove the pointer dereference and update the comparisons).
  4. If you already have a shared timeout constant or helper for channel assertions in this test file, use that instead of time.Second for consistency.

// TestBootTimeoutClearsTriggeredFlag ensures a solicited BootNotification that never
// arrives does not leak the bootTriggered flag past the connection: the boot timeout
// must clear it so the next connection's spontaneous boot is treated as a reboot.
func TestBootTimeoutClearsTriggeredFlag(t *testing.T) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick (testing): Avoid using a 1-hour timer in the test to reduce flakiness and stray goroutines

In TestBootTimeoutClearsTriggeredFlag, time.AfterFunc(time.Hour, func() {}) is only used to obtain a *time.Timer, but it leaves a long-lived timer and goroutine that will fire later in the background. To avoid extra goroutines and potential flakiness, create a short-lived timer and stop it immediately (e.g. timer := time.NewTimer(time.Second); timer.Stop()), or use time.AfterFunc(0, ...) and stop it right away. The delay doesn’t matter here—only the timer’s identity does—so prefer a timer that doesn’t linger.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

devices Specific device support

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants