WebSocket Relay
The relay provides an always-on channel from the edge daemon to the Galois cloud backend. When the backend cannot reach the edge via the Tailscale overlay network (e.g., the edge is behind NAT or Tailscale is unavailable), it uses this channel to push instrument commands.
Configuration
| Variable | Default | Description |
|---|---|---|
| RELAY_URL | derived | Explicit ws:// or wss:// URL. If empty, derived from BACKEND_URL. |
| BACKEND_URL | — | If set, relay URL is wss://<host>/api/v1/relay/ws. |
| REGISTRATION_TOKEN | — | Bearer token used to authenticate the relay connection. |
Set RELAY_URL="" explicitly to disable the relay entirely. If both
RELAY_URL and BACKEND_URL are empty the relay goroutine is never started.
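The derivation from BACKEND_URL can be sketched as follows. This is a minimal illustration, not the daemon's actual code: `deriveRelayURL` is a hypothetical helper, and the sketch assumes only the host (and port, if any) of BACKEND_URL is carried over.

```go
package main

import (
	"fmt"
	"net/url"
)

// deriveRelayURL builds wss://<host>/api/v1/relay/ws from a backend URL.
// Hypothetical helper; the daemon's real function name and checks may differ.
func deriveRelayURL(backendURL string) (string, error) {
	u, err := url.Parse(backendURL)
	if err != nil {
		return "", err
	}
	if u.Host == "" {
		return "", fmt.Errorf("backend URL %q has no host", backendURL)
	}
	// Only the host is reused; scheme and path are fixed by the relay endpoint.
	relay := url.URL{Scheme: "wss", Host: u.Host, Path: "/api/v1/relay/ws"}
	return relay.String(), nil
}

func main() {
	relay, err := deriveRelayURL("https://cloud.example.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(relay)
}
```

Any path or query string on BACKEND_URL is dropped in this sketch; whether the daemon does the same is an assumption.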
Auth header
The registration token is sent in the HTTP Authorization: Bearer <token>
header on the initial WebSocket Upgrade request. The token is not appended
to the URL query string, which avoids logging it in reverse-proxy access logs.
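A minimal sketch of building that header with the standard library is shown here. `upgradeHeader` is a hypothetical name, and which WebSocket client library the daemon passes the header to is not stated in this document.

```go
package main

import (
	"fmt"
	"net/http"
)

// upgradeHeader returns the headers to attach to the WebSocket Upgrade request.
// Hypothetical helper: the token travels in a header, never in the URL.
func upgradeHeader(token string) http.Header {
	h := http.Header{}
	h.Set("Authorization", "Bearer "+token)
	return h
}

func main() {
	h := upgradeHeader("example-token")
	// Print header names only, so the token value never reaches the output.
	for name := range h {
		fmt.Println(name)
	}
}
```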
Wire Protocol
All frames are JSON text messages (text WebSocket opcode). Every message has
a type field that selects the frame variant; all other fields use
omitempty, so unused keys are absent from the wire.
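To illustrate how a single envelope with omitempty fields produces different frame variants, here is a sketch using Go's encoding/json. The `Frame` struct is an assumption that covers only a subset of the documented fields; the daemon's real type may be laid out differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Frame is a hypothetical envelope with a subset of the documented fields.
// Only "type" is always emitted; the rest are dropped when empty.
type Frame struct {
	Type        string `json:"type"`
	EdgeID      string `json:"edge_id,omitempty"`
	EdgeName    string `json:"edge_name,omitempty"`
	Version     string `json:"version,omitempty"`
	TimestampMs int64  `json:"timestamp_ms,omitempty"`
}

// encode marshals a frame to the JSON text sent on the wire.
func encode(f Frame) string {
	b, err := json.Marshal(f)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// A heartbeat carries no edge fields, so none appear in the output.
	fmt.Println(encode(Frame{Type: "heartbeat", TimestampMs: 1746556800000}))
	// A hello carries no timestamp, so timestamp_ms is absent.
	fmt.Println(encode(Frame{Type: "hello", EdgeID: "edge-1", EdgeName: "pi5-lab", Version: "0.9.1"}))
}
```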
Handshake
- Daemon connects; backend validates token from the Authorization header.
- Daemon sends hello.
- Backend optionally sends hello_ack (see hello_ack).
- If the backend rejects the session it sends a close frame with a structured application code (see Unrecoverable close codes).
Message reference
hello (edge → backend)
Sent once immediately after the WebSocket handshake completes, before any other traffic.
{
  "type": "hello",
  "edge_id": "d3f1a2b4-...",
  "edge_name": "pi5-lab",
  "version": "0.9.1"
}
| Field | Type | Description |
|---|---|---|
| edge_id | string | Registration UUID of this edge. |
| edge_name | string | Human-readable edge name from config. |
| version | string | Daemon semver string. |
The auth token is carried in the HTTP Authorization header of the Upgrade
request, not in this frame.
heartbeat (edge → backend)
Sent every 30 s while the session is alive.
{
  "type": "heartbeat",
  "timestamp_ms": 1746556800000
}
| Field | Type | Description |
|---|---|---|
| timestamp_ms | int64 | Unix millisecond timestamp at time of send. |
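A periodic sender of this kind is commonly written with a ticker. The sketch below assumes a hypothetical `send` callback that writes one text frame; `runHeartbeat` and `demo` are illustrative names, and the demo uses a short interval so it finishes quickly.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"time"
)

// heartbeatInterval is the documented production interval.
const heartbeatInterval = 30 * time.Second

type heartbeat struct {
	Type        string `json:"type"`
	TimestampMs int64  `json:"timestamp_ms"`
}

// runHeartbeat calls send once per interval until ctx is cancelled or send fails.
func runHeartbeat(ctx context.Context, interval time.Duration, send func([]byte) error) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case now := <-ticker.C:
			b, err := json.Marshal(heartbeat{Type: "heartbeat", TimestampMs: now.UnixMilli()})
			if err != nil {
				return err
			}
			if err := send(b); err != nil {
				return err
			}
		}
	}
}

// demo runs the loop with a 5 ms interval and returns at least n captured frames.
func demo(n int) []string {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	var frames []string
	_ = runHeartbeat(ctx, 5*time.Millisecond, func(b []byte) error {
		frames = append(frames, string(b))
		if len(frames) >= n {
			cancel() // simulate the session ending
		}
		return nil
	})
	return frames
}

func main() {
	for _, f := range demo(3) {
		fmt.Println(f)
	}
}
```

In the daemon the loop would run with `heartbeatInterval` and stop when the session context ends.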
command_request (backend → edge)
Section titled “command_request (backend → edge)”The only inbound frame type the daemon acts on. All other unknown types are
logged at DEBUG and discarded without closing the session.
{ "type": "command_request", "request_id": "uuid-456", "instrument_id": "GPIB0::22::INSTR", "command_name": "measure_voltage", "parameters": {"range": "10"}, "is_query": true}| Field | Type | Description |
|---|---|---|
request_id | string | UUID; echoed back in the corresponding command_response. |
instrument_id | string | VISA resource string (e.g. GPIB0::22::INSTR). |
command_name | string | Name of the instrument capability to invoke. |
parameters | object | Key/value string map of command parameters. |
is_query | bool | true if a response value is expected. |
Each command_request is handled in its own goroutine. Slow instruments do
not block the heartbeat or other commands.
command_response (edge → backend)
Sent after each command_request is processed. One response per request,
always sent (success or failure).
{
  "type": "command_response",
  "request_id": "uuid-456",
  "success": true,
  "data": "1.234",
  "scpi_command": "MEAS:VOLT:DC?",
  "execution_time_ms": 45
}
On failure, success is false, data is absent, and error_message is
populated.
| Field | Type | Description |
|---|---|---|
| request_id | string | Echoed from the request. |
| success | bool | true if the command executed without error. |
| data | string | Result value (omitted if success is false). |
| error_message | string | Human-readable error (omitted if success is true). |
| scpi_command | string | Raw SCPI string that was sent (if applicable). |
| execution_time_ms | int64 | Wall-clock time from request receipt to response send. |
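A sketch of assembling the response for both outcomes follows. `buildResponse` and the struct are hypothetical; the sketch applies omitempty to every field except type, as the wire protocol section describes, which in Go means a false success is left off the wire and decodes as false on the receiving side. Whether the daemon's real struct behaves this way is an assumption.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// commandResponse mirrors the documented JSON keys; the struct is an assumption.
type commandResponse struct {
	Type            string `json:"type"`
	RequestID       string `json:"request_id,omitempty"`
	Success         bool   `json:"success,omitempty"`
	Data            string `json:"data,omitempty"`
	ErrorMessage    string `json:"error_message,omitempty"`
	ScpiCommand     string `json:"scpi_command,omitempty"`
	ExecutionTimeMs int64  `json:"execution_time_ms,omitempty"`
}

// buildResponse fills in either data or error_message, never both.
func buildResponse(requestID, scpi, data string, execErr error, elapsed time.Duration) commandResponse {
	resp := commandResponse{
		Type:            "command_response",
		RequestID:       requestID,
		ScpiCommand:     scpi,
		ExecutionTimeMs: elapsed.Milliseconds(),
	}
	if execErr != nil {
		resp.ErrorMessage = execErr.Error()
		return resp
	}
	resp.Success = true
	resp.Data = data
	return resp
}

// encode marshals a response; Marshal cannot fail for this plain struct.
func encode(r commandResponse) string {
	b, _ := json.Marshal(r)
	return string(b)
}

func demoSuccess() string {
	return encode(buildResponse("uuid-456", "MEAS:VOLT:DC?", "1.234", nil, 45*time.Millisecond))
}

func demoFailure() string {
	return encode(buildResponse("uuid-456", "MEAS:VOLT:DC?", "", errors.New("instrument timeout"), 45*time.Millisecond))
}

func main() {
	fmt.Println(demoSuccess())
	fmt.Println(demoFailure())
}
```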
hello_ack (optional, backend → edge)
An optional server-sent acknowledgement after the backend validates the
hello frame. Carries a session_id for log correlation.
{
  "type": "hello_ack",
  "session_id": "sess-abc123"
}
This feature requires both a daemon build with the relay_hello_ack build tag
and a matching backend that sends hello_ack. See
hello_ack feature flag below.
Reconnect Behavior
The relay reconnects automatically with exponential backoff:
- Formula: min(2s × 2^(attempt−1), 5 min) × (1 + rand(0, 0.25))
- Initial delay: 2 s
- Cap: 5 minutes
- Jitter: up to 25% extra (avoids thundering herd)
The backoff counter is reset to 0 after each clean session ends, so a daemon that runs successfully for hours reconnects promptly after a transient drop rather than waiting at the cap.
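The formula above can be sketched in Go as follows. `backoffDelay` is a hypothetical name; the jitter fraction is passed in as a parameter so the function stays deterministic, and the doubling loop stops at the cap to avoid integer overflow for large attempt counts.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

const (
	initialDelay = 2 * time.Second
	maxDelay     = 5 * time.Minute
	maxJitter    = 0.25
)

// backoffDelay returns min(initialDelay × 2^(attempt−1), maxDelay) × (1 + jitterFrac).
// jitterFrac is expected to lie in [0, maxJitter].
func backoffDelay(attempt int, jitterFrac float64) time.Duration {
	if attempt < 1 {
		attempt = 1
	}
	d := initialDelay
	// Double once per prior attempt, stopping early once the cap is reached.
	for i := 1; i < attempt && d < maxDelay; i++ {
		d *= 2
	}
	if d > maxDelay {
		d = maxDelay
	}
	return time.Duration(float64(d) * (1 + jitterFrac))
}

func main() {
	for attempt := 1; attempt <= 10; attempt++ {
		jitter := rand.Float64() * maxJitter
		fmt.Printf("attempt %d: wait %v\n", attempt, backoffDelay(attempt, jitter))
	}
}
```

With no jitter the delays run 2 s, 4 s, 8 s and so on, reaching the 5 minute cap at attempt 9 (2 s × 2^8 = 512 s, which exceeds 300 s).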
Unrecoverable Close Codes
When the backend closes the WebSocket with one of the following codes the daemon logs an error and does not retry. Operator action is required.
| Code | Meaning | Daemon action |
|---|---|---|
| 4401 | Bad or expired registration token | Log ERROR, exit relay goroutine |
| 4403 | Edge not registered on this backend | Log ERROR, exit relay goroutine |
| 4426 | Protocol version mismatch | Log ERROR, exit relay goroutine |
| 1008 | RFC 6455 policy violation (catch-all auth failure) | Log ERROR, exit relay goroutine |
For all other close codes the daemon reconnects with backoff.
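The retry decision reduces to a lookup on the close code. A minimal sketch, with `shouldReconnect` as a hypothetical name:

```go
package main

import "fmt"

// shouldReconnect reports whether the daemon retries after a close with the
// given code. The four listed codes are unrecoverable; everything else retries.
func shouldReconnect(code int) bool {
	switch code {
	case 4401, 4403, 4426, 1008:
		return false
	default:
		return true
	}
}

func main() {
	for _, code := range []int{1000, 1006, 1008, 4401, 4403, 4426} {
		fmt.Printf("close code %d: reconnect=%v\n", code, shouldReconnect(code))
	}
}
```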
hello_ack Feature Flag
The hello_ack handshake step is gated behind a Go build tag because it
requires a coordinated backend change. The default build does not wait for
hello_ack and remains backward-compatible with older backends.
To enable the feature:
# Build the daemon with hello_ack support
go build -tags relay_hello_ack ./cmd/galois-edge
# Run tests with the feature enabled
go test -tags relay_hello_ack ./internal/relay/...
When enabled, the daemon waits up to 10 s for hello_ack after sending
hello. If the first frame is not hello_ack, the session is closed and
retried with backoff. This gives the backend a hook to return a session ID for
log correlation and a place to signal session-level errors after a successful
WebSocket upgrade.
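The wait for the first frame can be sketched with a timer and a channel of inbound frames. `awaitHelloAck`, the channel-based reader, and the error values are all assumptions made for illustration; the daemon's real read path is not described here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// helloAckTimeout is the documented wait after sending hello.
const helloAckTimeout = 10 * time.Second

var (
	errNotHelloAck     = errors.New("first frame was not hello_ack")
	errHelloAckTimeout = errors.New("timed out waiting for hello_ack")
	errClosed          = errors.New("connection closed before hello_ack")
)

type helloAck struct {
	Type      string `json:"type"`
	SessionID string `json:"session_id,omitempty"`
}

// awaitHelloAck inspects only the first inbound frame. Any error returned here
// would lead the caller to close the session and retry with backoff.
func awaitHelloAck(frames <-chan []byte, timeout time.Duration) (string, error) {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case raw, ok := <-frames:
		if !ok {
			return "", errClosed
		}
		var ack helloAck
		if err := json.Unmarshal(raw, &ack); err != nil || ack.Type != "hello_ack" {
			return "", errNotHelloAck
		}
		return ack.SessionID, nil
	case <-timer.C:
		return "", errHelloAckTimeout
	}
}

func main() {
	frames := make(chan []byte, 1)
	frames <- []byte(`{"type":"hello_ack","session_id":"sess-abc123"}`)
	id, err := awaitHelloAck(frames, helloAckTimeout)
	fmt.Println("session:", id, "err:", err)
}
```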
Backend coordination required. Before flipping this tag in production:
- Deploy a backend version that sends {"type":"hello_ack","session_id":"..."} after validating hello.
- Roll out the daemon build with -tags relay_hello_ack.
- Verify session_id appears in daemon logs and cloud logs for correlation.
Security Notes
- The registration token is never logged by the daemon.
- The token is not appended to the URL query string, avoiding exposure in reverse-proxy logs.
- In-flight command_request goroutines are cancelled when the relay session ends (e.g., on auth failure), preventing continued gRPC dials against a dead socket.
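Cancelling in-flight handlers when a session ends is typically done with a shared context. The sketch below assumes each handler receives a per-session context; `handleCommand` stands in for the real instrument call and `demo` is a hypothetical driver.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// handleCommand stands in for a slow instrument call. It returns early with
// the context error when the session context is cancelled.
func handleCommand(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// demo starts n handlers under one session context, ends the session, and
// returns how many handlers stopped because of the cancellation.
func demo(n int) int {
	sessionCtx, endSession := context.WithCancel(context.Background())
	var (
		wg        sync.WaitGroup
		mu        sync.Mutex
		cancelled int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := handleCommand(sessionCtx, time.Minute); err != nil {
				mu.Lock()
				cancelled++
				mu.Unlock()
			}
		}()
	}
	endSession() // e.g. the backend closed the socket with an auth failure
	wg.Wait()
	return cancelled
}

func main() {
	fmt.Println("handlers cancelled:", demo(3))
}
```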