MESH ONLINE
CODENAME: Purple Rain

Net v0.27.7 — "Purple Rain"

A NAT-traversal & port-discovery security-hardening release. Two focused code reviews — the rendezvous/hole-punch runtime (adapter/net/traversal/** + mesh.rs) and the port-discovery surfaces (UPnP-IGD + the NAT-classification reflex sweep) — surfaced 13 findings; 11 landed across 17 commits and three review rounds, with the remaining 2 deliberately deferred. The headline fix closes an unauthenticated UDP-reflection vector on the rendezvous coordinator; the rest harden the NAT-classification sweep against flapping and tighten the discovery surfaces.

No wire-format change, no C-ABI change, no public API change. Drop-in against honest v0.27.6 and earlier peers. Full audits: docs/misc/CODE_REVIEW_2026_06_21_NAT_TRAVERSAL.md and docs/misc/CODE_REVIEW_2026_06_21_PORT_SCANNING.md.

[!IMPORTANT] NAT traversal is an optimization, not a correctness contract, and that held under review. Every finding below falls back to the routed handshake; none of them can corrupt or block a session. The fixes matter most to nodes that hole-punch through a public rendezvous coordinator: the reflector was reachable by an authenticated mesh peer, and its value to an attacker was source obfuscation rather than amplification.

Highlights

  • Closed the coordinator-mediated UDP reflector: PunchRequest.self_reflex is now bound to the requester's session source IP, so an attacker can no longer name a third-party victim address for the responder to fire keep-alives at.
  • Made NAT classification single-flight (SweepGuard) and flap-resistant — a sweep with fewer than two successful probes no longer downgrades a good Cone/Open to Unknown.
  • Validated the keep-alive sender_node_id (the documented-but-dead anti-spoof field) and clamped the trusted fire_at_ms so a malicious coordinator can't park keep-alive sender tasks unbounded.
  • Hardened the UPnP-IGD discovery wrapper (local-IP contract enforced in debug + release) plus a set of docstring-accuracy corrections across the classification FSM and reflex-echo paths.
  • Confirmed the dual-use "scanner" question: none of the port-discovery surfaces can be aimed at an attacker-chosen host — SSDP hits only the fixed multicast group, NAT-PMP only the OS routing-table gateway, reflex probes only authenticated connected peers.
  • Rust changes are cargo check + clippy clean (default and port-mapping configs) with per-finding regression tests; new suites in punch_keepalive.rs and rendezvous_coordinator.rs.

High-severity fixes

  • Rendezvous UDP reflector — bind self_reflex to the requester's session IP (Medium). The punch coordinator took requester A's reflex verbatim from the unsigned PunchRequest body and forwarded it to responder B as the keep-alive target — A could put a victim's ip:port there, and B would emit three UDP keep-alives at it with A's identity hidden behind B. handle_punch_request now resolves A's session up-front and drops the request when req.self_reflex.ip() != a_addr.ip(), before the reflex is ever read. Symmetric-NAT port shifts stay honoured (the guard keys on IP only). This closes the coordinator-mediated path; the direct unsolicited-introduce path is tracked separately as deferred Finding 4 below.
  • NAT classification single-flight (SweepGuard) (Medium). reclassify_nat's docstring promised "at most one sweep at a time," but the body had no guard — and the method is pub and FFI-exported (net_mesh_reclassify_nat). An operator call concurrent with the background tick ran two sweeps that collided on the pending_reflex_probes map, silently starving the earlier sweep to a ReflexTimeout. Now serialized by an AtomicBool compare-exchange. (A follow-up review note clarified the gate is sweep-level — a standalone probe_reflex can still race a sweep's probe, but the collision is benign: waiters are generation-stamped, the loser resolves as ReflexTimeout, and the sub-2-observation guard below keeps such a sweep on its prior class rather than flapping.)
  • Sub-2-observation sweep no longer downgrades a good class to Unknown (Medium). A sweep that landed only one successful probe (routine packet loss on a two-peer sweep) fed a single observation into the FSM, which returns Unknown for < 2 observations — and the commit path only guarded latest_reflex == None, so it overwrote a previously-good Cone/Open with nat:unknown for a ~60 s window. The commit now keeps prior state when successful observations < 2, mirroring the existing deadline-expired anti-flap branch. A torn-input guard ((class, None) claiming ≥2 observations) is retained as defense-in-depth and pinned by test.
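
The two classification fixes above (the single-flight gate and the sub-2-observation commit rule) can be sketched as follows. This is a minimal illustration under stated assumptions, not the shipped code: only the SweepGuard name and the AtomicBool compare-exchange come from these notes, while try_acquire, commit_class, and the shape of NatClass are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// NAT classes named in these notes; the real enum may differ (assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum NatClass {
    Open,
    Cone,
    Symmetric,
    Unknown,
}

/// RAII single-flight gate: at most one holder at a time.
struct SweepGuard<'a> {
    flag: &'a AtomicBool,
}

impl<'a> SweepGuard<'a> {
    /// Hypothetical constructor: returns `None` when another sweep
    /// already holds the flag, so the caller can skip its sweep.
    fn try_acquire(flag: &'a AtomicBool) -> Option<Self> {
        flag.compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .ok()
            .map(|_| SweepGuard { flag })
    }
}

impl Drop for SweepGuard<'_> {
    /// Releasing the guard re-opens the gate, including on early return.
    fn drop(&mut self) {
        self.flag.store(false, Ordering::Release);
    }
}

/// Hypothetical commit rule: with fewer than two successful observations
/// the sweep cannot classify, so the prior class is kept instead of being
/// overwritten by the FSM's `Unknown`.
fn commit_class(prior: NatClass, candidate: NatClass, successful: usize) -> NatClass {
    if successful < 2 {
        prior
    } else {
        candidate
    }
}
```

With this shape, a second try_acquire on the same flag returns None until the first guard is dropped, and commit_class(NatClass::Cone, NatClass::Unknown, 1) stays at Cone rather than flapping to Unknown.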

Other fixes (MEDIUM / LOW)

  • rendezvous keep-alive: the sender_node_id field, documented as load-bearing against "a stray packet on the right source addr falsely signalling punch-succeeded," was decoded but never validated. It is now checked in the receive loop before the observer is removed: punch_observers carry the expected counterpart node_id, and a remove_if fires only on a match. (A first cut put the check after the observer fired; cubic flagged it as a P2 DoS, since a single stray packet burned the observer and failed the punch permanently. The check was therefore moved ahead of removal: a wrong-sender packet is now dropped without consuming the observer, and a later valid keep-alive still completes the punch.)
  • rendezvous fire_at_ms clamp — the offset math was extracted into a pure, unit-tested keepalive_send_offsets(fire_at_ms, now_ms, deadline) that clamps base_lead to punch_deadline and uses saturating adds for the +100/+250 ms spacing. A malicious or buggy coordinator can no longer park a keep-alive sender task (holding a socket Arc + payload) for an unbounded duration, and the Instant + Duration overflow panic risk is gone. Far-future and u64::MAX inputs covered for both clamping and panic-freedom.
  • UPnP-IGD discovery: UpnpMapper::new now enforces its documented local_ip contract (debug_assert in dev, tracing::warn! in release) so a non-routable bind IP can't silently produce a route-nowhere mapping; add_port_err_to_port_mapping (dead in production, since install uses add_any_port) is marked test-only; and the intentional get_external_ip re-read on install is documented (the WAN IP can change between probe and install).
  • classification / reflex docstring accuracy — corrected the over-claims that drew the review: a wildcard bind (0.0.0.0) behind a port-preserving restricted-cone NAT is now documented as potentially over-classified Open (advertises nat:open → peer picks Direct → drops the unsolicited inbound → relay fallback; correctness holds, optimization lost); peer selection's lack of destination diversity (two node ids on one public IP can misread a symmetric NAT as Cone) is now caveated; and the reflex echo is documented as using the cached authenticated handshake addr (spoof-resistant, but stale on a mid-session NAT rebind) rather than the live — spoofable — packet source.
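
The fire_at_ms clamp described above can be sketched as a pure function. This is an illustration under assumptions rather than the shipped implementation: the function name and its three parameters come from these notes, but the return type, the exact rounding, and the handling of a fire time already in the past are guesses.

```rust
use std::time::Duration;

/// Sketch of the clamped offset math. Returns the delay before each of
/// the three keep-alives, relative to "now" (return shape is an assumption).
fn keepalive_send_offsets(fire_at_ms: u64, now_ms: u64, deadline: Duration) -> [Duration; 3] {
    // Lead time until the coordinator-supplied fire time; a fire time in
    // the past saturates to zero instead of underflowing.
    let lead = Duration::from_millis(fire_at_ms.saturating_sub(now_ms));

    // Clamp to the punch deadline so a far-future fire_at_ms cannot park
    // the sender task for longer than the punch itself is allowed to run.
    let base_lead = lead.min(deadline);

    // Saturating adds keep the +100 ms / +250 ms spacing panic-free even
    // for extreme inputs.
    [
        base_lead,
        base_lead.saturating_add(Duration::from_millis(100)),
        base_lead.saturating_add(Duration::from_millis(250)),
    ]
}
```

Under these assumptions, a coordinator that sends fire_at_ms = u64::MAX against a 5 s deadline yields offsets of 5 s, 5.1 s, and 5.25 s, which matches the "≤ punch_deadline + 250 ms" bound quoted for Finding 4 below.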

Investigated / deferred (not shipped)

  • Finding 4 — the direct unsolicited PunchIntroduce reflector (Low–Medium, Open). Finding 1's fix guards only the coordinator-mediated path. An authenticated session peer can still send responder B an unsolicited PunchIntroduce{ peer: <any>, peer_reflex: <victim> }; with no waiter for <any>, dispatch falls through to schedule_punch, which fires the keep-alive train at the wire-supplied peer_reflex unconditionally. The sender_node_id check gates the return PunchAck, not the keep-alive train. Lower-severity than the headline: reachable only by an authenticated mesh member, tiny payload, no amplification, and Finding 3's clamp now bounds each parked sender task to ≤ punch_deadline + 250 ms (bounded-lifetime churn, no unbounded accumulation). The fix — drop when intro.peer_reflex disagrees with the cached announced reflex of intro.peer — is promoted to a tracked follow-up.
  • Finding 5 — rate-limit budgets + RendezvousRejected wiring (Low, Open). There is still no per-requester budget on PunchRequest and no per-peer budget on responder keep-alive trains, so volume abuse over the still-open direct path is uncapped in count. TraversalError::RendezvousRejected / RendezvousNoRelay remain defined and FFI-mapped but never constructed — Finding 1's guard surfaces as a silent drop → PunchFailed timeout. Adding the planned budgets (the is_auth_throttled subscribe-auth infrastructure is the model) and surfacing both the rate-limit and IP-mismatch rejections as typed errors is deferred as a non-trivial change unsuitable for a hardening point release.

Dependencies

All in net/crates/net/Cargo.lock, with no Cargo.toml constraint change (only the workspace version stamp 0.27.6 → 0.27.7), so crates.io library consumers resolve identically:

  • Deck / TUI: ratatui 0.30.1 → 0.30.2, pulling its component crates (ratatui-core, ratatui-crossterm, ratatui-termwiz, ratatui-widgets, ratatui-macros) and adding the ratatui-termina / termina terminal backend. Reaches only the operator cyberdeck binary; nothing on the datapath or wire.
  • Transitive bumps: arrayvec 0.7.7, cc 1.2.65, log 0.4.33, redis 1.2.4. Build/utility crates and the optional Redis adapter — none reach the datapath, crypto, or wire.

Upgrade notes

  • Breaking changes: none on the wire, in the C ABI, or in the public Go/Python/Rust API. All changes are internal to the NAT-traversal and port-discovery paths, behind unchanged signatures.
  • One behavioural change to note: a PunchRequest whose self_reflex IP doesn't match the requester's session source IP is now silently dropped at an honest coordinator (the requester's request_punch times out as PunchFailed and falls back to relay). This is correctness-preserving, because relay fallback is the documented path, but two acknowledged edges can drop a legitimate requester to relay: IPv4-mapped-IPv6 / multi-public-IP CGNAT pools, and a requester that the coordinator (R) reaches via a relay (the guard would then compare against the relay IP). Both are inherent to "bind IP, allow any port for symmetric NAT" and fall back cleanly.
  • No general-purpose port scanner exists or was added. The net port CLI remains an explicit design stub; the reviewed "scanning" surfaces (UPnP SSDP, NAT-PMP, reflex probes) cannot be aimed at an attacker-chosen target.
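
For operators who want to reason about the behavioural change above, the IP-only comparison can be sketched as below. The helper name reflex_matches_session is hypothetical and the real handle_punch_request is not reproduced here; the sketch only assumes the guard compares the IP parts and ignores the port, as these notes describe. The addresses are documentation ranges.

```rust
use std::net::SocketAddr;

/// Hypothetical helper: true when the reflex claimed in the request body
/// shares an IP with the address the requester's session was observed on.
/// The port is deliberately ignored so symmetric-NAT port shifts pass.
fn reflex_matches_session(self_reflex: SocketAddr, session_addr: SocketAddr) -> bool {
    self_reflex.ip() == session_addr.ip()
}

/// Illustrative cases: (claimed reflex, session source, expected result).
fn example_cases() -> Vec<(SocketAddr, SocketAddr, bool)> {
    let parse = |s: &str| s.parse::<SocketAddr>().expect("valid socket address");
    vec![
        // Same IP, different port: accepted.
        (parse("203.0.113.7:40000"), parse("203.0.113.7:51000"), true),
        // Third-party victim address: dropped.
        (parse("198.51.100.9:40000"), parse("203.0.113.7:51000"), false),
        // IPv4-mapped IPv6 against plain IPv4: the two IpAddr values are
        // not equal, which is the first acknowledged edge.
        (parse("[::ffff:203.0.113.7]:40000"), parse("203.0.113.7:40000"), false),
    ]
}
```

The third case shows why a dual-stack requester can fall back to relay even though both addresses name the same host.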

Full Changelog: https://github.com/ai-2070/net/compare/v0.27.6...v0.27.7