Design Audit: radicle-reticulum
April 2026 — based on codebase at src/radicle_reticulum/ and protocol documentation
1. What Radicle Needs (Protocol Summary)
Transport
Radicle Heartwood runs a single TCP listener on port 8776 (configurable). Every peer-to-peer session is a persistent, full-duplex TCP stream. There is no UDP, no QUIC; TCP is non-negotiable.
Session handshake
After TCP connection, the two nodes perform a Noise XK handshake — the same pattern used by the Lightning Network. The initiating node knows the responder's static public key (the Node ID / NID, a did:key:z6Mk… Ed25519 key) before connecting. After the handshake both ends have a shared symmetric key and a verified identity. Everything from that point on is encrypted and authenticated.
Application protocol on top of Noise
Over the established Noise session, Radicle runs a gossip + multiplexed git protocol:
- Hello / version exchange — negotiated immediately after handshake.
- Gossip messages (three types):
- Node Announcement — broadcasts NID and reachable address(es); enables peer discovery and routing table updates.
- Inventory Announcement — broadcasts the list of repository IDs (RIDs) this node hosts; received by all connected peers who relay it further.
- Ref Announcement — broadcasts that a particular ref changed in a particular RID; relayed only to nodes that seed that RID.
- Git fetch / upload-pack — multiplexed over the same TCP+Noise stream using the raw Git smart-HTTP wire protocol. When a node receives a ref announcement for a repo it seeds, it opens a git-fetch sub-session to the announcing node's TCP socket.
The gossip messages are framed with a length prefix and serialised (the exact codec is CBOR in the current implementation, though the public docs describe it as "compact binary"). The Git sub-sessions use Git's native pkt-line / pack-protocol framing, negotiated inline.
What rad node connect does
rad node connect <NID>@<host>:<port> tells the local radicle-node daemon to:
- Open a TCP connection to <host>:<port>.
- Perform the Noise XK handshake using <NID> as the expected remote static key.
- Send a Hello message and enter the gossip loop.
- From that point the two nodes exchange inventory and ref announcements, and trigger git-fetches as needed.
The call is persistent-session setup, not a one-shot sync. Once two nodes are "connected peers" they stay connected and sync automatically as refs change.
What rad push / rad fetch / rad sync do
All three commands speak to the local radicle-node daemon over a Unix socket (not over the network directly). The daemon then handles the network side:
- rad push — writes new commits to local storage, then the daemon sends a ref announcement to all connected peers.
- rad fetch / rad sync --fetch — asks the daemon to pull refs for a given RID from known seeds; the daemon opens git-fetch sub-sessions over existing (or newly established) Noise sessions.
The user-facing commands have no network code of their own. They are thin CLI wrappers around the local daemon IPC. This is the critical insight for the bridge: the bridge only needs to make radicle-node believe it has a working TCP peer. All gossip, inventory management, and git transfer happen inside radicle-node itself once the TCP connection is in place.
Peer discovery (without the bridge)
On the public internet, radicle-node bootstraps from two hard-coded seed DNS names (seed.radicle.garden:8776, seed.radicle.xyz:8776). From those seeds it learns about other peers through Node Announcements. The seeds relay Inventory Announcements so that every node eventually knows which node hosts which repo.
2. What Reticulum Provides (Relevant Primitives)
Addressing and routing (no configuration needed)
Every RNS node has a 128-bit destination hash derived from its public key plus application-name aspects. Routing is entirely source-agnostic: transport nodes relay packets one hop closer to the destination hash without knowing the full path. A new node discovers reachable peers within ~1 minute purely through the announce mechanism.
Announce mechanism
destination.announce(app_data=...) propagates a signed packet across all interfaces with a 2% bandwidth cap per interface. Transport nodes re-broadcast with randomised delays. Any node can embed arbitrary app_data (up to ~400 bytes on LoRa) in the announce. This is free peer discovery — no tracker, no DNS seed, no configuration.
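As an illustration of how a bridge can ride on this mechanism, the snippet below builds an announce app_data payload carrying a Radicle NID. The encoding, field names, and `build_announce_app_data` helper are assumptions for illustration; the real encoding lives in bridge.py.

```python
import json

# Hypothetical app_data payload for a bridge announce: the Radicle NID plus
# a protocol version, serialised compactly. Illustrates staying inside the
# approximate LoRa app_data budget mentioned above.
LORA_APP_DATA_BUDGET = 400  # approximate usable app_data bytes on LoRa

def build_announce_app_data(nid: str, proto: int = 1) -> bytes:
    payload = json.dumps({"nid": nid, "v": proto}, separators=(",", ":"))
    data = payload.encode("utf-8")
    if len(data) > LORA_APP_DATA_BUDGET:
        raise ValueError(f"app_data too large for LoRa: {len(data)} bytes")
    return data

# A did:key NID is ~55 characters, so the payload fits comfortably:
app_data = build_announce_app_data(
    "did:key:z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK")
```

The remote bridge would pass these bytes as `destination.announce(app_data=...)` and the receiving announce handler would decode them to learn which NID to register.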
RNS.Link — encrypted, verified, bidirectional channel
RNS.Link(destination) performs a 3-packet (297 bytes total) ECDH handshake and then provides:
- Encrypted, forward-secret bidirectional channel.
- Per-packet delivery confirmation via signed proof.
- Callbacks: set_packet_callback, set_link_closed_callback, set_link_established_callback.
- link.identify(identity) — reveal the initiator's identity to the responder inside the encrypted channel.
This is conceptually equivalent to Radicle's Noise XK session but implemented by RNS transparently.
RNS.Packet — fire-and-forget (< ENCRYPTED_MDU = 383 bytes)
Used for small messages that fit in a single LoRa frame. No delivery guarantee. RNS.Packet(dest_or_link, data).send(). Suitable for gossip notifications.
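Since a single packet carries at most ENCRYPTED_MDU bytes, anything larger must be split by the sender. A minimal chunking helper (the constant value is taken from the text above; the function name is hypothetical — bridge.py does this inside `_send_over_link`):

```python
ENCRYPTED_MDU = 383  # RNS.Packet.ENCRYPTED_MDU at the default MTU

def chunk_for_packets(data: bytes, mdu: int = ENCRYPTED_MDU) -> list:
    """Split a byte stream into MDU-sized chunks, one per RNS.Packet."""
    return [data[i:i + mdu] for i in range(0, len(data), mdu)]

chunks = chunk_for_packets(b"x" * 1000)  # -> three chunks: 383 + 383 + 234 bytes
```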
RNS.Resource — reliable large-data transfer
RNS.Resource(data_or_filehandle, link, callback=...) handles arbitrary-size reliable transfer over an established Link: automatic chunking, sequencing, compression, integrity check. This is the right primitive for git pack transfers; it avoids the per-packet 383-byte limit and provides end-to-end reliability. Not currently used anywhere in the codebase.
RNS.Channel / RNS.Buffer — streaming over a link
link.get_channel() returns a Channel for typed message exchange. RNS.Buffer.create_bidirectional_buffer(...) wraps a Channel in Python file-like IO objects (BufferedRWPair). This enables streaming reads and writes exactly like a TCP socket — a natural fit for tunnelling Radicle's persistent TCP stream.
Interface diversity
RNS handles LoRa (RNode), packet radio (KISS/AX.25), TCP, UDP, I2P, serial — through the same API. The application code does not change between interfaces.
LoRa specifics (RNode interface)
- Typical LoRa physical data rate: 1.2–5.5 kbps (SF7–SF12, 125 kHz BW).
- Duty cycle limits in Europe: 1–10 % per sub-band, meaning an SF12 node may transmit at most ~36 seconds per hour.
- RNS caps announce bandwidth at 2% per interface, which on a 1.2 kbps LoRa link is ~24 bps — roughly one small announce per minute.
- RNS.Packet.ENCRYPTED_MDU = 383 bytes per packet.
- Link establishment costs 297 bytes (2–3 frames at SF12).
3. Current Code: What's Right, What's Redundant, What's Missing
What's right
bridge.py — RadicleBridge
This is the core value of the project and it is essentially correct. The design — listen on TCP, accept from radicle-node, open RNS link to remote bridge, forward bytes bidirectionally — is the minimal correct architecture. Key good decisions:
- Reuses an existing RNS instance (RNS.Reticulum.get_instance()).
- Embeds the radicle NID in announce app_data, so remote bridges know which NID to register without a separate handshake.
- Chunks TCP reads to RNS.Packet.ENCRYPTED_MDU (383 B) in _send_over_link — fixes the real LoRa blocker.
- Per-bridge dedicated TCP server ports — avoids multiplexing confusion if multiple remote bridges are discovered.
- State persistence (_save_state/_load_state) for the NID-to-bridge-hash mapping survives restarts.
- Path maintenance loop warms the RNS routing table so connections are not delayed.
- Reconnect logic in _forward_tcp_to_rns handles transient link drops.
gossip.py — GossipRelay
The gossip relay addresses a real gap: radicle-node only polls/syncs when it already knows about a peer. The gossip relay provides a lightweight side-channel to wake up remote nodes when local refs change, without sending git data over LoRa. This is good design. Specific strengths:
- Watchdog inotify integration (_start_watcher) for instant detection on push.
- Debounce (WATCHDOG_DEBOUNCE = 2.0 s) absorbs multi-commit push bursts.
- MDU-aware payload splitting in _build_ref_payloads — one ref change = one or a few 383-byte packets.
- Delta vs. full broadcast distinction reduces bandwidth.
- auto_seed and auto_discover make the seed mode zero-configuration.
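For illustration, a ref-change notification in the spirit of gossip.py's JSON-over-RNS.Packet side channel might look like the sketch below. The field names, the example RID, and the `build_ref_notification` helper are invented here; the real schema and splitting logic live in gossip.py's _build_ref_payloads.

```python
import json

ENCRYPTED_MDU = 383  # single-packet budget for RNS.Packet over a link

def build_ref_notification(rid: str, refs: dict) -> bytes:
    """Encode a ref-change notification that fits in one RNS packet."""
    msg = json.dumps({"t": "ref", "rid": rid, "refs": refs},
                     separators=(",", ":")).encode("utf-8")
    if len(msg) > ENCRYPTED_MDU:
        raise ValueError("split refs across multiple packets")
    return msg

note = build_ref_notification(
    "rad:z3gqcJUoA1n9HbHKufZs5FCSGazv5",  # hypothetical RID
    {"refs/heads/main": "f" * 40},        # ref name -> new commit OID
)
```

A single-ref update lands well under the MDU, which is why one ref change usually costs only one packet.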
seed.py — SeedNode
Correct and minimal: spawns a separate radicle-node process with its own RAD_HOME and a permissive seedingPolicy. Using DEVNULL for stdout/stderr prevents pipe buffer deadlock — a real bug that was fixed.
identity.py — RadicleIdentity
The DID ↔ RNS identity mapping is well-implemented. load_or_generate for persistent identity across restarts is exactly right. The clear documentation of the RNS identity vs. destination hash distinction is accurate.
cli.py
--lora shortcut flag for conservative announce delays and longer poll intervals is a good UX touch. cmd_setup health-checker is useful.
What's redundant
adapter.py — RNSTransportAdapter
This module is entirely superseded by bridge.py and should be deleted. It implements a parallel peer-discovery and connection mechanism using APP_NAME/ASPECT_NODE destinations that nothing in the working system uses. It creates its own RNS instance unconditionally (line 69) which will conflict if RadicleBridge is also running. The announce_repository method, ASPECT_REPO destinations, and the connect/connect_to_peer plumbing are all dead code. The cmd_node, cmd_ping, and cmd_peers CLI commands in cli.py use this adapter; these commands are vestigial and do not contribute to the bridge-based architecture.
link.py — RadicleLink
This wrapper around RNS.Link is not used by bridge.py or gossip.py. bridge.py calls RNS.Link directly and manages callbacks inline. RadicleLink adds a deque receive buffer and a recv(timeout) blocking call — which implies a request/response programming model that doesn't fit the streaming tunnel. The file can be deleted unless a future protocol layer needs it.
messages.py
The binary framing layer (NodeAnnouncement, InventoryAnnouncement, RefAnnouncement, Ping, Pong) duplicates what Radicle already does natively. radicle-node sends its own Node and Inventory Announcements over the Noise session. gossip.py uses JSON-over-RNS.Packet for its ref-change notifications (not this module). messages.py is used only by cmd_ping in cli.py — itself a dead command. This file can be deleted.
Identity mismatch
RadicleIdentity generates a fresh Ed25519 keypair for the RNS bridge identity. This RNS key has no relationship to the Radicle Node ID (NID). The bridge announces the Radicle NID as a string in app_data, but the RNS identity is unrelated. This is fine architecturally — the bridge is a transparent proxy, not a Radicle node — but it means:
- You cannot derive the RNS bridge hash from a Radicle NID (they are unrelated).
- The from_did path in RadicleIdentity (which tries to import a Radicle DID as an RNS identity) is impossible to use correctly: RNS needs both Ed25519 signing and X25519 encryption keys, while Radicle DIDs carry only an Ed25519 public key. This path should be removed or clearly documented as unsupported.
RNS.Packet for streaming tunnel data
bridge.py currently uses RNS.Packet(link, chunk).send() in a loop to stream TCP data over the RNS link. This works but is not optimal:
- Each packet is a distinct encrypted unit with its own AES-256-CBC + HMAC overhead.
- Packet ordering over a Link is guaranteed by RNS, but delivery is not — RNS.Packet over a Link is best-effort (no automatic retransmission at the packet level; only Resources provide that).
- For Radicle's Noise session over the tunnel, dropped packets will corrupt the stream, causing the Noise session to fail and radicle-node to disconnect.
- The reconnect logic in _forward_tcp_to_rns handles link drops (the link went down entirely) but not within-link packet loss.
The correct primitive for streaming byte data over an RNS Link is RNS.Buffer (wrapping RNS.Channel), which provides ordered, reliable delivery. This is the most important correctness gap in the bridge.
What's missing
- RNS.Buffer / RNS.Channel for the tunnel stream (critical — see above). Replace _send_over_link/_on_rns_data with a BufferedRWPair in _handle_local_connection and _on_incoming_link. The buffer handles chunking, ordering, and reliability automatically.
- RNS.Resource for initial git clone / large pack transfers. When a node first syncs a repo (initial clone, large commit), the pack can be tens or hundreds of megabytes. RNS.Resource handles this with compression, sequencing, and checksumming. The bridge doesn't need this if it's purely a TCP proxy (the radicle-node-to-radicle-node git transfer goes through the tunnel naturally), but it matters for LoRa, where the TCP tunnel is too slow for large transfers.
- No flow control on the TCP→RNS direction. Currently _forward_tcp_to_rns reads TCP as fast as available and sends RNS packets without any backpressure. If the RNS path is slow (LoRa), the TCP socket buffer fills and TCP flow control kicks in against radicle-node, which may time out its side of the connection. Need either RNS.Buffer (which handles this) or explicit rate limiting.
- RNS.Link per TCP connection is expensive on LoRa. Every new TCP connection from radicle-node triggers a full RNS link establishment (297-byte handshake = 2–3 LoRa frames). For a single rad sync session, radicle-node opens one connection and keeps it, so this is fine. But if radicle-node opens multiple parallel connections (e.g., for concurrent repo syncs), each gets its own link. A future optimisation is link multiplexing via RNS.Channel streams over a single link.
- No handling of radicle-node restart. If radicle-node restarts, it forgets all connected peers. The bridge detects this via TCP error and closes tunnels, but it does not re-register known NIDs with the new radicle-node instance. _load_state runs on bridge startup, not on radicle-node reconnect. A watchdog that polls the Unix socket or periodically attempts rad node connect would fix this.
- Duplicate announce handlers. The RNSTransportAdapter (adapter.py) and the RadicleBridge both register announce handlers via RNS.Transport.register_announce_handler. If both are running (e.g., via cmd_node started alongside the bridge), every announce fires both handlers. This is harmless but wasteful.
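On the flow-control point above: the gap can also be closed without RNS.Buffer by explicit rate limiting on the TCP→RNS read loop. A token-bucket sketch follows; all names here are hypothetical, and RNS.Buffer remains the preferred fix because it derives backpressure from actual link throughput rather than a configured rate.

```python
import time

class TokenBucket:
    """Byte-rate limiter for the TCP-to-RNS direction (hypothetical helper:
    caps sustained throughput so a slow LoRa path is not overrun)."""

    def __init__(self, rate_bps: float, burst: int):
        self.rate = rate_bps            # sustained budget, bytes per second
        self.capacity = burst           # maximum burst, bytes
        self.tokens = float(burst)
        self.last = time.monotonic()

    def throttle(self, nbytes: int) -> None:
        """Block until nbytes may be sent without exceeding the budget.
        Callers must send chunks no larger than the burst size."""
        assert nbytes <= self.capacity
        while True:
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= nbytes:
                self.tokens -= nbytes
                return
            # Sleep just long enough for the deficit to refill.
            time.sleep((nbytes - self.tokens) / self.rate)

# e.g. cap a tunnel at ~800 B/s (roughly SF10 net throughput), one-MDU bursts:
bucket = TokenBucket(rate_bps=800, burst=383)
```

The read loop would then call `bucket.throttle(len(chunk))` before each send, letting the TCP socket buffer (and ultimately radicle-node's TCP stack) absorb the backpressure.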
4. Recommended Architecture
Minimum viable rewrite
The bridge architecture is correct. The rewrite goal is to make it more correct and simpler, not to add features. The recommended target:
radicle-node (TCP 8776)
↕ localhost TCP (one connection per peer session)
RadicleBridge (bridge.py — keep, refine)
↕ RNS.Buffer over RNS.Channel over RNS.Link
Remote RadicleBridge (same code)
↕ localhost TCP
radicle-node (TCP 8776)
What to keep
- bridge.py — keep; replace the RNS.Packet stream with RNS.Buffer
- gossip.py — keep as-is; correct and complete
- seed.py — keep as-is
- identity.py — keep, but remove or clearly gate the from_did path (it cannot produce a usable RNS identity)
- cli.py — keep cmd_bridge, cmd_seed, cmd_gossip, cmd_setup; remove cmd_node, cmd_ping, cmd_peers
What to cut
- adapter.py — delete entirely
- link.py — delete (not used by the working path)
- messages.py — delete (binary gossip framing not used; Radicle handles this natively)
- __init__.py — remove exports for RNSTransportAdapter, RadicleLink, MessageType, NodeAnnouncement, InventoryAnnouncement, RefAnnouncement, Ping, Pong, decode_message
The one structural fix: RNS.Buffer for the tunnel
In bridge.py, replace _send_over_link and _on_rns_data with Buffer-based IO:
# On outbound connection (_handle_local_connection):
channel = rns_link.get_channel()
buf = RNS.Buffer.create_bidirectional_buffer(
    receive_stream_id=0, send_stream_id=1, channel=channel,
    ready_callback=lambda ready_bytes: _drain_buffer_to_tcp(tunnel_id, ready_bytes)
)
# tunnel.buf = buf
# Forward TCP→RNS: tcp_socket.recv() → buf.write()
# Forward RNS→TCP: ready_callback → buf.read() → tcp_socket.sendall()
# On incoming link (_on_incoming_link): mirror with swapped stream IDs
# (receive_stream_id=1, send_stream_id=0)
This single change provides ordered, reliable delivery and eliminates the packet-loss-corrupts-stream problem.
Flow: what happens on rad push
- User runs rad push in their checkout.
- The rad CLI writes the new commits to ~/.radicle/storage/, then tells the local radicle-node daemon via Unix socket.
- radicle-node sends a Ref Announcement over all its active TCP connections.
- The bridge has one TCP connection to the local radicle-node (on the bridge's per-peer listen port). This is the incoming side of the bridge from radicle-node's perspective (radicle-node is the caller; the bridge allocated that port and registered the peer NID via rad node connect).
To be precise, there are two directions:
Outbound sync (local node to remote):
- Remote bridge discovers local bridge via RNS announce.
- Remote bridge calls rad node connect NID@127.0.0.1:<port> on its local radicle-node.
- Remote radicle-node opens TCP to that port.
- The remote bridge's accept loop picks it up and opens an RNS Link to the local bridge.
- The local bridge's _on_incoming_link fires and opens TCP to the local radicle-node at port 8776.
- The session is now: remote radicle-node ↔ remote bridge ↔ RNS ↔ local bridge ↔ local radicle-node.
Push propagation:
- Local user runs rad push → local radicle-node emits a Ref Announcement on all sessions.
- One of those sessions goes through the bridge tunnel.
- Remote radicle-node receives the Ref Announcement; if it seeds the repo, it initiates a git-fetch back on the same session (same TCP connection, multiplexed by the Radicle protocol).
- GossipRelay detects the local ref change independently (via inotify or poll) and sends a lightweight RNS packet to all known gossip peers.
- The remote gossip relay receives this and calls rad sync --fetch --rid <RID> → remote radicle-node pulls via the existing bridge session.
The gossip layer is a belt-and-suspenders trigger: if the bridge TCP session is active, radicle-node gets the Ref Announcement natively and syncs automatically. The gossip relay is useful for nodes that are not currently bridged (bridge is down, no active RNS link) — they receive the gossip packet and re-establish the bridge + do a manual sync.
What radicle-rns should expose
The minimal UX is:
radicle-rns seed # on always-on nodes (combines radicle-node + bridge + gossip)
radicle-rns bridge # on user laptops (bridge only; user's radicle-node handles their own storage)
rad push, rad fetch, rad sync require no changes. They speak to the local daemon as always. The daemon believes it has normal TCP peers. The bridge is invisible.
5. LoRa-Specific Considerations
What is realistic over LoRa
LoRa at SF7/125 kHz gives about 5.5 kbps physical; SF12 (max range) is about 290 bps. After duty cycle (1% in EU868), RNS overhead, and RNS announce cap (2%), the effective throughput for application data is:
| SF | Physical rate | Practical throughput | Time for 1 MB |
|---|---|---|---|
| SF7 | 5.5 kbps | ~4 kbps | ~33 min |
| SF10 | 1.2 kbps | ~800 bps | ~2.5 hr |
| SF12 | 290 bps | ~180 bps | ~11 hr |
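The transfer times in the table follow directly from size × 8 / rate. A quick order-of-magnitude check (taking 1 MB as 10^6 bytes and the net rates from the table; results agree with the table to within rounding):

```python
def transfer_time_s(size_bytes: int, throughput_bps: float) -> float:
    """Seconds to move size_bytes at a given net bit rate."""
    return size_bytes * 8 / throughput_bps

ONE_MB = 1_000_000
sf7_minutes = transfer_time_s(ONE_MB, 4000) / 60    # ~33 min
sf10_hours = transfer_time_s(ONE_MB, 800) / 3600    # ~2.8 h
sf12_hours = transfer_time_s(ONE_MB, 180) / 3600    # ~12 h
```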
Feasible over LoRa:
- Gossip ref-change notifications (a few hundred bytes per event) — always feasible
- Small commits with small pack objects (< 50 KB) — feasible at SF7, slow at SF12
- Ref announcements and node discovery — feasible, handled by RNS announce
- Link establishment (297 bytes) — 3 frames at SF12, under 1 second at SF7
Not feasible over LoRa:
- Initial clone of any non-trivial repository (pack objects typically 1–100 MB)
- Large commit batches (many changed files, binary assets)
- Frequent polling (the gossip poll interval should be >= 120 s on LoRa, not the default 30 s — the --lora flag correctly sets this)
Recommended LoRa workflow
- Initial clone via fast link (WiFi, Ethernet, internet): rad clone rad:<RID> in the normal way.
- Incremental sync over LoRa: subsequent rad push / rad sync for small commits. A 1-commit diff is typically 5–50 KB of pack data — a few minutes at SF10.
- Gossip relay always running: on LoRa, the gossip relay is more important than the bridge because it can send a 300-byte "go fetch" signal even when the bridge TCP session is not live. The bridge then re-establishes only when there is data to transfer.
- LoRa-safe announce delays: the --lora flag sets announce delays to 60, 300, 900 seconds. This matters because LoRa duty-cycle limits mean frequent announces drain the airtime budget.
The case for RNS.Resource on LoRa
For medium-sized pack objects (1–500 KB), streaming them as raw TCP through the bridge is fragile: if a single RNS packet drops, the Noise session errors and radicle-node disconnects. RNS.Resource retransmits failed segments automatically. For the LoRa case, the recommended approach is:
- Intercept git pack data at the bridge layer (parse git pkt-line to detect pack boundaries).
- Transfer pack objects as RNS.Resource instead of streaming TCP bytes.
- Re-inject on the remote side before forwarding to radicle-node.
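Detecting pack boundaries requires parsing git's pkt-line framing: each frame starts with a 4-hex-digit length that includes the 4 header bytes themselves, and 0000 is a flush packet. A minimal parser sketch (side-band demultiplexing and any bridge integration are omitted; this only shows the framing):

```python
def read_pkt_lines(stream: bytes):
    """Minimal git pkt-line parser. Yields payload bytes per frame, or None
    for special packets with no payload (0000 flush, 0001 delim,
    0002 response-end)."""
    i = 0
    while i + 4 <= len(stream):
        length = int(stream[i:i + 4], 16)  # length field counts its own 4 bytes
        if length < 4:                     # special packet, no payload
            yield None
            i += 4
        else:
            yield stream[i + 4:i + length]
            i += length

frames = list(read_pkt_lines(b"0009done\n0000"))
# -> [b"done\n", None]
```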
This is a significant complexity increase and may not be worth it for an initial version. The simpler alternative is to rely on TCP retransmission: if the tunnel drops, TCP times out, radicle-node retries, and the bridge re-establishes the link. This works but results in poor user experience (multi-minute timeouts on LoRa).
The minimum correct fix (RNS.Buffer instead of RNS.Packet for streaming) makes the bridge reliable over all media including LoRa, because RNS.Channel (which Buffer uses) provides per-message acknowledgement and retransmission.
Airtime budget example (EU868, SF10)
- RNS link establishment: 297 bytes → ~2 seconds airtime
- Gossip ref notification: ~300 bytes → ~2 seconds airtime
- 10 KB pack object: ~26 frames × 2 s ≈ 52 seconds airtime — at a 1% duty cycle, spread over ≈5200 s (~87 minutes) of calendar time
- 100 KB pack object: ~10 minutes of airtime, needs ~17 hours of calendar time at 1% duty cycle
These numbers confirm: LoRa is viable for ref notifications and small commits, and impractical for initial clones or large repos.
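The arithmetic behind these figures can be checked directly. The sketch below uses ceiling division on the frame count, which rounds the ~26 frames quoted above up to 27; the frame airtime is the approximate SF10 figure from the bullet list:

```python
FRAME_PAYLOAD = 383   # ENCRYPTED_MDU, bytes per frame
FRAME_AIRTIME = 2.0   # approximate seconds on air per frame at SF10

def calendar_time_s(airtime_s: float, duty_cycle: float = 0.01) -> float:
    """Minimum wall-clock time needed to accumulate airtime_s under a
    duty-cycle cap (1% in EU868 by default)."""
    return airtime_s / duty_cycle

frames = -(-10_000 // FRAME_PAYLOAD)       # ceil(10 KB / 383 B) = 27 frames
airtime = frames * FRAME_AIRTIME           # ~54 s on air
calendar = calendar_time_s(airtime)        # ~5400 s (~90 min) at 1% duty cycle
```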
6. Summary Table
| Item | Status | Recommendation |
|---|---|---|
| bridge.py — TCP↔RNS tunnel | Correct architecture, packet-loss gap | Keep; switch to RNS.Buffer |
| bridge.py — announce + NID in app_data | Correct | Keep |
| bridge.py — per-bridge TCP ports | Correct | Keep |
| bridge.py — state persistence | Correct | Keep |
| bridge.py — path maintenance loop | Correct | Keep |
| bridge.py — reconnect on link drop | Correct | Keep |
| gossip.py — ref-change relay | Correct and necessary | Keep |
| gossip.py — inotify + debounce | Correct | Keep |
| gossip.py — MDU-aware splitting | Correct | Keep |
| seed.py — seed node manager | Correct | Keep |
| identity.py — RNS identity persistence | Correct | Keep; remove from_did path |
| adapter.py — RNSTransportAdapter | Dead code, conflicts | Delete |
| link.py — RadicleLink wrapper | Dead code | Delete |
| messages.py — binary gossip frames | Dead code, duplicates Radicle | Delete |
| cmd_node, cmd_ping, cmd_peers | Use dead adapter | Remove |
| Custom gossip protocol | Radicle handles natively over bridge | Remove (messages.py) |
| RNS.Packet for stream data | Best-effort; packet loss corrupts stream | Replace with RNS.Buffer |
| RNS.Resource for large transfers | Not implemented | Consider for LoRa path |
| Initial clone over LoRa | Impractical | Document; clone over fast link first |