Finding the optimal communication path between LoRa mesh devices — before a single byte of payload is transmitted.
by Clemens Simon
Current Meshtastic flooding wastes 92–100% of bandwidth in small-to-medium networks and fails to deliver at scale. System 5 combines geo-clustering, multi-path routing, and adaptive QoS into one self-healing protocol. For networks up to ~200 nodes: 100% delivery with 92–99.9% less bandwidth. At 500+ nodes: higher delivery than managed flooding (76% vs 51% at 500 nodes), though large sparse networks remain challenging for any protocol. The hop limit, today's biggest scaling bottleneck, becomes irrelevant: each hop costs ~1 transmission instead of n.
Meshtastic routes by flooding: nodes rebroadcast the packets they receive. This works for tiny networks but collapses at scale.
Every message is rebroadcast by every node that receives it. A single message to one recipient generates n transmissions across the entire network.
1–50 kbps bandwidth. 1% duty cycle (EU law). Half-duplex radio. Each packet takes 50ms–2s airtime. Budget: ~18–720 packets/hour/node (36 s of airtime per hour).
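The arithmetic behind the per-node packet budget can be sketched as below. This is a minimal illustration: the function name `packets_per_hour` and its parameters are hypothetical, and it assumes the duty-cycle cap is applied over a one-hour window.

```python
def packets_per_hour(airtime_ms: int, duty_cycle_percent: int = 1) -> int:
    """Upper bound on packets one node may send per hour under a duty-cycle cap.

    Integer milliseconds are used to avoid floating-point rounding surprises.
    """
    budget_ms = 3_600_000 * duty_cycle_percent // 100  # 1% of an hour = 36,000 ms
    return budget_ms // airtime_ms


print(packets_per_hour(50))    # 720 packets at 50 ms airtime
print(packets_per_hour(1000))  # 36 packets at 1 s airtime
print(packets_per_hour(2000))  # 18 packets at 2 s airtime
```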
Every node transmits on every message — even nodes nowhere near the intended path. Battery-powered devices drain in hours instead of weeks.
Multiple nodes rebroadcast simultaneously. LoRa is half-duplex — collisions destroy packets, triggering more retransmissions. A vicious cycle.
Meshtastic caps hops at 3–7 to prevent flood storms. But this kills range: a message can't reach nodes beyond the limit. Every extra hop multiplies transmissions by n — so the limit can't be raised without drowning the network.
Same network topology, four different routing strategies. Watch the TX counter — it tells the whole story.
Theoretical worst case — every node rebroadcasts everything. Not actually used by Meshtastic, but useful as a reference.
Source sends a packet. Every receiving node rebroadcasts it once. The entire network participates in every message.
TX = n (one per node)
The actual Meshtastic approach. SNR-based suppression: distant nodes rebroadcast first, close nodes suppress. ROUTER nodes always rebroadcast.
Before rebroadcasting, each node listens briefly. If it hears another node already rebroadcast, it suppresses its own transmission. Distant nodes (low SNR) get shorter delays and rebroadcast first. Close nodes wait and often suppress.
ROUTER-role nodes (marked R) override suppression — they always rebroadcast to ensure backbone coverage.
TX ≈ 0.4n – 0.6n (~50% suppression)
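The listen-then-suppress behaviour described above can be sketched as follows. This is an illustration only, not Meshtastic's firmware logic: the SNR range, the 500 ms window, the linear SNR-to-delay mapping, and the `hears` table are all assumptions made for the example.

```python
from dataclasses import dataclass

SNR_MIN_DB, SNR_MAX_DB = -20.0, 10.0  # assumed usable SNR range
MAX_DELAY_MS = 500.0                  # assumed length of the listening window


@dataclass
class Receiver:
    name: str
    snr_db: float            # SNR at which this node heard the original packet
    is_router: bool = False  # ROUTER-role nodes ignore suppression


def rebroadcast_delay_ms(snr_db: float) -> float:
    """Low SNR (distant node) gives a short delay; high SNR gives a long one."""
    clamped = min(max(snr_db, SNR_MIN_DB), SNR_MAX_DB)
    return MAX_DELAY_MS * (clamped - SNR_MIN_DB) / (SNR_MAX_DB - SNR_MIN_DB)


def who_rebroadcasts(receivers, hears):
    """Return the names of nodes that actually transmit, in firing order.

    hears[a] is the set of node names whose rebroadcast node a can hear.
    """
    sent = []
    for node in sorted(receivers, key=lambda r: rebroadcast_delay_ms(r.snr_db)):
        heard_earlier = any(s in hears.get(node.name, set()) for s in sent)
        if node.is_router or not heard_earlier:
            sent.append(node.name)
    return sent


receivers = [
    Receiver("far", -15.0),
    Receiver("mid", 0.0),
    Receiver("near", 8.0),
    Receiver("R", 9.0, is_router=True),
]
hears = {"mid": {"far"}, "near": {"far"}, "R": {"far"}}
print(who_rebroadcasts(receivers, hears))  # ['far', 'R']
```

In this toy run the distant node fires first, the two closer nodes suppress, and the ROUTER node transmits regardless.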
New in v2.6 — for direct messages only. Learns the relay node, then sends only via that one node. Falls back to flooding if it fails.
Phase 1: First message uses managed flooding. The system tracks which node successfully relayed.
Phase 2: Subsequent messages go only via the learned next-hop node (marked NH). One relay instead of the whole network.
Phase 3: If the next-hop dies, the system falls back to managed flooding and learns a new relay.
Only works for direct messages (unicast). Broadcasts still use managed flooding. Only learns one hop — not a full path.
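The three phases above amount to a small per-destination cache. The sketch below shows one way to model it. The class and method names are hypothetical and do not come from the Meshtastic codebase.

```python
class NextHopRouter:
    """Toy model of learn, use, and fall back for direct messages."""

    def __init__(self):
        self.next_hop = {}  # destination -> relay learned from a successful delivery

    def plan(self, dest):
        """Phase 2 if a relay is cached, otherwise Phase 1 (managed flooding)."""
        if dest in self.next_hop:
            return ("next_hop", self.next_hop[dest])
        return ("managed_flood", None)

    def on_delivered(self, dest, relay):
        """Phase 1 result: remember which node relayed successfully."""
        self.next_hop[dest] = relay

    def on_relay_failed(self, dest):
        """Phase 3: forget the relay so the next message floods and relearns."""
        self.next_hop.pop(dest, None)


router = NextHopRouter()
print(router.plan("bob"))          # ('managed_flood', None)
router.on_delivered("bob", "relay7")
print(router.plan("bob"))          # ('next_hop', 'relay7')
router.on_relay_failed("bob")
print(router.plan("bob"))          # ('managed_flood', None)
```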
Geo-clustering + multi-path + weighted load balancing + adaptive QoS + self-healing. ~1 TX per hop for all message types.
W(r) = α·Q(r) × β·(1−Load(r)) × γ·Batt(r)
Share(r) = W(r) / Σ W(all)
Managed flooding suppresses ~50% of rebroadcasts but still scales as O(n). System 5 routes along specific paths — cost scales with hop count, not network size. At 100 nodes: managed flood = ~1,500 TX per message, System 5 = ~2 TX.
Next-hop learns a single relay node for direct messages. System 5 maintains 2-3 full paths with weighted load distribution for all traffic types. When a path fails, the next cached path activates instantly — no flooding fallback needed.
Formal evaluation of all four routing approaches across seven weighted criteria.
Every node rebroadcasts once. At n=100: 100 transmissions per message. Cost grows linearly with network size.
TX ≈ n·(1 − S), where S is the suppression rate (the fraction of nodes that hear a rebroadcast and stay silent). S depends on density and SNR distribution. At n=100 with S=0.5: ~50 transmissions. Still O(n) but ~50% cheaper than naive.
First message floods (managed). After learning, cost is about d transmissions, where d is the hop count to the destination via the cached relay. Amortized cost depends on cache hit rate. Broadcasts still use managed flooding.
Every message — unicast and broadcast — follows a pre-computed path. Cost = hop count, independent of network size. At n=100, d=2: 2 transmissions. With fallback: scoped cluster flooding adds O(cluster_size) in worst case.
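The four cost models can be compared side by side with a few lines of code. This is a sketch using n, S, and d as defined above. The amortized next-hop formula (a hit-rate-weighted mix of routed and flooded cost) is an assumption for illustration, not a formula stated in the source.

```python
def tx_naive(n: int) -> int:
    """Naive flooding: every node rebroadcasts once."""
    return n


def tx_managed(n: int, s: float = 0.5) -> float:
    """Managed flooding: a fraction s of nodes suppress."""
    return n * (1 - s)


def tx_routed(d: int) -> int:
    """Directed routing: one transmission per hop."""
    return d


def tx_next_hop_amortized(n: int, d: int, hit_rate: float, s: float = 0.5) -> float:
    """Assumed model: cache hits cost d, misses cost a managed flood."""
    return hit_rate * tx_routed(d) + (1 - hit_rate) * tx_managed(n, s)


n, d = 100, 2
print(tx_naive(n))                               # 100
print(tx_managed(n))                             # 50.0
print(tx_routed(d))                              # 2
print(tx_next_hop_amortized(n, d, hit_rate=1.0)) # 2.0 when every lookup hits
```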
Q = link quality (OGM reception rate), Load = queue pressure, Batt = min battery along route.
Traffic share: Share(r) = W(r) / Σ W(all). Tuning: α=0.4, β=0.35, γ=0.25
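A minimal sketch of the weight and share formulas follows, using the tuning values given above. The function names and the example route metrics are hypothetical, and the equal-split fallback when every weight is zero is an assumption.

```python
ALPHA, BETA, GAMMA = 0.4, 0.35, 0.25  # tuning values from the text


def route_weight(q: float, load: float, batt: float) -> float:
    """W(r) = α·Q(r) × β·(1−Load(r)) × γ·Batt(r), with all inputs in [0, 1]."""
    return (ALPHA * q) * (BETA * (1 - load)) * (GAMMA * batt)


def traffic_shares(routes: dict) -> dict:
    """Share(r) = W(r) / Σ W(all). routes maps a name to (Q, Load, Batt)."""
    weights = {name: route_weight(q, load, batt)
               for name, (q, load, batt) in routes.items()}
    total = sum(weights.values())
    if total == 0:
        # Assumption: split evenly if no route has any usable weight.
        return {name: 1 / len(routes) for name in routes}
    return {name: w / total for name, w in weights.items()}


routes = {
    "A": (0.9, 0.2, 0.8),  # good link, light load, healthy battery
    "B": (0.6, 0.5, 0.9),  # weaker link, half-full queue
}
shares = traffic_shares(routes)
print(shares)  # route A receives the larger share, but B still carries traffic
```

Note that in this multiplicative form α, β, and γ scale every route by the same constant, so they cancel out of Share(r); the split is driven by Q, Load, and Batt.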
| Criterion (Weight) | Naive Flood | Managed Flood | Next-Hop | System 5 |
|---|---|---|---|---|
| TX Cost per Message (20%) | 1 | 4 | 5 | 10 |
| Delivery Reliability (20%) | 9 | 9 | 8 | 9 |
| Scalability (15%) | 1 | 3 | 4 | 9 |
| Fault Tolerance (15%) | 8 | 8 | 7 | 9 |
| Hop Limit Freedom (10%) | 1 | 2 | 3 | 10 |
| Energy Efficiency (10%) | 1 | 3 | 5 | 8 |
| Broadcast Support (10%) | 10 | 10 | 3 | 9 |
| WEIGHTED TOTAL | 4.55 | 5.75 | 5.35 | 9.2 |
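The weighted total is each criterion score multiplied by its weight and summed. The sketch below recomputes it from the criterion rows as printed in the table. The names `WEIGHTS`, `SCORES`, and `weighted_total` are hypothetical.

```python
# Criterion weights in table order: TX cost, delivery, scalability,
# fault tolerance, hop limit freedom, energy efficiency, broadcast support.
WEIGHTS = [0.20, 0.20, 0.15, 0.15, 0.10, 0.10, 0.10]

# Scores per approach, copied from the criterion rows of the table.
SCORES = {
    "Naive Flood":   [1, 9, 1, 8, 1, 1, 10],
    "Managed Flood": [4, 9, 3, 8, 2, 3, 10],
    "Next-Hop":      [5, 8, 4, 7, 3, 5, 3],
    "System 5":      [10, 9, 9, 9, 10, 8, 9],
}


def weighted_total(scores) -> float:
    """Sum of score × weight across all seven criteria."""
    return sum(w * s for w, s in zip(WEIGHTS, scores))


for approach, scores in SCORES.items():
    print(f"{approach}: {weighted_total(scores):.2f}")
```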
Meshtastic's managed flooding is clever — but still O(n). System 5 borrows six proven concepts from networking research to achieve O(hops).
Nodes self-organize by geohash prefix. Full topology within cluster, summarized routes between. Scales from 10 to 10,000+ nodes.
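To make geohash-prefix clustering concrete, the sketch below encodes coordinates with the standard geohash algorithm and groups nodes by a shared prefix. The 4-character prefix length and the helper names are assumptions for illustration; the source does not specify how System 5 chooses the prefix length.

```python
from collections import defaultdict

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash(lat: float, lon: float, precision: int = 6) -> str:
    """Standard geohash: interleave longitude and latitude bisection bits."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, bit_count = [], 0, 0
    use_lon = True  # even-numbered bits refine longitude, odd ones latitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        bits = (bits << 1) | int(bit)
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def clusters_by_prefix(nodes: dict, prefix_len: int = 4) -> dict:
    """Group node names by geohash prefix. nodes maps name -> (lat, lon)."""
    clusters = defaultdict(list)
    for name, (lat, lon) in nodes.items():
        clusters[geohash(lat, lon, prefix_len)].append(name)
    return dict(clusters)


nodes = {
    "a": (57.64911, 10.40744),
    "b": (57.65, 10.41),      # a few hundred metres from "a"
    "c": (48.8566, 2.3522),   # far away, so it lands in another cluster
}
print(clusters_by_prefix(nodes))
```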
Periodic originator messages. Count reception rate per neighbor. No complex calculation — just count how many arrive.
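Counting arrivals can be sketched as a sliding window per neighbor. This is a minimal illustration; the window size of 10 intervals and the class name `OgmCounter` are assumptions.

```python
from collections import deque


class OgmCounter:
    """Tracks, per neighbor, which of the last `window` expected OGMs arrived."""

    def __init__(self, window: int = 10):
        self.window = window
        self.seen = {}  # neighbor -> deque of booleans (True = OGM received)

    def record(self, neighbor: str, received: bool) -> None:
        slots = self.seen.setdefault(neighbor, deque(maxlen=self.window))
        slots.append(received)  # oldest entry drops out once the window is full

    def quality(self, neighbor: str) -> float:
        """Reception rate in [0, 1]; 0.0 for a neighbor never observed."""
        slots = self.seen.get(neighbor)
        return sum(slots) / len(slots) if slots else 0.0


counter = OgmCounter(window=10)
for arrived in [True] * 8 + [False] * 2:
    counter.record("n7", arrived)
print(counter.quality("n7"))  # 0.8
```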
Traffic distributed proportionally to route weight. Good paths get more traffic, but never all. No single bottleneck node.
Overloaded nodes report queue pressure. Traffic naturally avoids congested paths — like water flowing around rocks.
Successful deliveries strengthen a route. Timeouts weaken it. Unused routes fade naturally. The network learns.
Where is the target node? Ask locally first, then cluster, then region. Answers are cached. Scoped flooding only as last resort.
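The local, cluster, region, then flood lookup order can be sketched as a cascade over caches. The function signature, the plain-dict caches, and the `scoped_flood` callback are hypothetical stand-ins for whatever the protocol actually uses.

```python
def locate(node_id, local_cache, cluster_cache, region_cache, scoped_flood):
    """Resolve a node's location, trying the cheapest scope first.

    Returns (scope, location). Any answer is copied into the local cache so
    the next lookup is answered locally.
    """
    scopes = (("local", local_cache), ("cluster", cluster_cache), ("region", region_cache))
    for scope, cache in scopes:
        if node_id in cache:
            local_cache[node_id] = cache[node_id]
            return scope, cache[node_id]
    # Last resort: scoped flooding, modelled here as a callback that may return None.
    location = scoped_flood(node_id)
    if location is not None:
        local_cache[node_id] = location
    return "flood", location


local, cluster, region = {}, {"n42": "u0x8"}, {}
print(locate("n42", local, cluster, region, lambda _id: None))  # ('cluster', 'u0x8')
print(locate("n42", local, cluster, region, lambda _id: None))  # ('local', 'u0x8')
```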
In flooding, every hop multiplies transmissions across the entire network. In System 5, every hop costs exactly one transmission. This single change unlocks everything.
Every node rebroadcasts at every hop. At 100 nodes and 5 hops, a single message generates 330,000+ transmissions. The hop limit (default 3) is a survival mechanism — without it, the network drowns.
Only the forwarding node transmits. At 100 nodes and 5 hops, a single message generates 5 transmissions. No hop limit needed — 20 hops cost the same as flooding costs for 1.
| Scenario | Nodes | 3-hop Del. | 5-hop Del. | 7-hop Del. | Sys5 Del. | Sys5 TX |
|---|---|---|---|---|---|---|
| Small Local | 20 | 100% | 100% | 100% | 100% | 115 |
| Medium City | 100 | 92% | 100% | 100% | 100% | 402 |
| Large Regional | 500 | 14% | 31% | 51% | 76% | 412k |
| 1000 Nodes | 1000 | 2% | 6% | 6% | 43% | 182k* |
| 1500 Nodes | 1500 | 2% | 4% | 5% | 42% | 197k* |
* At 1000+ nodes, System 5 uses more total TX than managed flooding (182k vs 78k) — but delivers 7x more messages. The per-delivered-message cost (TX/delivered) still favors System 5: ~4,229 vs ~13,065.
Critical finding: With Meshtastic's realistic hop limit (3–7), managed flooding's delivery rate collapses at scale. At 1000 nodes, at most 6% of messages arrive at any hop limit from 3 to 7. System 5 delivers about 7x more messages in the same scenario. The hop limit doesn't just cap range: it makes large networks fundamentally unreliable.
No more artificial hop limits. Messages can traverse 20, 30, or 50 hops at the same per-hop cost. The network's range is limited only by node density, not by protocol constraints.
Only the forwarding node transmits per hop — not every node in range. Nodes far from the path sleep through. Battery life increases from hours to weeks.
With cheap hops, SHORT_FAST with more hops works as well as LONG_SLOW with fewer hops — at higher data rates. Choose the preset for your local conditions, not for the network's range limit.
Watch step by step how nodes discover each other, form clusters, elect border nodes, and establish multi-path routes.
This animation shows how a System 5 mesh network self-organizes from powered-on nodes to a fully routed, load-balanced mesh.
Three scenarios showing how System 5 adapts across vastly different scales. Watch the node activity logs on the right.
12 nodes within ~3km. Direct LoRa links. Single geo-cluster. Full internal topology known to all nodes. Multi-path routing with load balancing.
Clusters in major cities connected via MQTT gateways and long-range relays. Border nodes bridge clusters. DNS-like cache resolves node positions.
Continental super-clusters connected via internet backbone (MQTT). Hierarchical geohash addressing. Cascading DNS cache for node discovery.
Click on nodes, links, or the internet gateway to simulate failures. Watch the network adapt in real-time.
A node dies (battery, hardware). Its routes break instantly. System 5 switches to cached backup routes in 0ms. If a border node dies, the second border node takes over. If all borders die, the cluster falls back to flooding.
GPS module fails or loses signal. The node can't compute its geohash. Fallback: neighbor consensus — if 4 of 5 neighbors say "u0x8", the node adopts that cluster. If no neighbors have GPS: "homeless" mode with local flooding.
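The neighbor-consensus fallback can be sketched as a simple majority vote with a quorum. The 4-of-5 threshold comes from the text. The function name, the use of `None` for neighbors without GPS, and returning `None` when there is no quorum are assumptions.

```python
from collections import Counter


def adopt_cluster(neighbor_clusters, quorum=(4, 5)):
    """Pick a cluster for a node without GPS from its neighbors' reports.

    neighbor_clusters: geohash prefixes reported by neighbors; None means the
    neighbor has no GPS fix. Returns the agreed cluster, or None when no
    neighbor has GPS ("homeless" mode) or agreement is below the quorum.
    """
    reports = [c for c in neighbor_clusters if c is not None]
    if not reports:
        return None  # no neighbor has GPS: homeless mode with local flooding
    cluster, votes = Counter(reports).most_common(1)[0]
    need_num, need_den = quorum
    # Integer comparison of votes/len(reports) >= need_num/need_den.
    if votes * need_den >= len(reports) * need_num:
        return cluster
    return None  # assumption: no quorum is treated like having no answer


print(adopt_cluster(["u0x8", "u0x8", "u0x8", "u0x8", "u0x9"]))  # u0x8
print(adopt_cluster([None, None]))                              # None
```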
Internet-based MQTT links between cities fail. The LoRa relay subnet activates — a chain of small relay nodes bridges clusters via pure radio. Slower (more hops) but functional. The green chain below the clusters is this backup path.
As the Network Health Score drops, low-priority traffic is automatically blocked. SOS (P0) always gets through, even at 1% network health. Firmware updates (P7) only when the network is perfect. The network breathes: less traffic under stress = self-healing.
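Priority gating by health score can be sketched as a threshold per priority level. The two endpoints come from the text (P0 always passes, P7 needs a perfect network). The linear thresholds in between and the function name `admits` are assumptions for illustration only.

```python
MAX_PRIORITY = 7  # P0 = SOS (highest priority), P7 = firmware updates (lowest)


def admits(priority: int, health_percent: int) -> bool:
    """True if traffic of this priority may be sent at the given health score.

    Assumed rule: priority p requires health >= p/7 * 100 percent, so P0 needs
    0% and P7 needs 100%. Integer arithmetic avoids rounding at the boundary.
    """
    if not 0 <= priority <= MAX_PRIORITY:
        raise ValueError("priority must be in 0..7")
    return health_percent * MAX_PRIORITY >= priority * 100


print(admits(0, 1))    # True: SOS gets through even at 1% health
print(admits(7, 99))   # False: firmware updates wait for a perfect network
print(admits(7, 100))  # True
```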
Python simulation comparing 6 routers (Naive Flood, Managed Flood at 3/5/7 hops, Next-Hop, System 5) across 21 scenarios (20–1500 nodes) — with realistic Meshtastic hop limits revealing delivery collapse at scale.
| Scenario | Nodes | Naive TX | Managed TX | Next-Hop TX | Sys5 TX | S5 Delivery | S5 vs Managed |
|---|---|---|---|---|---|---|---|
System 5 maintains high delivery even under failures
How much the most loaded node has to transmit — lower is better
High-priority traffic gets through even when the network is degraded
With realistic hop limits (3–7), managed flooding not only wastes bandwidth, it also fails to deliver. At 500 nodes with hop limit 7, only 51% of messages arrive. At 1000 nodes, only 6%. System 5 delivers about 7x more messages at 1000 nodes (43% vs 6%), using fewer total transmissions per delivered message. The hop limit is not a safety net: it is the primary scaling barrier that makes large mesh networks fundamentally unreliable.
Quick reference for the technical terminology used throughout this document.
The hop limit doesn't just cap range — it prevents delivery. At 500+ nodes, managed flooding delivers as few as 2–14% of messages. System 5 breaks through this barrier with directed routing at ~1 TX per hop. Across 21 scenarios, System 5 dominates small-to-medium networks and delivers significantly more at scale — though very large sparse networks remain challenging for any protocol.
Complete analysis with problem statement, all five approaches evaluated, mathematical scoring, resilience design, QoS architecture, and the project roadmap.