Research & Algorithm Design

MeshRoute

Finding the optimal communication path between LoRa mesh devices — before a single byte of payload is transmitted.

LoRa / Meshtastic · Multi-Path · Load Balanced · Geo-Clustered

by Clemens Simon

00 — TL;DR

Current Meshtastic flooding wastes 92–100% of bandwidth in small-medium networks and fails to deliver at scale. System 5 combines geo-clustering, multi-path routing, and adaptive QoS into one self-healing protocol. For networks up to ~200 nodes: 100% delivery with 92–99.9% less bandwidth. At 500+ nodes: higher delivery than managed flooding (76% vs 51% at 500 nodes), though large sparse networks remain challenging for any protocol. The hop limit — today's biggest scaling bottleneck — becomes irrelevant: each hop costs ~1 transmission instead of n.

100% Del. ≤200
99.9% BW Saved
~1 TX/Hop
Hop Limit
21 Scenarios
01 — The Problem

Why Current Mesh Routing Fails

Meshtastic routes by flooding: nodes rebroadcast the packets they receive. This works for tiny networks but collapses at scale.

📡

Blind Flooding

Every message is rebroadcast by every node that receives it. A single message to one recipient generates n transmissions across the entire network.

LoRa Constraints

1–50 kbps bandwidth. 1% duty cycle (EU law). Half-duplex radio. Each packet takes 50ms–2s airtime. Budget: 36 s of airtime per hour, or ~18–720 packets/hour/node.
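The packet budget follows directly from the duty cycle and the per-packet airtime. A minimal sketch of that arithmetic, assuming the 1% limit is applied per hour; the function name `packets_per_hour` is illustrative and not part of any Meshtastic API:

```python
def packets_per_hour(airtime_s: float, duty_cycle: float = 0.01) -> float:
    """Upper bound on packets a node may send per hour under a duty-cycle limit."""
    airtime_budget_s = 3600 * duty_cycle  # 36 s of transmit time per hour at 1%
    return airtime_budget_s / airtime_s


# Airtime values taken from the 50 ms to 2 s range quoted above.
for airtime in (0.05, 1.0, 2.0):
    print(f"{airtime:.2f} s airtime: {packets_per_hour(airtime):.0f} packets/hour")
```

A 50 ms packet leaves room for about 720 packets per hour, a 1 s packet for 36, and a 2 s packet for 18.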

🔋

Energy Waste

Every node transmits on every message — even nodes nowhere near the intended path. Battery-powered devices drain in hours instead of weeks.

💥

Collision Chaos

Multiple nodes rebroadcast simultaneously. LoRa is half-duplex — collisions destroy packets, triggering more retransmissions. A vicious cycle.

🛑

The Hop Limit Wall

Meshtastic caps hops at 3–7 to prevent flood storms. But this kills range: a message can't reach nodes beyond the limit. Every extra hop adds up to n more transmissions, so the limit can't be raised without drowning the network.

02 — Routing Approaches

State of the Art vs. New Proposal

Same network topology, four different routing strategies. Watch the TX counter — it tells the whole story.

STATE OF THE ART — Meshtastic Today
BASELINE

Naive Flooding

Theoretical worst case — every node rebroadcasts everything. Not actually used by Meshtastic, but useful as a reference.

Live — every node lights up red

How It Works

Source sends a packet. Every receiving node rebroadcasts it once. The entire network participates in every message.

Cost Per Message

TX = n (one per node)

+ Strengths
  • Maximum reliability
  • Zero setup
− Weaknesses
  • O(n) TX per message
  • All batteries drain equally
  • Collapses at ~40 nodes
MESHTASTIC CURRENT

Managed Flooding

The actual Meshtastic approach. SNR-based suppression: distant nodes rebroadcast first, close nodes suppress. ROUTER nodes always rebroadcast.

Live — gray nodes = suppressed (saved a TX)

How It Works

Before rebroadcasting, each node listens briefly. If it hears another node already rebroadcast, it suppresses its own transmission. Distant nodes (low SNR) get shorter delays and rebroadcast first. Close nodes wait and often suppress.

ROUTER-role nodes (marked R) override suppression — they always rebroadcast to ensure backbone coverage.
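The suppression rule can be sketched as a toy model. This is a simplified illustration, not the firmware's actual timing formula: the SNR range, the 500 ms window, the linear delay mapping, and the assumption that every node hears every other node are all hypothetical.

```python
from dataclasses import dataclass

SNR_MIN_DB, SNR_MAX_DB = -20.0, 10.0  # assumed SNR range, for illustration only
MAX_DELAY_MS = 500.0                  # assumed contention window, for illustration only


def rebroadcast_delay_ms(snr_db: float) -> float:
    """Lower SNR (more distant node) maps to a shorter delay."""
    clamped = min(max(snr_db, SNR_MIN_DB), SNR_MAX_DB)
    return (clamped - SNR_MIN_DB) / (SNR_MAX_DB - SNR_MIN_DB) * MAX_DELAY_MS


@dataclass
class Node:
    name: str
    snr_db: float
    is_router: bool = False


def simulate_rebroadcast(nodes):
    """Return the names of nodes that transmit, in order of their timers firing."""
    transmitted = []
    for node in sorted(nodes, key=lambda n: rebroadcast_delay_ms(n.snr_db)):
        heard_rebroadcast = bool(transmitted)
        # ROUTER nodes override suppression; others stay silent once they hear a copy.
        if node.is_router or not heard_rebroadcast:
            transmitted.append(node.name)
    return transmitted


nodes = [Node("near", 8), Node("far", -15), Node("mid", 0), Node("router", 5, True)]
print(simulate_rebroadcast(nodes))
```

In this example only the most distant node and the ROUTER node transmit; the other two are suppressed.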

Cost Per Message

TX ≈ 0.4n – 0.6n (~50% suppression)

+ Strengths
  • ~50% less TX than naive
  • Self-organizing
  • Proven in 100+ node meshes
− Weaknesses
  • Still O(n) per message
  • Hop limit still needed
  • No path intelligence
MESHTASTIC v2.6

Next-Hop Routing

New in v2.6 — for direct messages only. Learns the relay node, then sends only via that one node. Falls back to flooding if it fails.

Live — watch the 3 phases: learn, direct, fallback

How It Works

Phase 1: First message uses managed flooding. The system tracks which node successfully relayed.

Phase 2: Subsequent messages go only via the learned next-hop node (marked NH). One relay instead of the whole network.

Phase 3: If the next-hop dies, the system falls back to managed flooding and learns a new relay.
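The three phases amount to a small per-destination cache. A minimal sketch, assuming delivery success and failure are reported back to the sender; the class and method names are hypothetical and do not mirror the Meshtastic firmware:

```python
class NextHopRouter:
    """Toy model of learn, direct, fallback for direct messages."""

    def __init__(self):
        self.next_hop = {}  # destination id -> learned relay id

    def choose(self, dest):
        """Phase 2 if a relay is cached, otherwise Phase 1 (managed flooding)."""
        relay = self.next_hop.get(dest)
        if relay is not None:
            return ("direct", relay)
        return ("flood", None)

    def on_delivered(self, dest, relay):
        """Remember which relay carried the message successfully."""
        self.next_hop[dest] = relay

    def on_failed(self, dest):
        """Phase 3: forget the relay so the next message floods and relearns."""
        self.next_hop.pop(dest, None)


router = NextHopRouter()
print(router.choose("D"))      # no relay known yet
router.on_delivered("D", "B")
print(router.choose("D"))      # goes via the learned relay
router.on_failed("D")
print(router.choose("D"))      # back to flooding
```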

Limitations

Only works for direct messages (unicast). Broadcasts still use managed flooding. Only learns one hop — not a full path.

+ Strengths
  • Huge TX reduction for DMs
  • Graceful fallback
  • Backward compatible
− Weaknesses
  • Broadcasts unchanged
  • Single relay, not full path
  • No load balancing
NEW PROPOSAL — System 5
PROPOSED

System 5 — Adaptive Load-Balanced Mesh

Geo-clustering + multi-path + weighted load balancing + adaptive QoS + self-healing. ~1 TX per hop for all message types.

Live — packets follow weighted routes, load bars adapt

Weight Function

W(r) = α·Q(r) + β·(1−Load(r)) + γ·Batt(r)

Traffic Distribution

Share(r) = W(r) / Σ W(all)

Key Properties

  • Traffic flows proportionally — good paths get more, not all
  • Back-pressure: overloaded nodes shed traffic automatically
  • Battery-aware: low-power nodes get fewer packets
  • Pheromone decay: unused paths fade, successful paths strengthen
  • Works for all message types — unicast and broadcast
  • ~1 TX per hop — no hop limit needed

vs. Managed Flooding

Managed flooding suppresses ~50% of rebroadcasts but still scales as O(n). System 5 routes along specific paths, so cost scales with hop count, not network size. At 100 nodes with ~50% suppression: managed flood = ~50 TX per message, System 5 = ~2 TX.

vs. Next-Hop Routing

Next-hop learns a single relay node for direct messages. System 5 maintains 2-3 full paths with weighted load distribution for all traffic types. When a path fails, the next cached path activates instantly — no flooding fallback needed.

+ Strengths
  • ~1 TX/hop (not ~n)
  • No hop limit needed
  • Self-optimizing load balance
  • Works for all traffic types
− Complexity
  • Most complex to implement
  • Requires GPS for geo-clustering
  • Tuning parameters (α,β,γ)
03 — Mathematical Analysis

Quantitative Comparison

Formal evaluation of all four routing approaches across seven weighted criteria.

Naive Flooding — TX Cost Per Message

TX_naive = n

Every node rebroadcasts once. At n=100: 100 transmissions per message. Cost grows linearly with network size.

Managed Flooding — SNR-Based Suppression

TX_managed = n × (1 − S)   where S ≈ 0.4–0.6

S = suppression rate (fraction of nodes that hear a rebroadcast and stay silent). Depends on density and SNR distribution. At n=100 with S=0.5: ~50 transmissions. Still O(n) but ~50% cheaper than naive.

Next-Hop Routing — Learn & Cache (DMs Only)

TX_first = n × (1 − S)    TX_cached = d

First message floods (managed). After learning: d = hop count to destination via cached relay. Amortized cost depends on cache hit rate. Broadcasts still use managed flooding.

System 5 — Multi-Path Directed Routing

TX_sys5 = d   (hop count, always)

Every message — unicast and broadcast — follows a pre-computed path. Cost = hop count, independent of network size. At n=100, d=2: 2 transmissions. With fallback: scoped cluster flooding adds O(cluster_size) in worst case.
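The four cost formulas can be evaluated side by side. A short sketch using the document's symbols (n = nodes, S = suppression rate, d = hop count); the function names are illustrative:

```python
def tx_naive(n: int) -> float:
    """Every node rebroadcasts once."""
    return n


def tx_managed(n: int, s: float = 0.5) -> float:
    """A fraction s of nodes suppress their rebroadcast."""
    return n * (1 - s)


def tx_next_hop(n: int, d: int, cached: bool, s: float = 0.5) -> float:
    """First message floods (managed); cached messages cost d hops."""
    return d if cached else tx_managed(n, s)


def tx_system5(d: int) -> float:
    """Cost equals hop count, independent of n (fallback flooding not modeled)."""
    return d


n, d = 100, 2
print("naive:", tx_naive(n))
print("managed:", tx_managed(n))
print("next-hop (first / cached):", tx_next_hop(n, d, False), "/", tx_next_hop(n, d, True))
print("system 5:", tx_system5(d))
```

At n=100 and d=2 this gives 100, 50, 50 then 2, and 2 transmissions respectively.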

Route Weight Function — System 5 Load Balancing

W(r) = α · Q(r) + β · (1 − Load(r)) + γ · Batt(r)

Q = link quality (OGM reception rate), Load = queue pressure, Batt = min battery along route.
Traffic share: Share(r) = W(r) / Σ W(all). Tuning: α=0.4, β=0.35, γ=0.25
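A minimal sketch of the weighting and traffic split, reading the middle term as 1 − Load(r) and assuming Q, Load, and Batt are all normalized to the range 0 to 1. The route values and the helper names are made up for illustration:

```python
import random

ALPHA, BETA, GAMMA = 0.4, 0.35, 0.25  # tuning values quoted in the text


def route_weight(quality: float, load: float, battery: float) -> float:
    """W(r): weighted sum of link quality, free capacity, and battery."""
    return ALPHA * quality + BETA * (1 - load) + GAMMA * battery


def traffic_shares(routes: dict) -> dict:
    """Share(r) = W(r) / sum of all weights."""
    weights = {name: route_weight(*metrics) for name, metrics in routes.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def pick_route(routes: dict, rng=random) -> str:
    """Choose a route at random, proportionally to its share."""
    shares = traffic_shares(routes)
    names = list(shares)
    return rng.choices(names, weights=[shares[n] for n in names], k=1)[0]


# Hypothetical routes: (quality, load, battery)
routes = {"A": (0.9, 0.2, 0.8), "B": (0.6, 0.5, 0.9), "C": (0.4, 0.1, 0.3)}
for name, share in traffic_shares(routes).items():
    print(f"route {name}: {share:.1%} of traffic")
```

The best route gets the largest share but not all of the traffic, which is the proportional behavior described above.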

SCORING MATRIX (0–10, WEIGHTED)

Criterion (Weight) | Naive Flood | Managed Flood | Next-Hop | System 5
TX Cost per Message (20%) | 1 | 4 | 5 | 10
Delivery Reliability (20%) | 9 | 9 | 8 | 9
Scalability (15%) | 1 | 3 | 4 | 9
Fault Tolerance (15%) | 8 | 8 | 7 | 9
Hop Limit Freedom (10%) | 1 | 2 | 3 | 10
Energy Efficiency (10%) | 1 | 3 | 5 | 8
Broadcast Support (10%) | 10 | 10 | 3 | 9
WEIGHTED TOTAL | 4.3 | 5.5 | 5.1 | 9.2

FINAL SCORES

Naive Flood: 4.3 · Managed Flood: 5.5 · Next-Hop: 5.1 · System 5: 9.2
04 — System 5 Architecture

What System 5 Adds Beyond Managed Flooding

Meshtastic's managed flooding is clever — but still O(n). System 5 borrows six proven concepts from networking research to achieve O(hops).

from Internet Routing

OSPF Areas → Geo-Clusters

Nodes self-organize by geohash prefix. Full topology within cluster, summarized routes between. Scales from 10 to 10,000+ nodes.
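Clustering by geohash prefix can be sketched with the standard geohash encoding (interleaved longitude and latitude bits, base-32 alphabet). The prefix length of 4 used for `cluster_id` is an assumption for illustration; the text does not state which length System 5 uses.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a position as a geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, value = [], 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = lon >= mid
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = (value << 1) | int(bit)
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def cluster_id(lat: float, lon: float, prefix_len: int = 4) -> str:
    """Nodes sharing this prefix would fall into the same geo-cluster."""
    return geohash_encode(lat, lon, prefix_len)


print(geohash_encode(57.64911, 10.40744, 5))
print(cluster_id(57.64911, 10.40744))
```

A shorter prefix covers a larger area, so the prefix length controls cluster size.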

from Freifunk / B.A.T.M.A.N.

OGM Counting → Quality Metric

Periodic originator messages. Count reception rate per neighbor. No complex calculation — just count how many arrive.
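Counting receptions over a sliding window is enough to produce a link quality value. A simplified sketch, assuming one expected OGM per interval and a window of 10 intervals (both assumptions; B.A.T.M.A.N. itself tracks sequence numbers). The class name is hypothetical:

```python
from collections import deque


class OgmCounter:
    """Per-neighbor reception rate over the last `window` expected OGMs."""

    def __init__(self, window: int = 10):
        self.window = window
        self.slots = {}  # neighbor id -> deque of 1 (heard) / 0 (missed)

    def record(self, neighbor: str, heard: bool) -> None:
        """Call once per expected OGM interval for each known neighbor."""
        history = self.slots.setdefault(neighbor, deque(maxlen=self.window))
        history.append(1 if heard else 0)

    def quality(self, neighbor: str) -> float:
        """Fraction of recent OGMs that arrived; 0.0 for unknown neighbors."""
        history = self.slots.get(neighbor)
        return sum(history) / len(history) if history else 0.0


counter = OgmCounter(window=10)
for heard in [True] * 8 + [False] * 2:
    counter.record("node-7", heard)
print(counter.quality("node-7"))
```

The resulting value could serve as Q in the route weight.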

from Data Centers

Weighted ECMP → Load Balancing

Traffic distributed proportionally to route weight. Good paths get more traffic, but never all. No single bottleneck node.

from Network Theory

Back-Pressure → Congestion Control

Overloaded nodes report queue pressure. Traffic naturally avoids congested paths — like water flowing around rocks.

from Ant Colony Optimization

Pheromone Decay → Self-Optimization

Successful deliveries strengthen a route. Timeouts weaken it. Unused routes fade naturally. The network learns.
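One way to express these three rules is a score between 0 and 1 per route. The constants and the exact update formulas below are assumptions chosen for illustration; the text does not specify them.

```python
REINFORCE = 0.2  # assumed: fraction of the remaining headroom gained per success
PENALTY = 0.5    # assumed: multiplicative cut on timeout
DECAY = 0.95     # assumed: multiplicative fade per idle tick


def on_success(pheromone: float) -> float:
    """A successful delivery strengthens the route, saturating at 1.0."""
    return min(1.0, pheromone + REINFORCE * (1.0 - pheromone))


def on_timeout(pheromone: float) -> float:
    """A timeout weakens the route sharply."""
    return pheromone * PENALTY


def on_tick(pheromone: float) -> float:
    """Unused routes fade a little on every tick."""
    return pheromone * DECAY


p = 0.5
p = on_success(p)   # strengthened
p = on_timeout(p)   # weakened
for _ in range(10):
    p = on_tick(p)  # fades while unused
print(round(p, 3))
```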

from DNS

Hierarchical Cache → Node Discovery

Where is the target node? Ask locally first, then cluster, then region. Answers are cached. Scoped flooding only as last resort.
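The lookup order can be sketched with plain dictionaries standing in for the three cache levels. In a real mesh the cluster and region lookups would be radio queries; the function name, the dictionary caches, and the `scoped_flood` callback are all hypothetical.

```python
def locate(node_id, local_cache, cluster_cache, region_cache, scoped_flood):
    """Resolve a node's location: local, then cluster, then region, then flood.

    Returns (scope, location); location is None if nothing answered.
    """
    levels = (("local", local_cache), ("cluster", cluster_cache), ("region", region_cache))
    for scope, cache in levels:
        if node_id in cache:
            location = cache[node_id]
            local_cache[node_id] = location  # cache the answer for next time
            return scope, location
    location = scoped_flood(node_id)  # last resort
    if location is not None:
        local_cache[node_id] = location
    return "flood", location


local, cluster, region = {}, {"node-7": "u281"}, {}
print(locate("node-7", local, cluster, region, lambda _id: None))  # answered by the cluster
print(locate("node-7", local, cluster, region, lambda _id: None))  # now answered locally
```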

05 — The Killer Argument

Why Hop Limits Become Irrelevant

In flooding, every hop multiplies transmissions across the entire network. In System 5, every hop costs exactly one transmission. This single change unlocks everything.

Flooding: Cost per Hop = n

TXtotal = n × hops

Every node rebroadcasts at every hop. At 100 nodes and 5 hops, a single message can generate up to 500 transmissions (100 × 5). The hop limit (default 3) is a survival mechanism: without it, the network drowns.

Hop 1: 100 TX
Hop 2: 100 TX
Hop 3: 100 TX
Hop 4: BLOCKED (hop limit)

System 5: Cost per Hop = 1

TXtotal = hops

Only the forwarding node transmits. At 100 nodes and 5 hops, a single message generates 5 transmissions. No hop limit needed: 20 hops cost 20 transmissions, a fifth of what a single flooding hop costs at 100 nodes.

Hop 1: 1 TX
Hop 2: 1 TX
Hop 3: 1 TX
Hop 4: 1 TX
Hop 5: 1 TX
...
Hop 20: 1 TX

Verified by Simulation (Realistic Hop Limits)

Scenario | Nodes | 3-hop Del. | 5-hop Del. | 7-hop Del. | Sys5 Del. | Sys5 TX
Small Local | 20 | 100% | 100% | 100% | 100% | 115
Medium City | 100 | 92% | 100% | 100% | 100% | 402
Large Regional | 500 | 14% | 31% | 51% | 76% | 412k
1000 Nodes | 1000 | 2% | 6% | 6% | 43% | 182k*
1500 Nodes | 1500 | 2% | 4% | 5% | 42% | 197k*

* At 1000+ nodes, System 5 uses more total TX than managed flooding (182k vs 78k) but delivers about 7x more messages. The cost per delivered message (TX/delivered) still favors System 5: ~4,229 vs ~13,065.

Critical finding: With Meshtastic's realistic hop limit (3–7), managed flooding's delivery rate collapses at scale. At 1000 nodes, at most 6% of messages arrive, even with the hop limit raised to 7. System 5 delivers about 7x more messages in the same scenario. The hop limit doesn't just cap range; it makes large networks fundamentally unreliable.

📶

Unlimited Range

No more artificial hop limits. Messages can traverse 20, 30, or 50 hops at the same per-hop cost. The network's range is limited only by node density, not by protocol constraints.

🔋

Battery Independence

Only the forwarding node transmits per hop — not every node in range. Nodes far from the path sleep through. Battery life increases from hours to weeks.

📡

Preset Freedom

With cheap hops, SHORT_FAST with more hops works as well as LONG_SLOW with fewer hops — at higher data rates. Choose the preset for your local conditions, not for the network's range limit.

06 — Key Metrics

System 5 by the Numbers

  • Delivery Reliability (%)
  • Bandwidth Reduction vs Flood (%)
  • Failover Time (ms)
  • Piggyback Overhead (B)
  • Routing Table Entries (n=1000)
  • Weighted Score (out of 10)
07 — Network Formation

How the Network Builds Itself

Watch step by step how nodes discover each other, form clusters, elect border nodes, and establish multi-path routes.

This animation shows how a System 5 mesh network self-organizes from powered-on nodes to a fully routed, load-balanced mesh.

08 — Real-World Scale

From Neighborhood to Worldwide

Three scenarios showing how System 5 adapts across vastly different scales. Watch the node activity logs on the right.

LOCAL

Neighborhood Mesh — Munich District

12 nodes within ~3km. Direct LoRa links. Single geo-cluster. Full internal topology known to all nodes. Multi-path routing with load balancing.

Legend: active node, packet (primary), packet (secondary), OGM beacon, back-pressure.
Nodes: 12
Range: ~3 km
Clusters: 1
Avg hops: 2.1
Latency: ~200ms
CONTINENTAL

Europe-Wide — Cross-Cluster Routing

Clusters in major cities connected via MQTT gateways and long-range relays. Border nodes bridge clusters. DNS-like cache resolves node positions.

Legend: cluster (city), gateway, data packet, DNS query, MQTT bridge.
Clusters: 8
Total nodes: ~2400
Bridge: MQTT+LoRa
Cluster hops: 3-5
Latency: ~2-8s
WORLDWIDE

Global Mesh — Continent-Spanning Network

Continental super-clusters connected via internet backbone (MQTT). Hierarchical geohash addressing. Cascading DNS cache for node discovery.

Legend: super-cluster, regional gateway, data packet, DNS cascade, internet backbone.
Continents: 5
Super-clusters: 24
Total nodes: ~50,000
Max hops: 12-18
Latency: 5-30s
09 — Resilience & Adaptive QoS

When Things Go Wrong

Click on nodes, links, or the internet gateway to simulate failures. Watch the network adapt in real-time.

Node Failure

A node dies (battery, hardware). Its routes break instantly. System 5 switches to cached backup routes in 0ms. If a border node dies, the second border node takes over. If all borders die, the cluster falls back to flooding.
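Switching to a cached backup route needs no new discovery, which is why it can happen without delay. A minimal sketch, assuming each destination keeps an ordered list of precomputed paths and that dead nodes are already known; the class name and the path representation are hypothetical:

```python
class RouteCache:
    """Ordered list of cached paths to one destination, best first."""

    def __init__(self, paths):
        self.paths = [list(p) for p in paths]

    def active_path(self, dead_nodes: set):
        """Return the first cached path that avoids all dead nodes.

        Returns None when every cached path is broken; the caller would
        then fall back to flooding within the cluster.
        """
        for path in self.paths:
            if not dead_nodes.intersection(path):
                return path
        return None


cache = RouteCache([["A", "B", "D"], ["A", "C", "D"]])
print(cache.active_path(set()))        # primary path
print(cache.active_path({"B"}))        # backup path takes over
print(cache.active_path({"B", "C"}))   # no path left
```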

📡

GPS Failure

GPS module fails or loses signal. The node can't compute its geohash. Fallback: neighbor consensus — if 4 of 5 neighbors say "u0x8", the node adopts that cluster. If no neighbors have GPS: "homeless" mode with local flooding.
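The neighbor consensus can be sketched as a majority vote. The 0.8 threshold mirrors the "4 of 5" example above; counting only neighbors that report a cluster, and staying in "homeless" mode when no cluster reaches the threshold, are assumptions not stated in the text.

```python
from collections import Counter


def adopt_cluster(neighbor_clusters, threshold: float = 0.8):
    """Pick a cluster from neighbor reports, or None for "homeless" mode.

    `neighbor_clusters` holds one geohash prefix per neighbor, or None for
    neighbors without a GPS fix.
    """
    known = [c for c in neighbor_clusters if c is not None]
    if not known:
        return None  # no neighbor has GPS
    cluster, votes = Counter(known).most_common(1)[0]
    return cluster if votes / len(known) >= threshold else None


print(adopt_cluster(["u0x8", "u0x8", "u0x8", "u0x8", "u0x9"]))
print(adopt_cluster([None, None]))
```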

🌐

MQTT Bridge Down

Internet-based MQTT links between cities fail. The LoRa relay subnet activates — a chain of small relay nodes bridges clusters via pure radio. Slower (more hops) but functional. The green chain below the clusters is this backup path.

Adaptive QoS

As the Network Health Score drops, low-priority traffic is automatically blocked. SOS (P0) always gets through, even at 1% network health. Firmware updates (P7) only when the network is perfect. The network breathes: less traffic under stress = self-healing.
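The priority gate can be sketched as a health threshold per priority level. Only the two endpoints come from the text (P0 always passes, P7 needs a perfect network); the linear thresholds in between are an assumption for illustration.

```python
def min_health_required(priority: int) -> float:
    """Minimum Network Health Score (0.0 to 1.0) at which a priority is admitted."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must be between 0 (SOS) and 7 (firmware update)")
    return priority / 7  # assumed linear ramp: P0 -> 0.0, P7 -> 1.0


def is_admitted(priority: int, nhs: float) -> bool:
    """True if traffic of this priority may be sent at the current health score."""
    return nhs >= min_health_required(priority)


for nhs in (1.0, 0.5, 0.01):
    admitted = [p for p in range(8) if is_admitted(p, nhs)]
    print(f"NHS {nhs:.2f}: priorities admitted {admitted}")
```

As health drops, the admitted set shrinks from the low-priority end, so traffic falls while the network is under stress.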

Click nodes or links to toggle failure • Dashed yellow = MQTT • Dashed green = LoRa relay chain
10 — Simulation Results

Real Numbers from the Simulator

Python simulation comparing 6 routers (Naive Flood, Managed Flood at 3/5/7 hops, Next-Hop, System 5) across 21 scenarios (20–1500 nodes) — with realistic Meshtastic hop limits revealing delivery collapse at scale.

Results table columns: Scenario, Nodes, Naive TX, Managed TX, Next-Hop TX, Sys5 TX, S5 Delivery, S5 vs Managed.

Charts:
  • Transmissions: Flooding vs System 5 (log or linear scale)
  • Delivery Rate Under Stress: System 5 maintains high delivery even under failures
  • Busiest Node (Max TX Load): how much the most loaded node has to transmit; lower is better
  • QoS Priority Gate (Stress Test): high-priority traffic gets through even when the network is degraded

With realistic hop limits (3–7), managed flooding not only wastes bandwidth, it fails to deliver. At 500 nodes with hop limit 7, only 51% of messages arrive. At 1000 nodes, only 6%. System 5 delivers up to about 7x more messages in the same scenarios, with fewer transmissions per delivered message. The hop limit is not a safety net; it is the primary scaling barrier that makes large mesh networks fundamentally unreliable.

11 — Glossary

Key Terms Explained

Quick reference for the technical terminology used throughout this document.

LoRa
Long Range radio modulation technique enabling low-power, long-distance communication (1-15 km).
Meshtastic
Open-source LoRa mesh firmware for off-grid communication on ESP32, nRF52, and similar low-power devices.
Flooding
Broadcasting packets to all nodes in the network — simple but wasteful at scale.
Geohash
Hierarchical geographic coordinate encoding that maps locations to short alphanumeric strings.
OSPF
Open Shortest Path First — widely used internet routing protocol with area-based hierarchy.
B.A.T.M.A.N.
Better Approach To Mobile Ad-hoc Networking — mesh protocol used by Freifunk community networks.
ECMP
Equal-Cost Multi-Path routing — distributing traffic across multiple routes of similar quality.
OGM
Originator Message — periodic beacon in B.A.T.M.A.N. used to discover and rank neighbors.
NHS
Network Health Score — our composite metric combining connectivity, load, and battery state.
QoS
Quality of Service — traffic prioritization ensuring critical messages (SOS) always get through.
MQTT
Lightweight publish/subscribe messaging protocol, used here as an internet bridge between geographically separated mesh clusters.
DTN
Delay Tolerant Networking — store-and-forward approach for networks with intermittent connectivity.
BFS
Breadth-First Search — graph traversal algorithm used for shortest-path discovery in mesh networks.
RSSI
Received Signal Strength Indicator — measures how strongly a radio signal is received (in dBm).
SNR
Signal-to-Noise Ratio — measures signal quality relative to background noise (in dB).
12 — Conclusion

The Path Forward

The hop limit doesn't just cap range; it prevents delivery. At 500+ nodes with a 3-hop limit, managed flooding delivers only 2–14% of messages. System 5 breaks through this barrier with directed routing at ~1 TX per hop. Across 21 scenarios, System 5 dominates small-to-medium networks and delivers significantly more at scale, though very large sparse networks remain challenging for any protocol.

100%
Delivery ≤200n
99.9%
BW Saved (best)
~1
TX per Hop
21
Scenarios

Read the Full Executive Summary

Complete analysis with problem statement, all five approaches evaluated, mathematical scoring, resilience design, QoS architecture, and the project roadmap.