Revolutionizing Network Management with TCP BBR: A Comparative Analysis with CUBIC and Reno
TCP BBR (Bottleneck Bandwidth and Round-trip propagation time) has emerged as a groundbreaking congestion control algorithm, offering substantial improvements over its predecessors, such as CUBIC and Reno. This blog post delves into the workings of TCP BBR, its comparative advantages, and its application in enhancing the manageability of MikroTik routers, particularly within environments such as those deployed by MikroCloud.
Understanding TCP BBR
Developed by Google, TCP BBR aims to optimize the throughput and latency of internet traffic by accurately estimating the network's bottleneck bandwidth and round-trip time. Unlike traditional congestion control algorithms like CUBIC and Reno, which rely on packet loss as an indicator of congestion, BBR focuses on maximizing bandwidth utilization while minimizing latency, making it exceptionally well-suited for today's high-speed networks, particularly over unstable or lossy paths.
TCP BBR vs. CUBIC and Reno
Traditional Congestion Control
CUBIC and Reno, the more traditional TCP congestion control algorithms, increase the data transmission rate until packet loss occurs, indicating congestion. They then reduce the transmission rate in response. This approach, while effective in simpler network conditions, often leads to underutilization of available bandwidth, especially in networks where packet loss does not necessarily signal congestion, such as those with high random packet loss rates.
BBR's Approach
TCP BBR, on the other hand, proactively measures the network's actual throughput capacity and round-trip time, allowing it to adjust its data transmission rate more accurately. By doing so, BBR can maintain higher throughput rates and lower latency, even in the presence of packet loss unrelated to congestion.
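On a Linux server, switching to BBR is a small configuration change. The commands below are a minimal sketch, assuming a kernel of 4.9 or newer with the tcp_bbr module available; BBR is typically paired with the fq queueing discipline for packet pacing.
# Load the BBR module (built in or loadable on 4.9+ kernels)
modprobe tcp_bbr
# Check which congestion control algorithms the kernel offers
sysctl net.ipv4.tcp_available_congestion_control
# Use fq for pacing and switch the congestion control to BBR
sysctl -w net.core.default_qdisc=fq
sysctl -w net.ipv4.tcp_congestion_control=bbr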
MikroCloud and TCP BBR
MikroCloud utilizes TCP-based tunnels for connecting routers to management servers. These servers, running TCP BBR, significantly improve the manageability and reliability of connections, especially in scenarios prone to unstable network conditions. The beauty of TCP BBR is that only the server side needs to implement it for clients (e.g., MikroTik devices) to benefit from its advanced congestion control capabilities.
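Because only the server side needs the change, it can also be verified there. As a rough check, assuming a modern Linux management server, ss can report which congestion control algorithm each established connection is using:
# Show TCP internals (-i) for established sockets; the congestion control
# algorithm in use (e.g. bbr) appears in each socket's info line
ss -ti state established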
TCP BBR in Action: A Practical Example
Consider a scenario where a network experiences a 1.5% packet loss rate. Traditional algorithms like CUBIC or Reno might drastically reduce throughput, interpreting this loss as a sign of congestion. However, a server employing TCP BBR would better differentiate between congestion-related packet loss and other types of packet loss, maintaining higher throughput levels.
Real-World Testing and Results
Testing BBR's effectiveness involved comparing its performance with the Cubic algorithm under conditions of added latency and packet loss. The introduction of a 140ms round-trip delay, simulating the latency between San Francisco and Amsterdam, and a minor packet loss of 1.5%, highlighted BBR's superiority. While Cubic's throughput dramatically dropped in the face of packet loss, BBR maintained a more reliable throughput level, demonstrating its resilience and efficiency.
To simulate the latency, netem is used to add 70ms of delay; with 70ms added in each direction of the path, the round trip comes to 140ms.
tc qdisc replace dev enp0s20f0 root netem latency 70ms
A quick ping confirms the 140ms round-trip time:
hannes@compute-000:~# ping 154.66.114.121
PING 154.66.114.121 (154.66.114.121) 56(84) bytes of data.
64 bytes from 154.66.114.121: icmp_seq=1 ttl=61 time=140 ms
64 bytes from 154.66.114.121: icmp_seq=2 ttl=61 time=140 ms
64 bytes from 154.66.114.121: icmp_seq=3 ttl=61 time=140 ms
Ok, time for our first tests. We'll start with Cubic, as it is the most common TCP congestion control algorithm in use today.
sysctl -w net.ipv4.tcp_congestion_control=cubic
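The exact iperf invocation isn't shown in the captured output; assuming iperf3 with the remote host from the ping test running the iperf server, a 30-second run would look roughly like this:
# 30-second TCP throughput test towards the remote end of the path
iperf3 -c 154.66.114.121 -t 30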
A 30-second iperf run shows an average transfer speed of 347 Mb/s. This is the first clue of the effect of latency on TCP throughput.
The only thing that changed from our initial test (2.35 Gb/s) is the introduction of 140ms of round-trip delay. Let's now set the congestion control algorithm to bbr and test again.
sysctl -w net.ipv4.tcp_congestion_control=bbr
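Before rerunning the test, it's worth confirming that the change took effect (on older kernels the tcp_bbr module may need to be loaded first, as shown earlier):
# Should now report: net.ipv4.tcp_congestion_control = bbr
sysctl net.ipv4.tcp_congestion_control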
The result is very similar: the 30-second average is now 340 Mb/s, slightly lower than with Cubic. So far, no real changes.
The effect of packet loss on throughput
We're going to repeat the same test as above, but with the addition of a minor amount of packet loss. With the command below, we're introducing 1.5% packet loss on the server (sender) side only.
tc qdisc replace dev enp0s20f0 root netem loss 1.5% latency 70ms
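To double-check that both the delay and the loss are in place on the interface, the netem qdisc can be inspected:
# Lists the qdisc on the interface, including the configured delay and loss
tc qdisc show dev enp0s20f0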
The first test with Cubic shows a dramatic drop in throughput: it falls from 347 Mb/s to 1.23 Mb/s. That's a drop of roughly 99.5%, leaving the link essentially unusable for today's bandwidth needs.
If we repeat the exact same test with BBR, we see a significant improvement over Cubic. With BBR the throughput drops to 153 Mb/s, which is only a 55% drop.
The tests above show the effect of packet loss and latency on TCP throughput.
The impact of just a minor amount (1.5%) of packet loss on a long-latency path is dramatic.
Using anything other than BBR on these longer paths will cause significant issues when there is even a minor amount of packet loss.
Only BBR maintains a decent throughput number at anything more than 1.5% loss.
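For servers that should keep using BBR across reboots, the settings can be persisted via sysctl; the file name below is just an example.
# Persist the qdisc and congestion control settings (file name is illustrative)
cat <<'EOF' > /etc/sysctl.d/90-tcp-bbr.conf
net.core.default_qdisc=fq
net.ipv4.tcp_congestion_control=bbr
EOF
# Apply all sysctl configuration files
sysctl --system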
Conclusion
TCP BBR represents a significant leap forward in congestion control technology, offering enhanced throughput and reduced latency compared to traditional algorithms like CUBIC and Reno. Its ability to accurately assess network conditions and adjust transmission rates accordingly makes it particularly advantageous for managing TCP-based tunnels to servers, as seen with MikroCloud's deployment. By employing servers running TCP BBR, network administrators can ensure more stable and efficient management connections, even over inherently unreliable links. This advancement underscores the importance of adopting modern congestion control mechanisms to meet the demands of contemporary network environments.