IETF 97 iccrg slides: BBR Congestion Control
- 1. BBR Congestion Control. Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Soheil Hassas Yeganeh, Van Jacobson. IETF 97: Seoul, Nov 2016
- 2. Congestion and bottlenecks
- 3. Congestion and bottlenecks [figure: delivery rate vs. amount in flight, with BDP and BDP + BufSize marked]
- 4. [figure: delivery rate and RTT vs. amount in flight, with BDP and BDP + BufSize marked]
- 5. CUBIC / Reno [figure: operating point on the delivery-rate and RTT vs. amount-in-flight curves, near BDP + BufSize]
- 6. Optimal: max BW and min RTT (Gail & Kleinrock, 1981) [figure: optimal operating point at BDP on the same curves]
- 7. Estimating the optimal point (max BW, min RTT): BDP = (max BW) * (min RTT); est. max BW = windowed max of BW samples; est. min RTT = windowed min of RTT samples [figure: same delivery-rate/RTT curves]
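The windowed estimators on this slide can be sketched in Python. The `WindowedFilter` class, the window lengths, and the sample values below are illustrative assumptions, not the actual BBR implementation:

```python
from collections import deque

class WindowedFilter:
    """Keep (time, value) samples and report the max (or min) over a
    sliding time window -- a simplified stand-in for the windowed
    max-BW / min-RTT estimators described on this slide."""
    def __init__(self, window, keep_max=True):
        self.window = window          # window length in seconds
        self.keep_max = keep_max
        self.samples = deque()        # (timestamp, value) pairs

    def update(self, t, value):
        # Expire samples that have fallen out of the window.
        while self.samples and t - self.samples[0][0] > self.window:
            self.samples.popleft()
        self.samples.append((t, value))
        vals = [v for _, v in self.samples]
        return max(vals) if self.keep_max else min(vals)

# Estimate BDP = (max BW) * (min RTT) from per-ACK samples.
bw_filter = WindowedFilter(window=10.0, keep_max=True)    # BW samples
rtt_filter = WindowedFilter(window=10.0, keep_max=False)  # RTT samples

max_bw = bw_filter.update(t=0.0, value=12_500_000)  # bytes/sec (100 Mbps)
min_rtt = rtt_filter.update(t=0.0, value=0.040)     # seconds (40 ms)
bdp = max_bw * min_rtt                              # bytes
print(bdp)  # → 500000.0
```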
- 8. But to see both max BW and min RTT, must probe on both sides of BDP: with inflight below BDP only min RTT is visible; with inflight above BDP only max BW is visible [figure: same curves]
- 9. One way to stay near the (max BW, min RTT) point:
  - Model the network: update windowed max-BW and min-RTT estimates on each ACK
  - Control sending based on the model, to...
    - Probe both max BW and min RTT, to feed the model samples
    - Pace near the estimated BW, to reduce queues and loss [move queue to sender]
    - Vary the pacing rate to keep inflight near BDP (for a full pipe but a small queue)
  - That's BBR congestion control: BBR = Bottleneck Bandwidth and Round-trip propagation time
  - BBR seeks high tput with a small queue by probing BW and RTT sequentially
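The control side described above can be sketched as a pair of knobs derived from the model. The function name and the default gains are illustrative assumptions; real BBR varies its pacing and cwnd gains by state:

```python
def bbr_control(max_bw, min_rtt, pacing_gain=1.0, cwnd_gain=2.0):
    """Derive the two control knobs from the model, as sketched on this
    slide: pace near the estimated bottleneck BW, and bound inflight
    near BDP.  Gains here are illustrative placeholders."""
    bdp = max_bw * min_rtt               # bytes
    pacing_rate = pacing_gain * max_bw   # bytes/sec
    cwnd = cwnd_gain * bdp               # inflight bound in bytes
    return pacing_rate, cwnd

rate, cwnd = bbr_control(max_bw=12_500_000, min_rtt=0.040)
# pace at ~max BW; allow up to 2 * BDP inflight
```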
- 10. BBR: model-based walk toward the (max BW, min RTT) optimal operating point [slide footer: Confidential + Proprietary]
- 11. STARTUP: exponential BW search
- 12. DRAIN: drain the queue created during STARTUP
- 13. PROBE_BW: explore max BW, drain queue, cruise
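PROBE_BW's probe/drain/cruise behavior can be sketched as a pacing-gain cycle; the eight-phase cycle [1.25, 0.75, 1, 1, 1, 1, 1, 1] matches the published BBR v1 design, while the function names below are illustrative:

```python
import itertools

# PROBE_BW pacing-gain cycle: probe up at 1.25x for roughly one min_rtt,
# drain at 0.75x, then cruise at 1.0x for six phases.
PROBE_BW_GAINS = [1.25, 0.75, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

def probe_bw_rates(max_bw, cycles=1):
    """Yield the pacing rate for each phase of the gain cycle."""
    n = cycles * len(PROBE_BW_GAINS)
    for gain in itertools.islice(itertools.cycle(PROBE_BW_GAINS), n):
        yield gain * max_bw

rates = list(probe_bw_rates(max_bw=12_500_000))
# probe phase paces above max BW, drain phase below, cruise at max BW
```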
- 14. PROBE_RTT: drains the queue to refresh min_RTT. Minimize packets in flight for max(0.2 s, 1 round trip) after actively sending for 10 s. Key for fairness among multiple BBR flows.
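The PROBE_RTT timing rules on this slide (a max(0.2 s, 1 round trip) dwell, entered after 10 s of active sending) can be sketched as two small helpers; the function names are illustrative:

```python
def probe_rtt_duration(min_rtt):
    """How long PROBE_RTT holds inflight at a minimum:
    max(200 ms, 1 round trip), per this slide."""
    return max(0.2, min_rtt)

def due_for_probe_rtt(now, last_min_rtt_update, interval=10.0):
    """Enter PROBE_RTT when the min_RTT estimate has not been refreshed
    within `interval` seconds of active sending (10 s per the slide)."""
    return now - last_min_rtt_update >= interval

# A 40 ms-RTT flow dwells for the 200 ms floor; a 500 ms-RTT flow
# dwells for one full round trip.
```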
- 15. Performance results
- 16. BBR and CUBIC: start-up behavior [figure: data sent or ACKed (MBytes) and RTT (ms) over time; CUBIC (red), BBR (green), ACKs (blue); STARTUP, DRAIN, and PROBE_BW phases marked]
- 17. BBR multi-flow convergence dynamics (bw = 100 Mbit/sec, path RTT = 10 ms). Flows converge via sync'd PROBE_RTT plus randomized cycling phases in PROBE_BW:
  - Queue (RTT) reduction is observed by every (active) flow
  - Elephants yield more (multiplicative decrease) to let mice grow: each flow learns its fair share
- 18. Fully use bandwidth, despite high loss. BBR vs CUBIC: synthetic bulk TCP test with 1 flow, bottleneck_bw = 100 Mbps, RTT = 100 ms
- 19. Low queue delay, despite bloated buffers. BBR vs CUBIC: synthetic bulk TCP test with 8 flows, bottleneck_bw = 128 kbps, RTT = 40 ms
- 20. Active benchmarking tests on Google WAN
  - BBR used for the vast majority of TCP on Google B4
  - Active probes across metros
    - 8 MB RPC every 30 s over warmed connections
    - On the lowest QoS (BE1), BBR is 2-20x faster than CUBIC
    - BBR tput is often limited by the default maximum RWIN (8 MB)
  - WIP: benchmarking the RPC latency impact of all apps using B4 with a higher max RWIN
- 21. Deep dives & implementation
- 22. Top priority: reducing queue usage
  - Current active work for BBR
  - Motivation:
    - Further reduce delay and packet loss
    - Better fairness w/ loss-based CC in shallow buffers
    - Better fairness w/ higher-RTT BBR flows
    - Lower tail latency for cross-traffic
  - Mechanisms:
    - Drain the queue more often
      - Drain inflight down to BDP each gain cycle
    - Estimate the available buffer; modulate probing magnitude/frequency
      - In shallow buffers, BBR bw probing makes loss-based CC back off
- 23. Sharing deep buffers with loss-based CC. At first CUBIC/Reno gains an advantage by filling deep buffers, but BBR does not collapse; it adapts: BBR's bw and RTT probing tends to drive the system toward fairness. Deep-buffer data point (8*BDP case): bw = 10 Mbps, RTT = 40 ms, buffer = 8 * BDP -> CUBIC: 6.31 Mbps vs BBR: 3.26 Mbps
- 24. Current dynamics with loss-based CC. CUBIC vs BBR goodput: bw = 10 Mbps, RTT = 40 ms, 4 min. bulk xfer, varying buffer sizes
- 25. BBR multi-flow behavior: RTT fairness. Compare the goodput of two competing BBR flows with short (A) and long (B) min_RTT: flow A (min_RTT = 10 ms, start t = 0 sec) vs flow B (varying min_RTTs, start t = 2 sec); bw = 10 Mbit/sec, buffer = 1000 packets. BBR flows w/ higher RTT have an advantage, but a BBR flow with 64x higher min_RTT gets only <4x higher bw
- 26. Common real-world issues
  - ACK compression
    - One TCP ACK for up to 200+ packets
    - Particularly on wireless & cable networks
    - BBR strategy: cap inflight <= 2*BDP
  - Application idles
    - Pace at est. BW when restarting from idle
  - Inappropriate receive window
    - Linux default 3 MB => 240 Mbps on a 100 ms RTT
  - Token-bucket traffic policers
    - Explicitly model policers
    - Details presented in maprg
- 27. Implementation and deployment status
  - Linux v4.9 TCP
    - A congestion control module with dual GPL/BSD license
    - Requires the fq/pacing qdisc (BBR needs pacing support)
    - Employed for the vast majority of traffic on Google's WAN
    - Being deployed on Google.com and YouTube
  - QUIC implementation under way
    - Production experiments have started
    - {vasilvv,ianswett,jri}@google.com
  - FreeBSD implementation under way
    - rrs@netflix.com
- 28. BBR FAQ
  - Is BBR fair to Cubic/Reno? Buffer >= 1.5*BDP: yes; else: WIP
  - Is BBR 1/sqrt(p)? No
  - Is BBR {delay, loss, ECN, AIMD}-based? No; it is congestion-based
  - Is BBR ack-clocked? No
  - Does BBR require pacing? Yes
  - Does BBR require an FQ scheduler? No, but it helps
  - Does BBR require receiver or network changes? No
  - Does BBR improve latency on short flows? Yes
- 29. Conclusion. BBR: