
Conclusion

In this paper, we show that existing load-balancing schemes do not work for RDMA traffic because of the lack of sufficiently large flowlet gaps and RDMA’s performance degradation in the face of out-of-order packets. To tackle RDMA’s intolerance for out-of-order packets, we first design an in-network packet reordering scheme that resolves out-of-order packets before delivering them to an RDMA receiver. We then present ConWeave, a load balancer design that performs fine-grained load-balancing of RDMA flows such that the out-of-order packets could be reordered by the in-network reordering mechanism. Through software simulations and hardware testbed evaluations, we show that ConWeave consistently outperforms existing designs. By also highlighting the need for developing load balancing algorithms specifically designed for RDMA traffic, we believe that ConWeave opens up a new chapter on load balancing for RDMA in datacenter networks.


To summarize:

Limitations of Traditional Methods

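Packet-level load balancing

  • Causes ==severe packet reordering==, which is fatal in an RDMA network because RDMA is highly sensitive to out-of-order packets
  • Spreading load per packet gives finer-grained load balancing, but the reordering it causes limits its use in RDMA networks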
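
To make the reordering problem concrete, here is a minimal sketch that sprays packets round-robin over paths with different one-way delays and reports the order in which they arrive. The function names, the delay values, and the fixed send interval are all assumptions chosen for illustration; this is a toy model, not the simulation setup used in the paper.

```python
def spray_arrival_order(num_packets, path_delays, send_interval=1.0):
    """Return sequence numbers in the order they reach the receiver.

    Packet i is sent at time i * send_interval on path i % len(path_delays)
    (round-robin spraying) and arrives after that path's fixed delay.
    """
    arrivals = []
    for seq in range(num_packets):
        path = seq % len(path_delays)
        arrive_at = seq * send_interval + path_delays[path]
        arrivals.append((arrive_at, seq))
    arrivals.sort()  # order by arrival time, ties broken by sequence number
    return [seq for _, seq in arrivals]


def count_out_of_order(order):
    """Count packets that arrive after a higher sequence number was seen."""
    highest = -1
    late = 0
    for seq in order:
        if seq < highest:
            late += 1
        else:
            highest = seq
    return late


order = spray_arrival_order(6, path_delays=[1.0, 5.0])
print(order)                      # [0, 2, 4, 1, 3, 5]
print(count_out_of_order(order))  # 2
```

With one fast and one slow path, every packet sent on the slow path is overtaken by later packets on the fast path, which is exactly the pattern an RDMA receiver handles badly.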

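Flowlet load balancing

  • Avoids packet reordering, but the granularity is coarse, so it cannot fully exploit multiple paths
  • A flow can switch paths only after a sufficiently long inter-packet gap, and RDMA traffic lacks such gaps, so load balancing becomes less timely and less efficient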
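
The gap rule can be sketched as follows. `FlowletSelector`, `gap_threshold`, and the round-robin choice of the next path are hypothetical names and simplifications; real flowlet schemes typically pick the next path based on congestion or at random. The point is only to show why a flow with no large gaps stays pinned to one path.

```python
class FlowletSelector:
    """Toy flowlet switching: change path only after a long enough idle gap."""

    def __init__(self, num_paths, gap_threshold):
        self.num_paths = num_paths
        self.gap_threshold = gap_threshold
        self.last_seen = {}  # flow_id -> timestamp of the previous packet
        self.path = {}       # flow_id -> currently assigned path index

    def select(self, flow_id, now):
        last = self.last_seen.get(flow_id)
        if last is None:
            # First packet of the flow: start on path 0 (arbitrary choice).
            self.path[flow_id] = 0
        elif now - last > self.gap_threshold:
            # Gap is large enough that earlier packets have drained, so
            # moving to another path cannot reorder them. Round-robin here
            # is a placeholder for a load-aware choice.
            self.path[flow_id] = (self.path[flow_id] + 1) % self.num_paths
        self.last_seen[flow_id] = now
        return self.path[flow_id]


dense = FlowletSelector(num_paths=2, gap_threshold=5.0)
print([dense.select("flow", t) for t in [0, 1, 2, 3]])     # [0, 0, 0, 0]

bursty = FlowletSelector(num_paths=2, gap_threshold=5.0)
print([bursty.select("flow", t) for t in [0, 1, 10, 11]])  # [0, 0, 1, 1]
```

The dense sequence stands in for a steadily paced RDMA flow: it never produces a gap above the threshold, so it never gets a chance to move off its first path.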

Innovation of ConWeave

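Core features

  • Combines the fine-grained load balancing of packet-level schemes with the in-order delivery of flowlet schemes
  • Reorders packets inside the network, which makes more aggressive load-balancing policies possible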

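Main advantages

  • Allows load to be spread at packet granularity while in-network reordering preserves the packet order that RDMA requires[1]
  • Can switch paths without waiting for a long inter-packet gap, which makes load balancing faster to react and more efficient[1]
  • Fits the characteristics of RDMA networks better, achieving better load balancing without sacrificing performance[1]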
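
The idea of restoring order before delivery can be illustrated with a generic sequence-number reorder buffer. This is only a sketch of the general technique under simplifying assumptions (per-packet sequence numbers, unbounded buffer, no loss or timeout handling); `ReorderBuffer` is a hypothetical name, and ConWeave's actual switch data-plane mechanism is not reproduced here.

```python
class ReorderBuffer:
    """Hold out-of-order packets and release them strictly in sequence."""

    def __init__(self, first_seq=0):
        self.next_seq = first_seq  # next sequence number the receiver expects
        self.held = {}             # seq -> payload, packets that arrived early

    def receive(self, seq, payload):
        """Accept one packet; return the (seq, payload) pairs now deliverable."""
        self.held[seq] = payload
        released = []
        # Drain every packet that is now contiguous with what was delivered.
        while self.next_seq in self.held:
            released.append((self.next_seq, self.held.pop(self.next_seq)))
            self.next_seq += 1
        return released


buf = ReorderBuffer()
delivered = []
for seq in [0, 2, 4, 1, 3, 5]:  # an out-of-order arrival pattern
    delivered.extend(s for s, _ in buf.receive(seq, payload=f"pkt{seq}"))
print(delivered)  # [0, 1, 2, 3, 4, 5]
```

The receiver behind such a buffer only ever sees in-order packets, which is what lets the load balancer reroute without waiting for a flowlet gap. A real design also has to bound the buffer and handle lost packets, which this sketch ignores.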

My Opinion

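ConWeave does not actually introduce anything special at the level of "granularity". Its contribution lies in how packets are carried: it adds machinery at the start and end of the network path so that packets are put back in order before delivery. We do not care what order the packets are in while they cross the network topology; we only need the receiver to get them in order.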