A Practical Traffic Management System for Integrated LTE-WiFi Networks

Abstract

Mobile operators are leveraging WiFi to relieve the pressure that the surging bandwidth demand of applications places on their networks. However, operators often lack intelligent mechanisms to control the way users access their WiFi networks. This lack of sophisticated control leads to poor network utilization, which in turn degrades the quality of experience (QoE). To meet user traffic demands, operators need solutions that optimally balance user traffic across cellular and WiFi networks. Motivated by the lack of practical solutions in this space, we design and implement ATOM, an end-to-end system for adaptive traffic offloading in WiFi-LTE deployments. ATOM has two novel components: (i) a network interface selection algorithm that maps user traffic across WiFi and LTE to optimize user QoE and (ii) an interface switching service that seamlessly redirects ongoing user sessions in a cost-effective and standards-compatible manner. Our evaluations on a real LTE-WiFi testbed using YouTube traffic reveal that ATOM reduces video stalls by a factor of 3-4 compared to naive solutions.

移动运营商正利用WiFi来缓解应用程序带宽需求激增对其网络造成的压力。然而,运营商通常缺乏智能机制来控制用户访问其WiFi网络的方式。这种精密控制的缺乏导致了网络利用率低下,进而降低了用户体验质量(QoE)。为满足用户流量需求,运营商显然需要能够优化蜂窝网络和WiFi网络间用户流量分配的解决方案。鉴于该领域缺乏实用性解决方案,我们设计并实现了一款名为ATOM的端到端系统,用于WiFi-LTE部署中的自适应流量分流。ATOM包含两个创新组件:(i)一种网络接口选择算法,该算法将用户流量映射至WiFi和LTE,以优化用户体验质量;以及(ii)一种接口切换服务,该服务以经济高效且符合标准的方式无缝重定向正在进行的用户会话。我们在真实的LTE-WiFi测试平台上使用YouTube流量进行的评估表明,与简单解决方案相比,ATOM能够将视频卡顿现象减少3-4倍。

TL;DR

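Core problem: traffic is split poorly between WiFi and LTE. ATOM addresses this with two components:

  • Network interface selection algorithm: assigns user traffic to WiFi or LTE so as to maximize user quality of experience (QoE)
  • Interface switching service: moves ongoing user sessions between WiFi and LTE seamlessly, at low cost, and in a standards-compatible way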

INTRODUCTION

Cellular networks are facing an unprecedented increase in data traffic due to the popularity of bandwidth-intensive mobile services. Although operators are continuously upgrading their networks to cope with such increase, the growth in network capacity is still considerably behind the bandwidth demand [1]. Hence, most operators around the world (e.g., China Mobile, AT&T) are aggressively deploying WLANs for additional capacity since WiFi is cheap and easy to deploy at scale [2, 3, 4]. However, sustaining good QoE in such heterogeneous deployments requires a much more sophisticated solution than simply deploying unmanaged WLANs. For next-generation mobile networks, a solution that carefully manages the network interface (e.g., WiFi vs. LTE) of user flows forms a critical component of network optimization. Although such solutions exist today, they suffer from the following limitations:

蜂窝网络正面临着前所未有的数据流量增长,这主要归因于带宽密集型移动服务的普及。尽管运营商在持续升级其网络以应对这种增长,但网络容量的增速仍远落后于带宽需求[1]。因此,全球大多数运营商(例如,中国移动、AT&T)正积极部署WLAN以获取额外容量,因为WiFi成本低廉且易于大规模部署[2, 3, 4]。然而,在此类异构部署中维持良好的用户体验质量(QoE)需要比简单部署非托管WLAN复杂得多的解决方案。对于下一代移动网络而言,一个能够精心管理用户流的网络接口(例如,WiFi与LTE)的解决方案构成了网络优化的关键组成部分。尽管目前存在此类解决方案,但它们存在以下局限性:

Drawbacks of Current Solutions: (i) Naive, static and coarse-grained policies: Operators rely on connection managers on user devices that are generally configured to select WiFi as the default interface when available [5]. Since WiFi APs are usually deployed in hot-spot areas to begin with, one can expect a large number of users to receive a strong signal from WiFi APs during peak periods. Hence such naive policies do not translate to higher user throughput, since the load of the WiFi AP is not accounted for in interface selection. In addition, most operators do not have the capability to switch the interface of a flow seamlessly (i.e., without breaking) across WiFi and cellular; the interface selection is thus decided only when initiating the connection. Hence, the selection is not adaptive to the dynamic conditions of wireless networks. Finally, the same level of throughput translates to different levels of QoE for a user depending on the application. Hence, loading all the application flows [6, 7] of a user on to the same interface does not translate to improved QoE for all the flows as the capacity of that interface has to be shared by multiple such flows from other users as well. (ii) Lack of practical solutions: While some studies [7, 8, 9, 10] have focused on interface selection, they only solve a part of the problem by simply providing algorithms for interface assignment assuming that a framework for seamless switching exists. In addition to the theoretical complexity of the problem, designing such a framework alone has several practical constraints and challenges that it must account for to deliver a readily deployable solution. While there exist some systems efforts that schedule user data across WiFi and cellular interfaces [11, 12], they are limited to delay-tolerant traffic and cannot support real-time applications such as video.

当前解决方案的缺陷:

(i) 简单、静态和粗粒度的策略:运营商依赖 用户设备上的连接管理器 ,这些管理器通 常被配置为在WiFi可用时将其选为默认接口 [5]。由于WiFi AP(接入点)最初通常部署在热点区域,因此可以预见在高峰时段大量用户会接收到来自WiFi AP的强信号。因此,这类简单策略并不能转化为更高的用户吞吐量,因为在接口选择中并未考虑WiFi AP的负载情况。此外,大多数运营商不具备在WiFi和蜂窝网络之间无缝(即不中断连接)切换流接口的能力;因此, 接口选择仅在发起连接时决定 。故而,这种选择 无法适应无线网络的动态条件 。最后,相同的吞吐量水平对于不同应用的用户可能转化为不同水平的用户体验质量。因此,将用户的所有应用流[6, 7]加载到同一接口上并不能为所有流带来改善的用户体验质量,因为该接口的容量还必须由来自其他用户的多个此类流共享。

(ii) 缺乏实用性解决方案:虽然一些研究[7, 8, 9, 10]专注于接口选择,但它们仅仅通过提供接口分配算法来解决部分问题,并假设存在无缝切换的框架。除了该问题的理论复杂性之外,仅设计这样一个框架就需要考虑若干实际约束和挑战,才能交付一个可直接部署的解决方案。虽然存在一些在WiFi和蜂窝接口间调度用户数据的系统性尝试[11, 12],但它们仅限于时延容忍型流量,无法支持如视频等实时应用。
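
The first drawback (WiFi-first selection ignores AP load) can be illustrated with a small simulation. This is a minimal sketch under loudly stated assumptions: each interface has a fixed aggregate capacity that its flows share equally, and the capacities (40 and 30 Mbps), the `Interface` class, and both policy functions are hypothetical names invented for illustration. It is not the interface selection algorithm of the paper.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    name: str
    capacity_mbps: float  # hypothetical aggregate capacity, shared equally
    active_flows: int = 0

    def share_if_joined(self) -> float:
        """Per-flow throughput if one more flow joined this interface."""
        return self.capacity_mbps / (self.active_flows + 1)


def wifi_first(wifi: Interface, lte: Interface) -> Interface:
    """Naive policy: always pick WiFi when it is available."""
    return wifi


def load_aware(wifi: Interface, lte: Interface) -> Interface:
    """Pick the interface that would give the new flow the larger share."""
    return max((wifi, lte), key=lambda iface: iface.share_if_joined())


def simulate(policy, n_flows: int, wifi_capacity=40.0, lte_capacity=30.0):
    """Admit n_flows one by one; return (wifi flows, lte flows, worst share)."""
    wifi = Interface("wifi", wifi_capacity)
    lte = Interface("lte", lte_capacity)
    for _ in range(n_flows):
        policy(wifi, lte).active_flows += 1
    shares = [i.capacity_mbps / i.active_flows
              for i in (wifi, lte) if i.active_flows]
    return wifi.active_flows, lte.active_flows, min(shares)


naive = simulate(wifi_first, 14)
aware = simulate(load_aware, 14)
print(naive)  # all 14 flows crowd onto WiFi
print(aware)  # flows are spread over both interfaces
```

Under these assumed numbers, the WiFi-first policy leaves LTE idle while every flow gets 40/14 Mbps, whereas the load-aware policy ends with 8 flows on WiFi and 6 on LTE at 5 Mbps each. The sketch still decides only at flow arrival, so it does not address the second half of the drawback (no adaptation during an ongoing session).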

Challenges: (i) Practicality: The framework must be lightweight, scalable and designed as an overlay solution over current LTE networks without requiring additional standards support. (ii) Adaptiveness: To sustain high QoE, the system must dynamically choose interfaces in order to adapt to flow arrivals, departures and changing wireless link conditions. (iii) Seamlessness: In the event of an interface (and thus IP address) change during an ongoing user session, the framework should not break the existing connections and should seamlessly migrate user flows between WiFi and LTE. (iv) Business interests: Seamless flow migration currently requires that all WiFi traffic gets backhauled to the LTE core network for proper IP anchoring, thereby significantly increasing the operational costs. It is therefore challenging to provide a dynamic solution, given the lack of incentive for operators to invest heavily in user QoE for OTT (over-the-top) traffic. The key challenge, then, is not just to design a scalable, dynamic and seamless traffic management solution, but also to build an end-to-end system that can be easily deployed in any operator's core network (i.e., is operator-agnostic) without requiring tight data plane integration between WiFi and LTE.

挑战:

(i) 实用性:该框架必须是轻量级、可扩展的,并设计为当前LTE网络之上的覆盖层解决方案,无需额外的标准支持

(ii) 适应性:为维持高用户体验质量,系统必须动态选择接口,以适应流的到达、离开以及变化的无线链路条件

(iii) 无缝性:在进行中的用户会话期间发生接口(并因此导致IP地址)变更的情况下,该框架不应中断现有连接,并应能无缝地在WiFi和LTE之间迁移用户流

(iv) 商业利益:无缝流迁移目前要求所有WiFi流量回传至LTE核心网以实现正确的IP锚定,从而显著增加运营成本。鉴于运营商缺乏为OTT(Over-The-Top,互联网增值业务)流量的用户体验质量进行大量投资的动力,提供动态解决方案具有挑战性。(下文会做出解释)

因此,关键挑战不仅在于设计一个可扩展、动态且无缝的流量管理解决方案,还在于构建一个能够轻松部署在任何运营商核心网络中(即运营商无关性)且无需WiFi与LTE之间紧密数据平面集成的端到端系统。

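Backhauling all WiFi traffic to the LTE core network

What does "Seamless flow migration currently requires that all WiFi traffic gets backhauled to the LTE core network for proper IP anchoring, thereby significantly increasing the operational costs" mean?

What does "all WiFi traffic gets backhauled to the LTE core network" mean, and why would an operator do it?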

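(1) Scenario: a user starts watching a video over LTE and then switches seamlessly to WiFi (with WiFi traffic backhauled to the LTE core network)

  • Ordinary home WiFi: after leaving the WiFi AP, packets usually go straight through the home broadband modem (fiber ONT or DSL modem) -> the Internet service provider (ISP) -> the Internet -> the video server
  • WiFi traffic backhauled to the LTE core network: after leaving the WiFi AP, packets do not go straight to the public Internet. Instead, they are carried over a dedicated backhaul link to the mobile operator's LTE core network, specifically to the PGW (PDN Gateway), the network element that acts as the IP anchor. Only from the LTE core network does the traffic reach the Internet and then the video server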

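Initial state: the user watches www.cdn.video.com over LTE

  • User equipment (UE): opens the video app and requests a video from www.cdn.video.com
  • DNS resolution: the UE resolves the domain www.cdn.video.com to the IP address of a CDN server (for example, CDN_IP)
  • Data request (uplink traffic): UE -> LTE base station (eNodeB) -> LTE core network (SGW -> PGW) -> Internet -> CDN_IP (video server)
    • At this point the UE holds an IP address assigned by the PGW
  • Data response (downlink video traffic): CDN_IP (video server) -> Internet -> LTE core network (PGW -> SGW) -> LTE base station (eNodeB) -> UE
    • The PGW is the IP anchor: all traffic to and from the UE passes through it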

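Switch: the user enters WiFi coverage and the system decides to move to WiFi

  • The UE detects an available WiFi network and connects to the WiFi access point (AP)
  • To make the switch seamless (the IP address does not change), the WiFi network is configured to send the UE's traffic over the backhaul link to the same PGW that was already serving the UE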

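State after the switch: the user keeps watching www.cdn.video.com over WiFi (traffic backhauled to the LTE core network)

  • User equipment (UE): keeps requesting the next video segments from www.cdn.video.com
  • Data request (uplink traffic)
    • Over the WiFi backhaul: UE -> WiFi AP -> backhaul link -> LTE core network (arriving at the same PGW) -> Internet -> CDN_IP (video server)
    • The UE still uses the same IP address that the PGW assigned earlier. The WiFi AP encapsulates or routes the UE's traffic to the PGW
  • Data response (downlink video traffic)
    • Over the WiFi backhaul: CDN_IP (video server) -> Internet -> LTE core network (PGW) -> backhaul link -> WiFi AP -> UE
    • The PGW receives the video data from the Internet. Because it knows that the UE is now reachable over the WiFi path, it forwards the data over the backhaul link to the corresponding WiFi AP, which delivers it to the UE over the air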
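
The walk-through above can be condensed into a toy model of IP anchoring. This is a sketch under assumptions, not a description of any real gateway: the `PGW` class, its methods, and the `10.0.0.x` address pool are hypothetical, and the model ignores tunnelling, authentication, and everything else a real core network does. It only shows the one property the notes rely on: the UE's address and the server-facing path stay the same while the access leg changes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PGW:
    """Toy IP anchor: owns UE addresses and tracks each UE's access path."""
    next_host: int = 2
    access_of: Dict[str, str] = field(default_factory=dict)  # ue_ip -> access

    def attach(self, access: str) -> str:
        """Assign an address from a hypothetical pool and record the access."""
        ue_ip = f"10.0.0.{self.next_host}"
        self.next_host += 1
        self.access_of[ue_ip] = access
        return ue_ip

    def handover(self, ue_ip: str, access: str) -> None:
        """Only the access leg changes; the anchored address is kept."""
        self.access_of[ue_ip] = access

    def downlink_path(self, ue_ip: str) -> List[str]:
        """Hops from the video server to the UE for the current access."""
        access_hops = {
            "lte": ["SGW", "eNodeB"],
            "wifi": ["backhaul link", "WiFi AP"],
        }
        return ["CDN_IP", "Internet", "PGW", *access_hops[self.access_of[ue_ip]], "UE"]


pgw = PGW()
ue_ip = pgw.attach("lte")
before = pgw.downlink_path(ue_ip)
pgw.handover(ue_ip, "wifi")
after = pgw.downlink_path(ue_ip)
print(ue_ip, before, after, sep="\n")
```

Both paths share the prefix `CDN_IP -> Internet -> PGW`, which is why the video server never notices the switch. It is also why every WiFi byte has to cross the operator's core network in this design.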

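(2) So when does the device actually use the WiFi network's own LAN IP address to reach the network?

The question rests on a misconception.

WiFi deployments fall into two categories:

  1. Ordinary WiFi: an LTE-to-WiFi switch does take place. Within a few seconds of the switch, the device uses the WiFi LAN IP address and the WiFi network's direct Internet egress. Because the IP address changes, connections that were opened over LTE are generally not carried over
  2. WiFi tightly integrated with LTE: this is the "WiFi traffic backhauled to the LTE core network" mechanism described above. Under this mechanism, even while the device is connected to WiFi, its Internet traffic (or at least the traffic of the sessions being preserved) still leaves through the IP address anchored in the LTE core network, not through the WiFi router's direct egress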

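(3) Typical prerequisites for the backhaul mechanism

Backhauling WiFi traffic to the LTE core network for IP anchoring is more common when:

  1. WiFi and LTE are provided by the same mobile operator
  2. The WiFi provider has a close partnership and integration with the mobile operator

Operators lack the incentive to invest heavily in user QoE for OTT (over-the-top) traffic

  1. OTT traffic: applications and services such as WeChat, Douyin, YouTube, and Netflix that are offered by third parties and carried over the operator's network. The operator mainly provides the pipe, and the revenue of these applications does not go directly to the operator

  2. Lack of incentive: operators earn most of their revenue from subscription plans and data charges. A good OTT experience may encourage users to consume more data, but operators may be reluctant to spend large sums reworking their networks (for example, paying the backhaul cost described above) to improve someone else's applications, because the direct return is unclear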

To address these challenges, we design ATOM - a system that adaptively maps user flows to the appropriate network interface to improve user QoE. ATOM has two key components: (i) A fine-grained traffic management solution that uses a practical algorithm for interface selection to maximize the network-wide utility. (ii) A switching service that seamlessly changes the interface for certain user flows without the need for data plane integration, thereby reducing backhaul costs for the operators. We observe that certain characteristics of HTTP video streaming and browsing can be exploited to enable seamless re-direction of such flows, via HTTP proxies, to avoid backhauling these traffic types from WiFi to the LTE network. However, ATOM’s formulation is not limited to HTTP and also supports non-HTTP flows (although such flows would not benefit from the backhaul reduction). Nevertheless, we believe that ATOM offers important backhaul cost savings since most of the traffic in mobile networks is video streaming over HTTP [13].

为应对这些挑战,我们设计了 ATOM —— 一个自适应地将用户流映射到合适网络接口以改善用户体验质量的系统。ATOM包含两个关键组件:(i)一个细粒度的流量管理解决方案,该方案使用一种实用的 接口选择算法来最大化网络范围内的效用 。(ii) 一种切换服务,可 为特定用户流无缝更改接口 ,而无需数据平面集成,从而降低运营商的回程成本。我们观察到,可以利用HTTP视频流和浏览的某些特性,通过HTTP代理实现此类流的无缝重定向,以避免将这些流量类型从WiFi回传到LTE网络。然而,ATOM的构想不限于HTTP,也支持非HTTP流(尽管此类流无法从回程减少中受益)。尽管如此,我们相信ATOM能提供重要的回程成本节省,因为移动网络中的大部分流量是基于HTTP的视频流[13]。
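
The excerpt does not spell out which characteristics of HTTP video streaming make proxy-based redirection possible. One plausible candidate, assumed here purely for illustration, is that segmented HTTP video is fetched as a series of independent requests, so consecutive segments need not share a connection or even a source IP address. The sketch below simulates that idea without any real networking: `play_session`, `switch_at`, and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for an HTTP GET issued through a proxy reachable over the chosen interface.

```python
from typing import Callable, List, Tuple

# (segment index, interface used, payload)
Segment = Tuple[int, str, bytes]


def play_session(
    n_segments: int,
    choose_interface: Callable[[int], str],
    fetch_segment: Callable[[int, str], bytes],
) -> List[Segment]:
    """Fetch segments in order, picking an interface per request."""
    received: List[Segment] = []
    for index in range(n_segments):
        interface = choose_interface(index)
        received.append((index, interface, fetch_segment(index, interface)))
    return received


def fake_fetch(index: int, interface: str) -> bytes:
    """Stand-in for one HTTP GET of a segment over `interface` (no real I/O)."""
    return f"segment-{index}".encode()


def switch_at(boundary: int) -> Callable[[int], str]:
    """Use LTE before `boundary`, WiFi from `boundary` onwards."""
    return lambda index: "lte" if index < boundary else "wifi"


session = play_session(6, switch_at(3), fake_fetch)
for index, interface, payload in session:
    print(index, interface, payload)
```

In this model the player receives segments 0 to 5 in order even though the interface changes after segment 2, and no WiFi request needs the LTE-anchored address. A flow that lives on a single long-lived connection has no such request boundary, which is consistent with the statement above that non-HTTP flows do not benefit from the backhaul reduction.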

We have prototyped and evaluated ATOM on a heterogeneous LTE-WiFi testbed using real Web traffic. Our evaluations show that ATOM effectively reduces the number of video buffering periods for a user from an average of 8 to 2 per minute. We also evaluated the seamless interface switching functionality of ATOM with several Web video services. To the best of our knowledge, this is the first detailed design and implementation of a practical system that manages user traffic across LTE and WiFi networks. A noteworthy aspect of ATOM is that it is operator-agnostic and standards-compatible, and can hence be readily deployed by any operator looking to manage its LTE and WiFi networks efficiently. Our contributions are multi-fold: (i) We establish the hardness of the interface assignment problem and propose a greedy algorithm with performance guarantees under certain conditions. Our algorithm is scalable and practical to implement. (ii) We design and build an end-to-end dynamic traffic management system that seamlessly switches the interface for user flows and (iii) we conduct extensive evaluations using both prototype experiments and large-scale simulations.

我们已在一个异构LTE-WiFi测试平台上使用真实的Web流量对ATOM进行了原型设计和评估。我们的评估表明,ATOM有效地将用户的视频缓冲时长从平均每分钟8次减少到2次。我们还针对ATOM的无缝接口切换功能,使用了多个Web视频服务进行了评估。据我们所知,这是首个对跨LTE和WiFi网络管理用户流量的实用系统进行的详细设计和实现。ATOM的一个显著特点是其运营商无关性和标准兼容性,因此任何希望有效管理其LTE和WiFi网络的运营商都可以轻松部署。

我们的贡献是多方面的:(i)我们确定了接口分配问题的复杂性,并提出了一种在特定条件下具有性能保证的贪婪算法。我们的算法具有可扩展性且易于实际部署。(ii) 我们设计并构建了一个端到端的动态流量管理系统,能够为用户流无缝切换接口。(iii) 我们通过原型实验和大规模仿真进行了广泛的评估。
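
To give a feel for what a greedy, utility-driven interface assignment might look like, here is a minimal sketch. It is not the paper's algorithm, and it carries no performance guarantee: the weighted logarithmic utility, the equal-share capacity model, the capacities, and the names `total_utility` and `greedy_assign` are all assumptions made for illustration. The per-flow `weight` is a crude stand-in for the observation that the same throughput yields different QoE for different applications.

```python
import math
from typing import Dict, List


def total_utility(
    assignment: Dict[str, str],
    capacity: Dict[str, float],
    weight: Dict[str, float],
) -> float:
    """Sum of weight * log(per-flow share), with capacity shared equally."""
    load = {iface: 0 for iface in capacity}
    for iface in assignment.values():
        load[iface] += 1
    return sum(
        weight[flow] * math.log(capacity[iface] / load[iface])
        for flow, iface in assignment.items()
    )


def greedy_assign(
    flows: List[str],
    capacity: Dict[str, float],
    weight: Dict[str, float],
) -> Dict[str, str]:
    """Place flows one by one on the interface that maximizes total utility."""
    assignment: Dict[str, str] = {}
    for flow in flows:
        best = max(
            capacity,
            key=lambda iface: total_utility(
                {**assignment, flow: iface}, capacity, weight
            ),
        )
        assignment[flow] = best
    return assignment


capacity = {"wifi": 40.0, "lte": 30.0}       # hypothetical Mbps values
flows = ["f1", "f2", "f3", "f4"]
weight = {flow: 1.0 for flow in flows}       # equal weights for simplicity
result = greedy_assign(flows, capacity, weight)
print(result)
```

With these assumed inputs the flows alternate between WiFi and LTE, ending with two flows on each interface. Each greedy step accounts for the throughput that existing flows lose when a new flow joins their interface, which is the network-wide view that a per-device WiFi-first rule lacks.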