NETWORK INTERFACE ASSIGNMENT¶

TL;DR

主题: 如何给每个应用分配对应的LTE/Wi-Fi接口

这一部分原论文讲的实在是太“又臭又长”...

我来开个缩略版 :))

建模¶

Network Interface Assignment - NIA 组件

目的: 帮每个App的数据流找到最佳的网络路径（LTE还是某个WiFi），让整体网络体验（QoS）最好，并且确保不同的App能得到差异化的服务（比如视频优先）

(1) 网络模型 (Network Model)

区域作战: 它不是一下子看整个网络，而是聚焦在一个LTE基站所覆盖的区域。这个区域里可能有好几个WiFi热点
宏观调控: NIA的控制粒度不是那种"per packet/second"的（这是LTE基站和WiFi AP自己的事），它看得更长远一点，比如接下来几秒钟 (T) 内，大的数据流应该走哪条路，以保证这段时间内的平均速度（吞吐量）最优
目标函数: 它有个“目标公式”
- 人话: “请你找到一种分配方案 (xij，即把哪个App的流量i分配给哪个网络j)，能让所有App的‘满意度’总和 (U(tij)，即流量在网络上的效用或体验)达到最大”
- 限制条件: 每个App的流量最终只能走一条路（∑xij=1），不能脚踏两只船
虚拟技术: 就算你的手机只连了一个WiFi，NIA也能借助很多WiFi网卡的“分身术”（虚拟化技术），让不同的App感觉像是走了不同的虚拟WiFi通道

(2) 吞吐量和公平性模型 (Throughput and Fairness Models)

LTE的“按劳分配+优先照顾”: 优先级越高, 分配得到的带宽越多
传统WiFi的“排队平均分”: 排队轮询

(3) 效用函数的选择 (Choice of Utility Function)

边际效用递减: 直观理解, 当网速从很慢提升一点点时，你的“幸福感”会飙升；但当网速已经很快时，再提升一点点，你增加的“幸福感”就没那么明显了
权重调控: 通过给不同的App或用户设置不同的“重要性系数” (\(w_i\))，来调整它们对总“满意度”的贡献

(4) 定价模型 (Pricing Model)

一言以蔽之: \(“净满意度” = “用着爽的程度” - “花费的成本”\)

算法设计¶

鉴于网络接口分配问题的NP-Hard特性，网络接口分配（NIA）模块采用了一种实用且高效的贪婪算法来确定用户流的最佳网络接口。该算法主要分以下步骤执行：

算法流程

(1) 初始化与输入:

算法接收当前活跃用户流数量 (N)、LTE小区内WiFi AP数量 (B)、仅能由LTE覆盖的用户流子集 (S0) 以及可由WiFi AP覆盖的用户流子集 (SL) 作为输入。初始时，所有仅能由LTE覆盖的流 (S0) 被分配至LTE网络 (构成初始集合 A0)。

(2) 外层迭代循环 (针对WiFi AP):

算法迭代处理每个候选WiFi AP（存在于待处理AP集合 AB 中）

在每次迭代中，针对一个特定的WiFi AP，算法会探索和评估将相关用户流分配给此WiFi AP或LTE网络的各种组合，目标是最大化由此WiFi AP和LTE协同产生的总效用。当确定了针对当前WiFi AP的最优用户流分配方案后，该分配方案即被最终确定。

随后，此WiFi AP从待处理集合 AB 中移除，已分配接口的用户流亦从其他可能覆盖这些流的WiFi AP的待分配列表 (Lj) 中移除，以避免重复分配。

分配至LTE的流则更新至集合 A0。此过程重复进行，直至所有WiFi AP均完成用户流的分配。

(3) 内层迭代循环（针对特定LTE-WiFi AP对）

在为特定LTE基站与单个WiFi AP进行流量分配决策时，算法首先将所有候选用户流（即当前WiFi AP j 可覆盖的未分配流 Lj 以及已在 A0 中的流）暂定分配给LTE网络。

随后，算法以最大化边际效用增量为原则, 逐个考察是否将用户流从LTE网络迁移至当前WiFi AP。

若迁移某个流能够带来整体效用的正向增长，则执行该迁移。此贪婪迁移过程持续进行，直到没有用户流的迁移能够进一步提升总效用为止。

此时，针对该LTE-WiFi AP对的分配方案及其对应的总效用被记录。

Tip

逐个锁定最优Wi-Fi AP：外层循环一轮一轮地进行，每一轮都选出一个当前能带来最大整体效益的Wi-Fi AP ，并确定分配给它的用户流
局部优化 (Wi-Fi AP vs. LTE)：内层循环针对当前考虑的某个Wi-Fi AP，通过不断地从LTE“吸引”能提升总效用的用户流到这个Wi-Fi AP上，来进行局部优化
贪婪选择：在吸引用户流时，总是选择那个能使“满意度”增益最大的流
避免冲突和更新状态：一旦一个Wi-Fi AP的分配方案确定，它和它承载的流就不再参与后续的分配决策，并且会更新其他Wi-Fi AP的候选流列表以及LTE的承载流列表

性能保证

尽管由于问题的内在复杂性，该贪婪算法无法在所有理论上的最坏情况下提供最优解保证，但大量的仿真评估结果表明，其在平均情况下的性能表现是令人信服且有效的。该算法的时间复杂度为 \(O(B^2N)\)，其中 B 代表WiFi AP的数量，N 代表用户流的数量，说明它在实际部署中具有良好的可行性。

This is the component that manages all user ﬂows that belong to a given LTE cell. Speciﬁcally, it takes as input the signal strength of every user to its potential set of WiFi APs and the LTE basestation, relative QoS priority (or weights) and the current network interface of each user ﬂow. It then computes the appropriate network interface (i.e., a speciﬁc WiFi AP or the LTE basestation) for each user ﬂow. In this section, we formulate the network assignment as a utility optimization problem with a per-ﬂow utility function to ensure differentiated QoS across applications.

Network Model: ATOM operates at the level of a LTE cell where one or more WiFi APs are deployed within the coverage of that cell as in Figure 3. ATOM also handles scenarios where the coverage of several WiFi APs overlap, resulting in certain users having the option to connect to multiple WiFi APs. Hence, NIA computes the specific WiFi AP or LTE basestation that is used by each user flow. Since NIA operates at coarse time-scales (\(T\) is in orders of seconds), it leaves the fine-grained packet scheduling function to be performed by the LTE basestation and the WiFi APs locally. To allow this decoupling, the throughput is modeled as the average throughput of the client over the time \(T\) based on the scheduling policy. The problem can be formulated as:

\(x^* = \text{arg max} \sum_{j=0}^{B} \sum_{i} x_{ij} U(t_{ij})\) (1)

subject to \(\sum_{j=0}^{B} x_{ij} = 1, \forall i\)

where \(B\) is the total number of WiFi APs within the coverage of the LTE basestation (represented by \(j=0\)). The indicator variable \(x_{ij} = \{x_{i,j}, \forall i,j\}\) denotes the association vector for user flows (i.e. \(x_{ij} = 1\) if flow \(i\) is assigned AP \(j\)). \(u_{ij}\) is the average throughput estimated for flow \(i\) when associated with the AP \(j\). The constraint ensures that exactly one WiFi AP or LTE basestation is chosen for a user flow. Different flows of a user are allowed to pick potentially different WiFi APs. In practice, this can be realized using the virtualization capability found in most WiFi cards to create virtual WiFi networks that can run on a single WiFi physical interface [16]. The challenge in solving the above optimization lies in the utility (and throughput) function that couples the decisions of user flows assigned to the same interface.

Throughput and Fairness Models: LTE and WiFi have different MAC protocols with potentially distinct fairness (bandwidth sharing) policies that directly affects the throughput of the user flows.

LTE eNodeBs typically employ proportional fair scheduling. They also schedule the resources to the user flows in proportion to a weight that defines the relative priorities of the flows. In this case, the throughput of a user can be shown to depend on the total number of the other users and their relative weights as follows.

\(t_i = \frac{w_i r_{ij}}{\sum_{k \in N_j, k \neq i} w_k + w_i}, \forall i \in N_j\) (2)

where \(w_i\) is the weight for user flow \(i\); \(r_{ij}\) is the average link-layer rate (or the PHY rate) of user \(i\) on AP \(j\) (the eNodeB in this case) depending on the average signal-to-noise ratio (SNR) of the user on that AP and \(N_j\) is the total number of active users on AP \(j\).

On the other hand, WLANs, when operated distributively, typically use a throughput-based fairness model. Here, all the users served by the same AP get the same throughput at steady state. This is because the APs implement a round-robin scheduling scheme for the downlink packets. In this case, the average downlink throughput of a WiFi user can be expressed as

\(t_{ij} = \frac{L}{N_j}, \forall i \in N_j\) (3)

where \(L\) is the average size of a packet in bits.

However, when the operator controls both the LTE and WiFi networks, then it is possible to instrument a uniform fairness policy (say proportional fairness) across both these networks. In this case, the throughput of WiFi users would follow a throughput model similar to that for LTE. Also, we assume that interference between neighboring LTE cells and WiFi APs is taken care of through their respective interference management algorithms (frequency reuse in LTE and channel selection in WiFi), so as to not affect the throughput models.

Choice of Utility Function: While the design of NIA would work with concave utility functions in general, we incorporate the logarithmic function as the utility function for all user flows. Although applications might have diverse requirements for QoS and users will want to provide differentiated services to different application flows, NIA employs the log utility function generally to all application flows since: (a) it ensures a simple system design (b) recent advancements in end-to-end adaptation by applications (for instance adaptive video streaming [17]) allows most application traffic as elastic. Such a function ensures that the marginal utility of a flow decreases as the throughput increases. (c) log functions are extensively used as the utility function for resource management in wireless networks [18, 19]. Hence, NIA defines the utility function for every user flow as the product of the weight of the flow and the log of the average throughput obtained by the flow. Operators can set the weights of the flows accordingly to differentiate among applications and/or users.

\(U(t_{ij}) = w_i \times \log(t_{ij})\) (4)

Pricing Model: The notion of pricing the different interfaces based on their consumption can be easily incorporated in our utility framework. The utility of the interface assignment for a user flow \(i\) can be updated as \((U(t_{ij}) - p_j t_{ij})\) where \(E_{ij}\) is the associated cost for flow \(i\) using the interface \(j\) and is defined based on the pricing model of the operator: (i) Pricing per byte: \(E_{ij}\) can be made to capture consumption in the current epoch as \(E_{ij} = C_j t_{ij}\), [8] where \(C_j\) is the cost per unit data of the flow. Since the actual flow throughput in an epoch depends on multiple factors, the cost is typically based on the weights [8], which influences how throughput is shared. (ii) Tiered Data-caps: On the other hand, \(E_{ij}\) can capture data usage till the previous epoch as \(E_{ij} = C_j(D_{ij}^u / D_j^u)\). \(D_{ij}^u\) would now be the cost per unit KB of data (given by dividing the data cap of the user by the monthly cost of the plan), \(D_{ij}\) is the total data usage till the previous epoch for user \(u\) on network \(j\), and \(D_j^u\) is the total number of flows of user \(u\), thereby splitting the cost of a user equally across all its flows. Hence, the associated cost of an interface \(E_{ij}\) is higher for the flows of the users with higher data usage in the past on that interface. Instead of a linear function, one could also consider other functions of data usage. Note that the pricing is mainly used to serve as a deterrent in picking an interface. By appearing as a constant in a given epoch, it does not directly influence the per-epoch optimization problem.

4.1 Problem Hardness¶

Considering only the simplest topology with one LTE eNodeB and one WiFi AP, the complexity for solving the problem grows exponentially with the number of user flows. Intuitively, the problem is hard because the correct choice of a network interface for a given user flow depends on the exact composition of other user flows assigned to the APs. Specifically for WiFi, the throughput of a user flow depends on the PHY rates of the other users attached to the AP (throughput fairness) and in the case of LTE, the throughput of a user flow depends on the weights of the other users attached to the eNodeB (see Equations (2) and (3)). The proof that problem 1 is NP-Hard for a network of an LTE basestation and a WiFi AP is deferred to the Appendix. The complexity of the problem further increases when considering multiple WiFi APs within an LTE cell, especially the case where some of the APs may have overlapping coverage. Note that Problem 1 is NP-Hard even for the case where the WiFi APs employ Proportional Fair scheduling. However, for a certain case when the user weights are unity and both LTE and WiFi perform proportional fairness scheduling, the problem is optimally solvable (proof similar to the load balancing problem in [20]). But this case is not applicable to ATOM, since ATOM is designed to provide differentiated QoS to applications and users.

4.2 Algorithm¶

NIA employs a practical yet efficient greedy algorithm. The algorithm executes in two steps as shown in Algorithm 1. It takes as input the number of active user flows \(N\), the number of WiFi APs \(B\) within the coverage of the LTE cell, the subset of active user flows \(S_L\) that are within the coverage of the WiFi APs and the subset \(S_0\) that includes flows which are not in the coverage of any WiFi AP. Note that the sets \(S_L\) may not be independent since some users may be covered by more than one WiFi AP. Initially all active user flows that are not within the coverage of a WiFi AP are assigned to the LTE eNodeB (i.e. \(A_{0,j=0} = S_0\)). \(A_B\) represents the set of all the WiFi APs whose users have not been assigned an interface yet and \(L_j\) represents the set of user flows that belong to a WiFi AP's coverage but have not been assigned an interface yet (i.e. \(L_j \subseteq S_L\)). The final solution is given by the subsets \(A_0\) and \(A_j\) that consist of the flows that are assigned to the LTE cell and WiFi AP \(j\) respectively.

In the outer loop at every step the algorithm considers each WiFi AP \(\in A_B\) in isolation. It then tries all combination of user flows between the LTE cell and a particular WiFi AP. It then finalizes the interface assignment for all the user flows of that WiFi AP, which yields the highest utility among all the WiFi APs that are part of the set \(A_B\) (Step 18). Having fixed the interface assignment for user flows of a WiFi AP in a single round, the initial condition is reset with this assignment. Specifically, the WiFi AP for which the interface assignment is finalized is removed from the set \(A_B\) (Step 19). The user flows that are assigned to an interface are removed from the set \(L_j\) of the other WiFi APs (Step 20) that also cover these flows so that they are not considered in the following steps. These user flows assigned to the LTE basestation are added to the set \(A_0\) (Step 21). The steps are repeated for each of the remaining WiFi APs until the user flows of all WiFi APs have been assigned an interface. As discussed above, since the assignment of user flows to a LTE basestation and a single WiFi AP is also computationally complex, NIA employs a greedy algorithm to compute the assignment in the inner loop (steps 8-17).

The inner loop performs the assignment of user ﬂows for each pair of WiFi AP (whose ﬂows have not been assigned an interface yet) and the LTE basestation. Initially, no user ﬂows are assigned to the WiFi AP (Step 9). The assignment for the LTE basestation (Step 10) is initialized with the user ﬂows that are already assigned to LTE (A 0 ) and the unassigned user ﬂows that are within the coverage of the WiFi AP j (L j ). Starting with these initial assignments, NIA moves user ﬂows one by one from the LTE basestation to the WiFi AP such that the incremental utility is maximized. For each user ﬂow, the incremental utility is the difference in utility with the current assignment (i.e. LTE) and the interface assignment with the user ﬂow moved to the WiFi AP (Step 12). NIA stops moving ﬂows from the basestation to the WiFi AP when none of the remaining ﬂows result in a positive increase in the marginal utility. After this step, NIA commits the utility for the particular WiFi AP as shown in step 16.

Performance Guarantee: Given the complexity of the problem in the general case, it is hard to claim a worst-case guarantee for our algorithm. However, extensive evaluations show that average-case performance is convincing. The algorithm runs in O(B 2 N) where B is the number of WiFi APs and N is the number of ﬂows.