GTP-C SIGNALING
In this section, we capture the dynamics of data roaming, focusing on the GTP tunnels that the IPX-P manages between roaming partners to enable data communications for users.
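TL;DR
- GTP-C protocol: sets up and tears down GTP tunnels for user data across the IPX-P platform.
- GTP-C dialogues fall into two main types: tunnel setup and tunnel teardown, i.e., create/delete PDP context requests.
- A PDP (Packet Data Protocol) context, managed through GTP-C, holds:
    - the user's IP address
    - QoS (Quality of Service) parameters
    - the Access Point Name (APN)
    - tunnel information for data transfer (e.g., the GTP tunnel)
    - network-side session state (at the GGSN/PGW)
- Common PDP operations:
    - Create PDP Context Request
        - opens a new data connection, establishing the data channel between the user and the network
    - Delete PDP Context Request
        - releases an established PDP context, on the device's or the network's initiative, and ends the data session
    - Update PDP Context Request
        - modifies parameters of an existing session, such as the QoS, or moves it to a new network access point
- Silent roaming:
    - Cause: high data roaming charges
    - Traffic pattern: similar to IoT devices, with signaling traffic only and very little or no data traffic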
Data Roaming Dataset
The GTP-C protocol is used for setting up and tearing down GTP tunnels for user data across the IPX-P platform. The data roaming dataset (see Table 1) we collect includes information for a subset of the devices that we previously captured in the signaling dataset (Section 4). For the observation period of July 2020, we capture the GTP-C data records from over 3.3M devices operating worldwide, in over 170 (visited) countries. The majority of these devices use SIM cards from operators in Spain (≈2.3 million devices) or in Brazil (≈600k devices). Given that the devices from Spain (all belonging to the same IPX-P customer, an IoT service provider) represent approximately 70% of all the devices in this dataset, we focus our analysis on them to characterize the operations of the IPX-P offering the data roaming service.
Figure 10a shows the breakdown of this set of devices per visited country. All of these devices are IoT devices, serving different verticals. Their main areas of operation are the UK (40%), Mexico (16%), Peru (11%) and Germany (8%). Activity concentrates in Europe and the Americas (where the IPX-P has important trans-oceanic infrastructure), which is consistent with the observations from Section 4.
Figures 10b and 10c show, respectively, the number of active devices and the total number of GTP-C dialogues they trigger per hour in the top five visited countries. We notice a daily pattern for both metrics. Also, during the weekend, the number of active devices and the overall data roaming activity decrease (the grey area in the time-series plots).
The two main types of GTP-C dialogues correspond to the procedures to set up and tear down tunnels, namely create/delete PDP context requests. Figure 11 captures the success rate and the error rate of these GTP-C dialogues. The distribution of dialogues over the type of request (create/delete PDP context) is roughly symmetrical, with a slightly higher ratio of create PDP context requests (Figure 11a). Interestingly, we notice that many of the devices from the Spanish operator request data roaming connections at the same time, putting a high load on the platform. This synchronicity comes from the fact that they are IoT devices whose behavior is pre-determined by the IoT vertical providers (e.g., they might be smart energy meters that energy companies deploy). This poses an important challenge to the IPX-P, since the platform is not dimensioned for peak demand. The result is a decreased success rate (in Figure 11a, the success rate drops below 90% every day at midnight) and an overall larger number of create PDP context requests. By contrast, the delete PDP context requests have a close to maximum success rate.
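What is a PDP context request?
PDP stands for Packet Data Protocol; a PDP context request is a signaling exchange.
A PDP context is the state information of a data session established between the user equipment (UE) and the core network.
It contains:
- the user's IP address
- QoS (Quality of Service) parameters
- the Access Point Name (APN)
- tunnel information for data transfer (e.g., the GTP tunnel)
- network-side session state (at the GGSN/PGW)
Types of PDP context requests
PDP context requests are typically controlled by the GTP-C (GPRS Tunneling Protocol - Control Plane) protocol.
They include the following operations:
- Create PDP Context Request
    - opens a new data connection, establishing the data channel between the user and the network
- Delete PDP Context Request
    - releases an established PDP context, on the device's or the network's initiative, and ends the data session
- Update PDP Context Request
    - modifies parameters of an existing session, such as the QoS, or moves it to a new network access point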
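The fields and operations listed above can be sketched as a small in-memory session table. This is only an illustration of the bookkeeping, not how a GGSN/PGW is implemented: the class names, the field names, and the choice to key contexts by a single subscriber identifier are all assumptions made for the example (real networks allow several contexts per subscriber).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PDPContext:
    """State of one data session; field names are illustrative."""
    subscriber_id: str   # hypothetical key, e.g. an IMSI-like string
    ip_address: str
    apn: str
    qos_profile: str
    tunnel_id: int       # stands in for the GTP tunnel information


class ContextTable:
    """Toy stand-in for the session state kept on the network side."""

    def __init__(self) -> None:
        self._contexts: Dict[str, PDPContext] = {}
        self._next_tunnel_id = 1

    def create(self, subscriber_id: str, ip_address: str,
               apn: str, qos_profile: str) -> PDPContext:
        # Create request: allocate a tunnel and store the new context.
        ctx = PDPContext(subscriber_id, ip_address, apn, qos_profile,
                         self._next_tunnel_id)
        self._next_tunnel_id += 1
        self._contexts[subscriber_id] = ctx
        return ctx

    def update(self, subscriber_id: str, qos_profile: str) -> bool:
        # Update request: change a parameter of an existing session.
        ctx: Optional[PDPContext] = self._contexts.get(subscriber_id)
        if ctx is None:
            return False
        ctx.qos_profile = qos_profile
        return True

    def delete(self, subscriber_id: str) -> bool:
        # Delete request: release the context; False if none existed.
        return self._contexts.pop(subscriber_id, None) is not None

    def __len__(self) -> int:
        return len(self._contexts)


table = ContextTable()
ctx = table.create("sub-001", "10.0.0.1", "iot.example", "best-effort")
table.update("sub-001", "low-latency")
table.delete("sub-001")
```

Each stored context occupies memory until its delete request arrives, and each create or delete is one signaling dialogue.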
We further investigate the different errors that the unsuccessful dialogues include (Figure 11b). The Signaling Timeout error has the lowest rate (affecting 1 in 1,000 GTP-C requests), showing that it is rare for a Create PDP Context request to remain unanswered and time out. Once a data communication is successfully established, it may be terminated because of a lack of data transfer, generating a Data Timeout error. This error does not imply that there is something technically wrong with the data communication. We see this occur for approximately 1 in 100 data communications. Interestingly, we note a clear increase of this type of error during the weekends (corresponding to the grey areas in the time series). The Delete PDP Context request may return an "Error Indication" result when the operation is unsuccessful. This affects 1 in 10 such requests and shows a clear daily pattern. Finally, the Context Rejection error follows the same pattern as the Create PDP Context time series, confirming that the IPX-P cannot cope with the synchronized behavior of groups of IoT devices.
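As a minimal sketch of how such rates could be derived from dialogue records, the snippet below computes the success rate per hour and request type, and the share of each outcome within a request type. The record layout `(hour, request type, outcome)`, the outcome labels, and the sample values are assumptions made for illustration; they are not the dataset's schema or its measurements.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# (hour of day, request type, outcome); this record layout is hypothetical.
Dialogue = Tuple[int, str, str]


def hourly_success_rate(dialogues: List[Dialogue]) -> Dict[Tuple[int, str], float]:
    """Fraction of dialogues with outcome "ok" per (hour, request type)."""
    totals: Counter = Counter()
    successes: Counter = Counter()
    for hour, request_type, outcome in dialogues:
        totals[(hour, request_type)] += 1
        if outcome == "ok":
            successes[(hour, request_type)] += 1
    return {key: successes[key] / totals[key] for key in totals}


def outcome_shares(dialogues: List[Dialogue]) -> Dict[str, Dict[str, float]]:
    """Share of each outcome within each request type."""
    per_type: Dict[str, Counter] = defaultdict(Counter)
    for _, request_type, outcome in dialogues:
        per_type[request_type][outcome] += 1
    return {
        request_type: {o: n / sum(counts.values()) for o, n in counts.items()}
        for request_type, counts in per_type.items()
    }


# Made-up sample: one hour with some rejected creates and one failed delete.
sample: List[Dialogue] = (
    [(0, "create", "ok")] * 8
    + [(0, "create", "context_rejection")] * 2
    + [(0, "delete", "ok")] * 9
    + [(0, "delete", "error_indication")] * 1
)
print(hourly_success_rate(sample))
print(outcome_shares(sample))
```

Grouping by hour is what would expose a recurring dip such as the midnight drop, while the per-type shares correspond to statements of the form "1 in 10 such requests".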
Takeaway: Signaling traffic for data communications over the IPX-P shows daily and weekly patterns. Synchronized PDP context requests from devices with similar behavior (e.g., IoT devices such as smart energy meters) put heavy stress on the IPX-P platform, resulting in a high Context Rejection rate (≈10% of requests are rejected).
GTP-C Performance
Leveraging the data roaming dataset (Table 1) from December 2019, we now characterize the performance of the GTP tunnel management of the IPX-P. Specifically, in Figure 12a we investigate the tunnel setup delay (the time between a PDP Create request and its reply) and the total GTP tunnel duration (the time between a PDP Create and the corresponding PDP Delete event), as both correlate strongly with the load on the IPX-P, HMNO and VMNO systems. We use the data from December 2019 to avoid the impact of the travel restrictions imposed to tackle the COVID-19 pandemic.
The tunnel setup delay is an indicator of the amount of processing involved at the different network elements for Create PDP messages (as well as of the general processing load). In Figure 12a (green line), we notice that the average setup delay (≈150ms) depends on the total number of devices requesting a data connection at a given moment. This value is consistent with the setup delay values we capture in July 2020. We also note that, in 80% of cases, we measure a tunnel setup delay below 1 second.
A decrease in the average tunnel duration increases the total number of tunnels, and thus also the volume of signaling messages and the processing they require. Conversely, longer tunnel durations increase the overall memory footprint that the involved nodes need to store the PDP contexts. When verifying the total tunnel duration, we note that the median duration of the GTP tunnel is approximately 30 minutes (Figure 12a, red line). Private conversations with operational teams confirmed to us that these values indicate healthy systems, i.e., the processing and storage load at the IPX-P, VMNO and HMNO elements is within normal operational conditions.
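The two metrics can be computed from per-tunnel timestamps as sketched below. This is a minimal example under stated assumptions: the `TunnelRecord` layout is hypothetical, the tunnel duration is measured from the Create request (the text does not say whether the request or the reply marks the start), and the sample timestamps are made up.

```python
from dataclasses import dataclass
from statistics import mean, median
from typing import Dict, List


@dataclass
class TunnelRecord:
    """Timestamps in seconds for one tunnel; the layout is hypothetical."""
    create_request: float
    create_response: float
    delete_request: float


def setup_delay(record: TunnelRecord) -> float:
    # Time between the PDP Create request and its reply.
    return record.create_response - record.create_request


def tunnel_duration(record: TunnelRecord) -> float:
    # Time between the PDP Create and the corresponding PDP Delete.
    return record.delete_request - record.create_request


def summarize(records: List[TunnelRecord], threshold_s: float = 1.0) -> Dict[str, float]:
    delays = [setup_delay(r) for r in records]
    durations = [tunnel_duration(r) for r in records]
    return {
        "mean_setup_delay_s": mean(delays),
        "share_below_threshold": sum(d < threshold_s for d in delays) / len(delays),
        "median_duration_s": median(durations),
    }


# Made-up sample of three tunnels.
records = [
    TunnelRecord(0.0, 0.5, 1800.0),
    TunnelRecord(10.0, 10.25, 1210.0),
    TunnelRecord(20.0, 21.5, 3620.0),
]
print(summarize(records))
```

The sketch reports the mean for the setup delay and the median for the duration because those are the statistics quoted for each metric in the text.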
One factor that likely influences both of these metrics is the device type, e.g., a phone or an IoT Operating System (OS). For instance, the OS implementation decides when the device should establish a mobile data connection, how long the connection is held, and which mobile technology takes preference. Since this ecosystem is extremely varied, we are interested here in the aggregated impact on the IPX-P and MNO systems. A granular analysis of the impact of the device type on these metrics is outside the scope of this analysis.
Takeaway: The load on the platform, in terms of the number of tunnels and PDP Create/Delete requests, impacts how quickly tunnels are brought up for the new data communications that customers request. The IPX-P maintains a healthy system, with similar values for both analyzed datasets.
Silent Roamers
Despite the dynamic global movement of mobile subscribers, not all of them are active in terms of data communications. Data communications while roaming have often generated bill shock for mobile subscribers or kept roamers silent (i.e., they do not trigger data communications over cellular networks). Thus, when traveling to a foreign country, mobile subscribers often turn off the data communication capabilities of their devices to avoid high bills. Even if this may no longer be the case for roamers inside Europe [9], we find that the majority of roamers within Latin America are still silent.
By contrasting the mobility of users from the signaling dataset (regardless of the radio access technology) with the activity we register in the data roaming dataset (active GTP tunnels, see Section 4), we are able to quantify the number of silent roamers. For the first two weeks of December 2019, we capture the signaling activity of ≈2 million subscribers roaming between the Latin American countries where the IPX-P has a significant volume of subscribers (Brazil, Argentina, Colombia, Costa Rica, Ecuador, Peru and Uruguay). Out of these, we find that only ≈400,000 mobile devices use data services while traveling abroad within Latin America. For these, we observe in Figure 12b that the total traffic volume per session (uplink or downlink) is no more than 100KB, on average, per device.
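With the approximate figures quoted above (≈400,000 data-active devices out of ≈2 million roamers), about 20% of roamers use data and roughly 80% are silent. The comparison behind that number amounts to a set difference between the devices seen in signaling and those with at least one GTP tunnel. The sketch below illustrates it; the function name, the identifiers, and the sample lists are hypothetical.

```python
from typing import Iterable, Set, Tuple


def silent_roamers(signaling_ids: Iterable[str],
                   data_active_ids: Iterable[str]) -> Tuple[Set[str], float]:
    """Return the roamers seen in signaling but without any data tunnel,
    together with their share of all roamers seen in signaling."""
    seen = set(signaling_ids)
    # Only count data activity for devices that also appear in signaling.
    active = seen & set(data_active_ids)
    silent = seen - active
    share = len(silent) / len(seen) if seen else 0.0
    return silent, share


# Made-up identifiers: five roamers in signaling, one of them uses data.
signaling_ids = ["dev-1", "dev-2", "dev-3", "dev-4", "dev-5"]
data_active_ids = ["dev-2", "dev-9"]  # dev-9 never appears in signaling
silent, share = silent_roamers(signaling_ids, data_active_ids)
print(sorted(silent), share)
```

Intersecting with the signaling set first keeps devices that appear only in the data roaming records from distorting the share.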
Moreover, when focusing on inbound roamers in Latin America (regardless of the home country), we also capture ≈2.5 million IoT devices provisioned by one of the IPX-P's M2M customers. This customer provisioned the IoT devices operating in Latin America with connectivity from a Spanish MNO. We compare the amount of traffic each roamer within Latin America generates with the amount of traffic from IoT devices (Figure 12b). We find that, although "things" generate very little traffic, mobile subscribers within Latin America behave very similarly (though they tend to transfer slightly larger data volumes than IoT devices). We conjecture that this is the result of the lack of regulation on roaming within the region, as well as of the socio-economic landscape, which keeps the cost of roaming data communications prohibitive.
Takeaway: Silent roamers are still a phenomenon we observe, especially in Latin America, where roaming charges are high. Their traffic patterns are similar to those of IoT devices: they generate signaling traffic, but very little or no data traffic.