Analyzing Mobile Network Measurement Data

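I have recently been working on a data analysis task based on network measurement data that mobile devices report through crowdsourced measurement apps. The goal is to examine network coverage and network quality in different areas and to compare the networks of different operators, which provides the information needed for business decisions.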

Data Preprocessing

The first step is to preprocess the raw data: extract the useful fields and clean them. I use Spark for this; it is my favorite big data processing platform.

I use the raw data for June 10 as an example. It consists of 3 files containing 1,521,601 records in total.

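from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import DoubleType, IntegerType

# Reuses the existing session when running inside the pyspark shell or a notebook
spark = SparkSession.builder.appName('test').getOrCreate()
df1 = spark.read.csv('nir_0610_1.csv',header=True)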
df2 = spark.read.csv('nir_0610_2.csv',header=True)
df3 = spark.read.csv('nir_0610_3.csv',header=True)
df = df1.union(df2).union(df3)
df.count()

Let's take one record and look at the fields it contains:

df.first()
Row(NIRId='3d4ae4f9bdc2df2a3561a45c912eaa2b69a822920e2f2b34c356802e9deb7787', Hashed_GUID='d0c5a18cee3c80a3ebfe193c127e1171e01269e188a640d2449b4d8a98b1e70f', SDK_Version='20191223173000', TestTimestamp='2020-06-09 21:33:45.820 +0800', ScreenState='On', LocationTimestamp='2020-06-09 21:33:45.821 +0800', LocationProvider='Gps', LocationLatitude='44.838', LocationLongitude='-0.5970703', LocationAltitude='0.0', LocationTransX='-66465.55917418549', LocationTransY='5596054.05133136', LocationAccuracyVertical='0.02082', LocationAccuracyHorizontal='0.02082', LocationAge='0', LocationSpeed='0.0206', LocationCountry='France', LocationCity=None, RadioCdmaBaseStationId=None, RadioCdmaBaseStationLatitude=None, RadioCdmaBaseStationLongitude=None, RadioCdmaNetworkId=None, RadioCdmaSystemId=None, RadioConnectionType='WiFi', RadioFlightMode='Disabled', RadioGsmCellId='20391', RadioGsmCellIdAge='-1', RadioGsmLAC='39395', RadioMCC='460', RadioMNC='00', RadioBrand='China Mobile', RadioCountry='China', RadioNetworkType='EDGE', RadioNetworkGeneration='2G', RadioHRCellId='20391', RadioRNCId=None, RadioeNodeBId=None, RadioOperatorName='CHINA MOBILE', RadioServiceState='Unknown', RadioRXLevel='2147483647', RadioRXLevelAge='24003269', RadioRSCP='2147483647', RadioARFCN='-1', RadioEcN0='2147483647', RadioPrimaryScramblingCode='0', RadioLteCqi='2147483647', RadioLteRsrp='2147483647', RadioLteRsrq='2147483647', RadioLteRssnr='2147483647', RadioLteRssi='2147483647', RadioNrCsiRsrp='2147483647', RadioNrCsiRsrq='2147483647', RadioNrCsiSinr='2147483647', RadioNrSsRsrp='2147483647', RadioNrSsRsrq='2147483647', RadioNrSsSinr='2147483647', RadioNrState='Unknown', RadioNrAvailable='Unknown', RadioIsRoaming='false', RadioMobileDataEnabled='Disabled', RadioMobileDataConnectionState='Disconnected', RadioMissingPermission='false', RadioCarrierAggregation='Unknown', WifiState='Unknown', WifiRxLev='0', WifiFrequency='0', WifiKeyManagement='Unknown', WifiPairwiseCipher='Unknown', WifiAuthAlgorithm='Unknown', 
WifiGroupCipher='Unknown', WifiProtocol='Unknown', WifiLinkSpeedValue=None, WifiLinkSpeedUnit=None, WifiSupplicantState='Unknown', WifiDetailedState='Unknown', WifiMissingPermission='true', SimState='Ready', SIM_Country='China', SIM_Brand='China Mobile', SimOperator='46000', SimOperatorName='China Mobile GSM', DeviceManufacturer='nubia', DeviceName='NX629J', DeviceTAC=None, OS='Android', OSVersion='5.1.1', RXLevFaulty='false', MobileAnalysisValid='false')

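Among these fields, the ones such as RxLevel, Rsrq, Timestamp, and Operator carry the useful information, so we extract them together with the location, cell ID, and network type: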

df = df.select(
    # H:m:s is unpadded, which gives values such as "2020-06-09 9:7:44"
    F.date_format(df.LocationTimestamp, 'yyyy-MM-dd H:m:s').alias('Timestamp'),
    df.LocationLatitude.cast(DoubleType()).alias('Latitude'), 
    df.LocationLongitude.cast(DoubleType()).alias('Longitude'),
    df.RadioGsmCellId.alias('CellId'),
    df.RadioNetworkType.alias('NetworkType'),
    df.RadioRXLevel.cast(IntegerType()).alias('RXLevel'),
    df.RadioLteRssi.cast(IntegerType()).alias('Rssi'),
    df.RadioLteRsrp.cast(IntegerType()).alias('Rsrp'),
    df.RadioLteRsrq.cast(IntegerType()).alias('Rsrq'),
    df.SimOperatorName.alias('Operator')
)

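Next, filter out records whose measurements are missing, either null or set to the placeholder value 2147483647. This leaves 807,359 records: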

df = df.dropna().filter('RXLevel<2147483647 and Rssi<2147483647 and Rsrp<2147483647 and Rsrq<2147483647')
df.count()
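
The value 2147483647 is the largest 32-bit signed integer (2^31 - 1). Judging from the sample record, the SDK appears to use it as a placeholder for "not available", although that is an inference rather than something the data documents. The rule can be sketched in plain Python as follows. The function name `is_valid_record` is made up for illustration, and the sketch checks only the four measurement fields, whereas `dropna()` above drops a row when any column is null.

```python
# Placeholder seen in the sample record for unavailable readings.
# It equals 2**31 - 1, the largest 32-bit signed integer.
SENTINEL = 2147483647

def is_valid_record(record, fields=("RXLevel", "Rssi", "Rsrp", "Rsrq")):
    """Return True only if every field is present and below the sentinel."""
    for name in fields:
        value = record.get(name)
        if value is None or value >= SENTINEL:
            return False
    return True

# Made-up records for illustration only.
records = [
    {"RXLevel": -85, "Rssi": 31, "Rsrp": -85, "Rsrq": -10},
    {"RXLevel": SENTINEL, "Rssi": 31, "Rsrp": -85, "Rsrq": -10},
    {"RXLevel": -90, "Rssi": None, "Rsrp": -95, "Rsrq": -12},
]
valid = [r for r in records if is_valid_record(r)]
```

Only the first record survives: the second carries the placeholder and the third has a missing value.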

Select the records that fall within the Guangzhou area:

df.createOrReplaceTempView("nir")
df_guangzhou = spark.sql("select * from nir where Latitude>23.0309 and Latitude<23.1768 and Longitude>113.1754 and Longitude<113.4535")
df_guangzhou.count()

Check which values the Operator field takes, and map them to CM (China Mobile), CT (China Telecom), and CU (China Unicom):

df_guangzhou.select('Operator').distinct().collect()
df_guangzhou = df_guangzhou.replace(['中国移动','China Mobile','CMCC'],'CM')
df_guangzhou = df_guangzhou.replace(['China Unicom','CHN-UNICOM','中国联通'],'CU')
df_guangzhou = df_guangzhou.replace(['中国电信', '中國電信'],'CT')
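
The `replace` calls above match whole values only, so a name such as 'China Mobile GSM' (the SimOperatorName in the sample record) would be left unchanged. One possible alternative is a keyword-based mapping, sketched here in plain Python. The keyword lists and the function name `normalize_operator` are illustrative assumptions, and 'china telecom' in particular does not appear in the original code.

```python
# Keyword lists are illustrative assumptions; extend them with whatever
# distinct() returns for your own data.
OPERATOR_KEYWORDS = {
    "CM": ["中国移动", "china mobile", "cmcc"],
    "CU": ["中国联通", "china unicom", "chn-unicom"],
    "CT": ["中国电信", "中國電信", "china telecom"],
}

def normalize_operator(name):
    """Map a raw operator name to CM, CU or CT; return None if unrecognised."""
    if name is None:
        return None
    lowered = name.strip().lower()
    for code, keywords in OPERATOR_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return code
    return None
```

A function like this could be wrapped in a Spark UDF, at the cost of being slower than the built-in `replace`.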

Because there may be several measurements for the same location in the same time period, I keep the minimum value of each measurement:

df_guangzhou = df_guangzhou.groupBy(['Timestamp','Latitude','Longitude','CellId','Operator','NetworkType']).agg({'RXLevel':'MIN','Rssi':'MIN','Rsrp':'MIN','Rsrq':'MIN'})
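
To make the aggregation concrete, here is a small plain-Python sketch of the same idea on made-up rows. The helper name `min_per_group` is hypothetical. Note that the minimum is taken per column, so one output row can combine values from different input records.

```python
from collections import defaultdict

def min_per_group(rows, key_fields, value_fields):
    """Group rows (dicts) by key_fields and keep the minimum of each value field."""
    groups = defaultdict(dict)
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        best = groups[key]
        for f in value_fields:
            if f not in best or row[f] < best[f]:
                best[f] = row[f]
    return dict(groups)

# Made-up rows: the first two share the same timestamp and cell.
rows = [
    {"Timestamp": "2020-06-09 9:7:44", "CellId": "123", "RXLevel": -85, "Rsrq": -10},
    {"Timestamp": "2020-06-09 9:7:44", "CellId": "123", "RXLevel": -92, "Rsrq": -8},
    {"Timestamp": "2020-06-09 9:7:45", "CellId": "123", "RXLevel": -80, "Rsrq": -12},
]
result = min_per_group(rows, ["Timestamp", "CellId"], ["RXLevel", "Rsrq"])
```

For the shared key, the result holds RXLevel -92 from the second row and Rsrq -10 from the first.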

For the processed data, let's look at the summary statistics of each field:

df_guangzhou.summary().show()
+-------+------------------+-------------------+-------------------+-------------------+--------+-----------+------------------+------------------+-------------------+------------------+
|summary|         Timestamp|           Latitude|          Longitude|             CellId|Operator|NetworkType|         min(Rssi)|      min(RXLevel)|          min(Rsrq)|         min(Rsrp)|
+-------+------------------+-------------------+-------------------+-------------------+--------+-----------+------------------+------------------+-------------------+------------------+
|  count|             97403|              97403|              97403|              97403|   97403|      97403|             97403|             97403|              97403|             97403|
|   mean|              null| 23.116824006426913| 113.29375649205879|1.363676632601357E8|    null|       null|  41.4338778066384|-84.24788764206441|-11.413570423908915|-84.24788764206441|
| stddev|              null|0.02389730303418188|0.06606026937315002| 3.92988181856844E7|    null|       null|15.474359788580816| 8.132805186632478| 2.4022781810749088| 8.132805186632478|
|    min|2020-06-07 23:10:0|          23.030993|          113.17541|                 -1|      CM|       GPRS|               -87|              -123|                -20|              -123|
|    25%|              null|          23.106022|          113.24244|       1.23044609E8|    null|       null|                29|               -89|                -14|               -89|
|    50%|              null|          23.106195|         113.259834|       1.23577394E8|    null|       null|                31|               -85|                -10|               -85|
|    75%|              null|          23.128687|          113.33528|        1.2357888E8|    null|       null|                56|               -85|                -10|               -85|
|    max| 2020-06-09 9:7:44|          23.176792|         113.453476|           94997826|      CU|    Unknown|                99|               -46|                 -3|               -46|
+-------+------------------+-------------------+-------------------+-------------------+--------+-----------+------------------+------------------+-------------------+------------------+

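The processed data can be written out as a CSV file. Note that Spark's write.csv creates a directory named nir_processed_0610 that holds a single part file; the next step assumes this file has been saved as nir_processed_0610.csv.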

df_guangzhou.coalesce(1).write.csv('nir_processed_0610', header=True)

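The GPS coordinates in the data use the WGS84 standard. Since we will later visualize the data on Baidu Maps, they have to be converted to Baidu's coordinate system. Here I call Baidu's API to convert them in batches and save the converted data to a new file.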

import json
import requests

# Assumption: base_url is not shown in the original post. This is Baidu's
# coordinate-conversion endpoint (from=1: WGS84, to=5: Baidu lng/lat);
# replace XXXXX with your own key.
base_url = 'http://api.map.baidu.com/geoconv/v1/?from=1&to=5&ak=XXXXX&coords='

with open('nir_processed_0610.csv') as f:
    lines = f.readlines()
lines = lines[1:]  # skip the header row

transfered_coords = []
# Convert up to 100 points per request; the last batch may be smaller
for start in range(0, len(lines), 100):
    coords = []
    for line in lines[start:start+100]:
        record = line.split(',')
        coords.append(record[2]+','+record[1])  # the API expects "lng,lat"
    all_coords = ';'.join(coords)
    response = requests.get(base_url+all_coords)
    try:
        result = json.loads(response.text)
    except json.JSONDecodeError:
        print(response.text)
        raise
    for r in result['result']:
        transfered_coords.append((r['y'], r['x']))

new_lines = 'Timestamp,Latitude,Longitude,CellId,Operator,NetworkType,Rssi,RxLevel,Rsrq,Rsrp\n'
for i in range(len(lines)):
    record = lines[i].split(',')
    record[1] = str(transfered_coords[i][0])
    record[2] = str(transfered_coords[i][1])
    new_record = ','.join(record)
    new_lines += new_record
with open('nir_processed_transfered_0610.csv', 'w') as f:
    f.write(new_lines)
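
The batching logic is the part most prone to off-by-one mistakes, so it can help to isolate it from the HTTP call and test it on its own. The sketch below does that with two hypothetical helpers, `batches` and `to_coord_param`. It assumes the column layout used above (latitude in column 1, longitude in column 2) and a batch size of 100 points per request.

```python
def batches(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def to_coord_param(lines):
    """Build the "lng,lat;lng,lat" string from CSV lines."""
    pairs = []
    for line in lines:
        fields = line.rstrip("\n").split(",")
        # Column 1 holds the latitude, column 2 the longitude.
        pairs.append(fields[2] + "," + fields[1])
    return ";".join(pairs)

# 250 items split into two full batches and one partial batch.
sizes = [len(b) for b in batches(list(range(250)))]
param = to_coord_param(["t,23.1,113.3,1\n", "t,23.2,113.4,2\n"])
```

Because `range` stops at the end of the list, an input whose length is an exact multiple of the batch size produces no empty trailing request.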

Data Visualization

Once the data has been processed, we can display it on a map. I chose Baidu's open map platform for this.

Create a new HTML page with the following code:

<!DOCTYPE html>
<html>
<head>
	<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
	<meta name="viewport" content="initial-scale=1.0, user-scalable=no" />
	<style type="text/css">
		body, html,#allmap {width: 100%;height: 100%;overflow: hidden;margin:0;font-family:"微软雅黑";}
	</style>
    <script type="text/javascript" src="http://api.map.baidu.com/api?v=3.0&ak=XXXXX"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/paho-mqtt/1.0.1/mqttws31.min.js" type="text/javascript"></script>
	<title>P3 Data Visualization</title>
</head>
<body>
	<div id="allmap"></div>
    <script type="text/javascript">
        var xmlhttp = new XMLHttpRequest();
        var markers_rendered = [];
        var measurements = [];
        var selected_measurements = [];
        var flag_data_loaded = false;
        var max_markers_num = 200;

        var map = new BMap.Map("allmap");
        var icon_cm_red = new BMap.Icon("circle_red.png", new BMap.Size(15,15));
        var icon_cm_blue = new BMap.Icon("circle_blue.png", new BMap.Size(15,15));
        var icon_cm_green = new BMap.Icon("circle_green.png", new BMap.Size(15,15));
        var icon_ct_red = new BMap.Icon("tri_red.png", new BMap.Size(15,15));
        var icon_ct_blue = new BMap.Icon("tri_blue.png", new BMap.Size(15,15));
        var icon_ct_green = new BMap.Icon("tri_green.png", new BMap.Size(15,15));
        var icon_cu_red = new BMap.Icon("rect_red.png", new BMap.Size(15,15));
        var icon_cu_blue = new BMap.Icon("rect_blue.png", new BMap.Size(15,15));
        var icon_cu_green = new BMap.Icon("rect_green.png", new BMap.Size(15,15));
        var rule_red = -110;  //If RxLevel lower than -110, display as red
        var rule_blue = -90;  //If RxLevel between (-110,-90), display as blue

        map.centerAndZoom(new BMap.Point(113.36416733330077,23.124171152139173), 15);
        map.enableScrollWheelZoom(true); 

        xmlhttp.onreadystatechange=state_Change;
        xmlhttp.open("GET",'http://localhost/nir_processed_transfered_0610.csv',true);
        xmlhttp.send(null);

        function state_Change(){
            if (xmlhttp.readyState==4){// 4 = "loaded"
                if (xmlhttp.status==200){
                    measurements_txt = xmlhttp.responseText;
                    records = measurements_txt.split("\n");
                    for(i=1;i<records.length;i++){
                        record = records[i].split(",");
                        var longitude = parseFloat(record[2]);
                        var latitude = parseFloat(record[1]);
                        var record_title = record[0]+"\nCellId:"+record[3]+"\nOperator:"+record[4]+"\nType:"+record[5]+"\nRXLevel:"+record[7]+"dBm\nRssi:"+record[6]+"\nRsrq:"+record[8]+"\nRsrp:"+record[9];
                        measurements.push([latitude, longitude, record_title, record[4],parseInt(record[7])]);
                    }
                    flag_data_loaded = true;
                    load_data(map.getBounds());
                }
            }
        }

        function load_data(bounds){
            if (flag_data_loaded) {
                var sw_point = bounds.getSouthWest();
                var ne_point = bounds.getNorthEast();
                var bound_lng_min = sw_point.lng;
                var bound_lng_max = ne_point.lng;
                var bound_lat_min = sw_point.lat;
                var bound_lat_max = ne_point.lat;
                selected_measurements.length = 0;
                for(i=0;i<measurements.length;i++){
                    var lat = measurements[i][0];
                    var lng = measurements[i][1];
                    if (lat<=bound_lat_max && lat>=bound_lat_min && lng<=bound_lng_max && lng>=bound_lng_min){
                        selected_measurements.push(measurements[i]);   
                    }
                }
                markers_rendered.length = 0;
                map.clearOverlays();
                if (selected_measurements.length>0){
                    // Sample down to about max_markers_num markers; draw all of them when there are fewer
                    var percent = Math.min(1, max_markers_num/selected_measurements.length);
                    for (i=0;i<selected_measurements.length;i++){
                        var random_num = Math.random();
                        if (random_num <= percent){
                            var lat = selected_measurements[i][0];
                            var lng = selected_measurements[i][1];
                            var markerIcon;
                            switch (selected_measurements[i][3]){
                                case "CM":
                                    if (selected_measurements[i][4]<rule_red){
                                        markerIcon = icon_cm_red;
                                    }
                                    else if (selected_measurements[i][4]>=rule_blue){
                                        markerIcon = icon_cm_green;
                                    }
                                    else {
                                        markerIcon = icon_cm_blue;
                                    }
                                    break;
                                case "CU":
                                    if (selected_measurements[i][4]<rule_red){
                                        markerIcon = icon_cu_red;
                                    }
                                    else if (selected_measurements[i][4]>=rule_blue){
                                        markerIcon = icon_cu_green;
                                    }
                                    else {
                                        markerIcon = icon_cu_blue;
                                    }
                                    break;
                                default:
                                if (selected_measurements[i][4]<rule_red){
                                        markerIcon = icon_ct_red;
                                    }
                                    else if (selected_measurements[i][4]>=rule_blue){
                                        markerIcon = icon_ct_green;
                                    }
                                    else {
                                        markerIcon = icon_ct_blue;
                                    }
                            }
                            var marker = new BMap.Marker(new BMap.Point(lng, lat), {icon:markerIcon});
                            marker.setTitle(selected_measurements[i][2]);
                            map.addOverlay(marker);
                            markers_rendered.push(marker);
                        }
                    }
                }
            }
        }

        function on_map_drag(type, target, pixel, point){
            var bounds = map.getBounds();
            load_data(bounds);
        }

        function on_map_zoom(type, target){
            var bounds = map.getBounds();
            load_data(bounds);
        }
        map.addEventListener("dragend", on_map_drag);
        map.addEventListener("zoomend", on_map_zoom);
    </script>
</body>
</html>

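Start a local web server and open this page. The measurement data of the three operators is then displayed on the Baidu map, which gives an intuitive picture of network conditions in different areas.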
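
The page's marker logic boils down to two small rules: a color threshold on RxLevel and a random downsampling of the points in view. Restated in Python for clarity, they might look like the sketch below. The function names are hypothetical, and capping the probability at 1 so that small selections are drawn in full is a choice made in this sketch.

```python
RULE_RED = -110   # RxLevel below this value is shown in red
RULE_BLUE = -90   # RxLevel from RULE_RED up to, but not including, this value is blue

def marker_color(rx_level):
    """Return the marker color for an RxLevel reading in dBm."""
    if rx_level < RULE_RED:
        return "red"
    if rx_level >= RULE_BLUE:
        return "green"
    return "blue"

def sampling_probability(n_in_view, max_markers=200):
    """Probability of drawing each point so that about max_markers are shown."""
    if n_in_view == 0:
        return 0.0
    return min(1.0, max_markers / n_in_view)
```

With 1,000 points in view each one is drawn with probability 0.2, so the number of markers varies slightly around 200 from one redraw to the next.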

Data Analysis

We can go further and analyze the data statistically along different dimensions to understand the network conditions better.

Load the preprocessed data, 97,403 records in total:

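from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import DoubleType, IntegerType

spark = SparkSession.builder.appName('test').getOrCreate()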
df = spark.read.csv('nir_processed_transfered_0610.csv',header=True)
df.count()

Cast the columns to the appropriate data types:

df = df.select(
    df.Timestamp, 
    df.Latitude.cast(DoubleType()),
    df.Longitude.cast(DoubleType()),
    df.CellId,
    df.Operator,
    df.NetworkType,
    df.Rssi.cast(IntegerType()),
    df.RxLevel.cast(IntegerType()),
    df.Rsrq.cast(IntegerType()),
    df.Rsrp.cast(IntegerType())
)

Look at the distribution of the signal strength, RxLevel:

df.describe("RxLevel").show()
+-------+------------------+
|summary|           RxLevel|
+-------+------------------+
|  count|             97403|
|   mean|-84.24788764206441|
| stddev| 8.132805186632469|
|    min|              -123|
|    max|               -46|
+-------+------------------+
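
The stddev row reported by `describe()` is the sample standard deviation, which divides by n - 1. A minimal sketch with Python's standard library, run on made-up readings and not on the real data, computes the same two statistics:

```python
import statistics

# Made-up RxLevel readings in dBm, for illustration only.
rx_levels = [-80, -90, -100]

mean_rx = statistics.mean(rx_levels)
# statistics.stdev is the sample standard deviation (n - 1 in the denominator).
stddev_rx = statistics.stdev(rx_levels)
```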

The distribution of the signal quality, Rsrq:

df.describe("Rsrq").show()
+-------+-------------------+
|summary|               Rsrq|
+-------+-------------------+
|  count|              97403|
|   mean|-11.413570423908915|
| stddev|  2.402278181074898|
|    min|                -20|
|    max|                 -3|
+-------+-------------------+

Split the dataset by operator:

df_cm = df.filter(df.Operator=='CM')
df_ct = df.filter(df.Operator=='CT')
df_cu = df.filter(df.Operator=='CU')
pandas_df = df.toPandas()
pandas_cm = df_cm.toPandas()
pandas_ct = df_ct.toPandas()
pandas_cu = df_cu.toPandas()

Plot histograms of the RxLevel and Rsrq distributions for China Mobile:

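import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.ticker import MultipleLocator

f, ax = plt.subplots(figsize = (12, 8))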
sns.histplot(pandas_cm, x="RxLevel", bins=[-120, -110, -100, -90, -80, -70, -60], log_scale=(False,False),kde=True, ax=ax)

f, ax = plt.subplots(figsize = (12, 8))
x_major_locator = MultipleLocator(2)
ax.xaxis.set_major_locator(x_major_locator)
ax.set_xlim(-20.5, 0.0)
sns.histplot(pandas_cm, x="Rsrq", bins=[-20, -18, -16, -14, -12, -10, -8, -6, -4, -2, 0], log_scale=(False,False),kde=True,ax=ax)

Plot histograms of the RxLevel and Rsrq distributions for China Unicom:

f, ax = plt.subplots(figsize = (12, 8))
sns.histplot(pandas_cu, x="RxLevel", bins=[-120, -110, -100, -90, -80, -70, -60], log_scale=(False,False),kde=True,ax=ax)

f, ax = plt.subplots(figsize = (12, 8))
x_major_locator = MultipleLocator(2)
ax.xaxis.set_major_locator(x_major_locator)
ax.set_xlim(-20.5, 0.0)
sns.histplot(pandas_cu, x="Rsrq", bins=[-20, -18, -16, -14, -12, -10, -8, -6, -4, -2, 0], log_scale=(False,False),kde=True,ax=ax)

Plot histograms of the RxLevel and Rsrq distributions for China Telecom:

f, ax = plt.subplots(figsize = (12, 8))
sns.histplot(pandas_ct, x="RxLevel", bins=[-120, -100, -90, -80, -70, -60], log_scale=(False,False),kde=True,ax=ax)

f, ax = plt.subplots(figsize = (12, 8))
x_major_locator = MultipleLocator(2)
ax.xaxis.set_major_locator(x_major_locator)
ax.set_xlim(-20.5, 0.0)
sns.histplot(pandas_ct, x="Rsrq", bins=[-20, -18, -16, -14, -12, -10, -8, -6, -4, -2, 0], log_scale=(False,False),kde=True,ax=ax)

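We can also compare the networks of different operators within the same area. My approach is to convert the coordinates of each measurement point to a Google S2 Cell ID and compare the network conditions of different operators for measurement points in the same cell. I use cells at level 13, which cover roughly 0.76 to 1.59 square kilometers each.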
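
The area range quoted for level 13 can be sanity-checked with a little arithmetic. S2 projects the sphere onto the six faces of a cube and splits each face into 4^level cells, so the average cell area is the Earth's surface area divided by 6 * 4^level. The sketch below assumes a spherical Earth with a mean radius of 6371 km, and the function name is made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

def average_s2_cell_area_km2(level):
    """Average area of an S2 cell at the given level, in square kilometres."""
    earth_area = 4 * math.pi * EARTH_RADIUS_KM ** 2
    # Six cube faces, each subdivided into 4**level cells.
    return earth_area / (6 * 4 ** level)

area13 = average_s2_cell_area_km2(13)
```

The average comes out at about 1.27 square kilometers, which sits inside the 0.76 to 1.59 range. Individual cells differ in size because the cube-to-sphere projection distorts them.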

Define a UDF that converts GPS coordinates to an S2 Cell ID:

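from pyspark.sql.functions import udf
# Assumption: `s2` is the Python binding of Google's S2 geometry library;
# the original post does not show how it is imported.
def cellid_fromS2(lat, lng):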
    cellid = s2.S2CellId(s2.S2LatLng_FromDegrees(lat, lng).ToPoint()).parent(13).id()
    return cellid
cellid_udf = udf(cellid_fromS2)
_ = spark.udf.register("cellid_udf", cellid_udf)
df = df.withColumn("S2CellId", cellid_udf(df.Latitude, df.Longitude))
df = df.withColumn("Date", F.date_format('Timestamp', 'MM/dd/yyyy HH'))
df = df.withColumn("Hour", F.date_format('Timestamp', 'HH'))
df = df.groupBy(['S2CellId', 'Date', 'Operator', 'Hour']).agg({'RxLevel':'avg', 'Rsrq':'avg'})

df_cm = df.filter(df.Operator=='CM').withColumnRenamed('avg(Rsrq)', 'cm_Rsrq').withColumnRenamed('avg(RxLevel)', 'cm_RxLevel').withColumnRenamed('Hour', 'cm_Hour').withColumnRenamed('Operator', 'cm_Operator')
df_ct = df.filter(df.Operator=='CT').withColumnRenamed('avg(Rsrq)', 'ct_Rsrq').withColumnRenamed('avg(RxLevel)', 'ct_RxLevel').withColumnRenamed('Hour', 'ct_Hour').withColumnRenamed('Operator', 'ct_Operator')
df_cu = df.filter(df.Operator=='CU').withColumnRenamed('avg(Rsrq)', 'cu_Rsrq').withColumnRenamed('avg(RxLevel)', 'cu_RxLevel').withColumnRenamed('Hour', 'cu_Hour').withColumnRenamed('Operator', 'cu_Operator')

Compare the China Mobile and China Telecom networks:

df_compare_cmct = df_cm.join(df_ct, [df_cm.S2CellId==df_ct.S2CellId], 'inner').drop(df_cm.S2CellId)
df_cm_selected = df_compare_cmct.select(['cm_Operator','cm_Hour', 'cm_Rsrq', 'cm_RxLevel', 'S2CellId']).withColumnRenamed('cm_Hour', 'Hour').withColumnRenamed('cm_Rsrq', 'Rsrq').withColumnRenamed('cm_RxLevel', 'RxLevel').withColumnRenamed('cm_Operator', 'Operator')
df_ct_selected = df_compare_cmct.select(['ct_Operator','ct_Hour', 'ct_Rsrq', 'ct_RxLevel', 'S2CellId']).withColumnRenamed('ct_Hour', 'Hour').withColumnRenamed('ct_Rsrq', 'Rsrq').withColumnRenamed('ct_RxLevel', 'RxLevel').withColumnRenamed('ct_Operator', 'Operator')
df_cmct_selected = df_cm_selected.union(df_ct_selected)
df_cmct_selected = df_cmct_selected.select(
    df_cmct_selected.Hour.cast(IntegerType()),
    df_cmct_selected.RxLevel,
    df_cmct_selected.Rsrq,
    df_cmct_selected.Operator
)

Use scatter plots to compare RxLevel and Rsrq:

f, ax = plt.subplots(figsize = (12, 8))
x_major_locator = MultipleLocator(1)
ax.xaxis.set_major_locator(x_major_locator)
ax.set_xlim(-0.5, 24.5)
sns.scatterplot(x="Hour", y="RxLevel", hue="Operator", style="Operator", data=df_cmct_selected.toPandas(), ax=ax)

f, ax = plt.subplots(figsize = (12, 8))
x_major_locator = MultipleLocator(1)
ax.xaxis.set_major_locator(x_major_locator)
ax.set_xlim(-0.5, 24.5)
sns.scatterplot(x="Hour", y="Rsrq", hue="Operator", style="Operator", data=df_cmct_selected.toPandas(), ax=ax)

Compare the RxLevel and Rsrq of the China Unicom and China Telecom networks:

df_compare_cuct = df_cu.join(df_ct, [df_cu.S2CellId==df_ct.S2CellId], 'inner').drop(df_cu.S2CellId)
df_cu_selected = df_compare_cuct.select(['cu_Operator','cu_Hour', 'cu_Rsrq', 'cu_RxLevel', 'S2CellId']).withColumnRenamed('cu_Hour', 'Hour').withColumnRenamed('cu_Rsrq', 'Rsrq').withColumnRenamed('cu_RxLevel', 'RxLevel').withColumnRenamed('cu_Operator', 'Operator')
df_ct_selected = df_compare_cuct.select(['ct_Operator','ct_Hour', 'ct_Rsrq', 'ct_RxLevel', 'S2CellId']).withColumnRenamed('ct_Hour', 'Hour').withColumnRenamed('ct_Rsrq', 'Rsrq').withColumnRenamed('ct_RxLevel', 'RxLevel').withColumnRenamed('ct_Operator', 'Operator')
df_cuct_selected = df_cu_selected.union(df_ct_selected)
df_cuct_selected = df_cuct_selected.select(
    df_cuct_selected.Hour.cast(IntegerType()),
    df_cuct_selected.RxLevel,
    df_cuct_selected.Rsrq,
    df_cuct_selected.Operator
)
f, ax = plt.subplots(figsize = (12, 8))
x_major_locator = MultipleLocator(1)
ax.xaxis.set_major_locator(x_major_locator)
ax.set_xlim(-0.5, 24.5)
sns.scatterplot(x="Hour", y="RxLevel", hue="Operator", style="Operator", data=df_cuct_selected.toPandas(), ax=ax)

f, ax = plt.subplots(figsize = (12, 8))
x_major_locator = MultipleLocator(1)
ax.xaxis.set_major_locator(x_major_locator)
ax.set_xlim(-0.5, 24.5)
sns.scatterplot(x="Hour", y="Rsrq", hue="Operator", style="Operator", data=df_cuct_selected.toPandas(), ax=ax)

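Conclusion

The analysis above shows that, in terms of overall signal strength, China Unicom is the best, China Telecom comes second, and China Mobile is last.

In terms of network quality, however, China Mobile and China Unicom are better than China Telecom.

Note: these conclusions are based only on the small sample of crowdsourced measurement data I obtained, and do not represent the real, complete network conditions of the operators.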

Reposted from blog.csdn.net/gzroy/article/details/108652845