Positioning paper reading series: WiCluster (2): Passive Indoor 2D/3D Positioning using WiFi without Precise Labels

0.Abstract

We introduce WiCluster, a new machine learning (ML) approach for passive indoor positioning using radio frequency (RF) channel state information (CSI).

WiCluster can predict both a zone-level position and a precise 2D or 3D position, without using any precise position labels during training.

Prior CSI-based indoor positioning work has relied on non-parametric approaches using digital signal-processing (DSP) and, more recently, parametric approaches (e.g., fully supervised ML methods).

However, these do not handle the complexity of real-world environments well and do not meet requirements for large-scale commercial deployments: the accuracy of DSP-based methods deteriorates significantly in non-line-of-sight conditions, while supervised ML methods need large amounts of hard-to-acquire centimeter accuracy position labels.
(Traditional methods are not accurate enough, while supervised methods need labeled datasets.)
In contrast, WiCluster is precise, requires weaker label-information that can be easily collected, and works well in non-line-of-sight conditions. Our first contribution is a novel dimensionality reduction method for charting.

It combines a triplet-loss with a multi-scale clustering-loss to map the high-dimensional CSI representation to a 2D/3D latent space.

Our second contribution is two weakly supervised losses that map this latent space into a Cartesian map, resulting in meter-accuracy position results.
These losses only require simple to acquire priors: a sketch of the floorplan, approximate access-point locations and a few CSI packets that are labeled with the corresponding zone in the floorplan.
Thirdly, we report results and a robustness study for 2D positioning in two single-floor office buildings and 3D positioning in a two-story home.

1.INTRODUCTION

1.1 Sentence-by-sentence reading

Paragraph 1 (an overview of the work: accurate positioning obtained from weakly labeled data, namely anchors, a floor plan, and room-level positions)

We introduce WiCluster, a new algorithm that uses self-supervision to learn a model that predicts the precise position of a person in an indoor-environment using channel state information (CSI) from WiFi access points.

We specifically address a real-world deployment use-case, where precise position labels will not be available (since ground truth data-collection is too cumbersome, expensive, and privacy sensitive for consumers) and access points may be in non-line-of-sight conditions with respect to each other and with respect to the target (e.g. in different rooms).

Our solution generates a topologically accurate 2D or 3D latent-space, which we transport into the real-world space by making use of priors such as the floor-plan of the building, anchors such as locations of the access-points, and landmarks which are a small set of room-level annotations.

We predict a precise position for a single person in a multi-room environment, without using any precise position labels during training, as shown in Fig. 1.

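Paragraph 2 (this is not fingerprint-based positioning of a device the person carries; a transmitter and a receiver locate the person from the way the body perturbs the signal between them, so the person being located does not take part at all)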

Passive positioning, unlike active positioning, does not involve the participation of the target in the positioning problem, i.e. the target is not required to carry a positioning device such as a mobile phone.

It exploits the perturbations in the propagation of RF signals between a transmitter and a receiver, neither of which are carried by the target, and operates like a bi-static radar.

Passive positioning using WiFi can be useful in the domain of home, enterprise and industrial automation and robotics. Zone-level positioning refers to the ability to determine whether the target is present in a certain area of a building, such as a room.

It can enable, for instance, intrusion detection, smart energy usage, building space usage optimization, etc.

Paragraph 3 (sub-meter positioning that relies on RF rather than light, so it works in the dark and through walls)

Precise positioning refers to the ability to determine the 2D/3D coordinates of the target within the environment; in this paper we target sub-meter level positioning which can enable, for instance, asset and customer tracking in a business environment, indoor navigation, etc.

It is not affected by poor light conditions and can thus work in the dark, and it can work across walls. Also, a device without a video-camera is more likely to be adopted by a user since it largely mitigates issues of privacy related to video images.

Passive positioning, and in general RF sensing, is also the subject of the IEEE 802.11bf Task Group, which is currently defining the standard signalling to enable inter-operation across WiFi sensing devices. Such standardization will enable the growth of an ecosystem of sensing devices which will further expand the applicability of the solutions described in this paper.

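Paragraph 4 (prior work pulls together samples that are close in time, and therefore in space, but cannot group samples that are close in space yet far apart in time; the authors add clustering to cover that case)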

We build upon Ferrand et al. [1], who use a triplet-loss to train a neural-network in an unsupervised manner to simultaneously learn a spatial similarity metric between CSI samples and to perform dimensionality reduction to a 2D latent space that is topologically close to the true geographic environment.

We find that by combining a triplet-loss with a cluster prediction loss, we are able to improve upon the 2D latent-space: the triplet-loss encourages the model to learn a representation that brings together CSI samples that are close in time (and thus in space), and the clustering extends this to CSI samples that are close in space but not necessarily close in time.

Paragraph 5

Our contributions are:

(i) a novel dimensionality reduction method for charting. It combines representation learning with a new dimensionality reduction technique in order to embed a high-dimensional representation of the CSI in a 2D latent space, as shown in Fig. 2a.

The dimensionality reduction approach preserves cluster membership across dimensions at multiple scales to preserve local and global structure within the data.

(ii) We introduce two additional weakly-supervised losses (a zone loss and an access-point loss) to transport this latent space into the real-world space, as shown in Fig. 2b, by incorporating some priors in our model.

(iii) We evaluate our model and demonstrate state-of-the-art results on three different data-sets that are specifically designed to mimic an actual real-world deployment use-case (i.e. not a simple lab environment).

II. RELATED WORK

Our work builds on both wireless indoor positioning and representation learning.
Classical DSP methods for indoor positioning [8], [11] can determine the location of subjects by assuming that the perturbation of the RF signal propagation, induced by the target motion, follows a known mathematical model. In reality, such models work when the target is within line-of-sight of the transmitter and receiver, but fail in environments with non-line-of-sight propagation and with complex reflection patterns.

III. METHOD

A. Background

DeepCluster:

The paper is fairly vague here; the DeepCluster paper is a useful reference for the details.

B. Learning the 2D latent space (self-supervised)

Cluster-loss:

We introduce a way to perform dimensionality reduction using clustering.

Paragraph 1 (the network projects the high-dimensional RF features down to a 2D space)

First, we extract cluster assignments from a high-D representation of the data. The high-D representation is then projected to a 2D latent-space. A small Multilayer Perceptron (MLP) is attached to the 2D latent-space and the network is trained to predict the cluster assignments with a cross-entropy loss. This results in a 2D latent-space that brings together points that belong to the same cluster in the high-D space and separates them from points that belong to a different cluster.
(Roughly: cluster the high-dimensional features in the DeepCluster style, then train the 2D representation so that those cluster assignments can be predicted from it.)
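The cross-entropy step in the paragraph above can be sketched in a few lines. This is only an illustration under assumptions: the function names are hypothetical, the logits stand in for the output of the small MLP attached to the 2D latent space, and the cluster indices stand in for assignments extracted from the high-D representation. The paper's actual network and clustering procedure are not reproduced here.

```python
import math

def softmax_cross_entropy(logits, target):
    """Cross-entropy between softmax(logits) and an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target]

def cluster_prediction_loss(batch_logits, cluster_ids):
    """Mean cross-entropy over a batch.

    batch_logits: per-sample outputs of the (hypothetical) MLP head on the 2D latent.
    cluster_ids:  per-sample cluster index obtained from the high-D features.
    """
    losses = [softmax_cross_entropy(l, c) for l, c in zip(batch_logits, cluster_ids)]
    return sum(losses) / len(losses)
```

Minimizing this loss rewards a 2D layout from which the high-D cluster of each sample can be read off, which is what pulls same-cluster points together.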

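Paragraph 2 (optimizing the high-dimensional space only through the 2D space works poorly, so the authors design a two-way training scheme)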

We observe that optimizing the high-D representation indirectly through only the 2D representation would often get stuck in a local minima, owing to the mapping between the two spaces not being bijective [5]. To help the 2D projection unfold the high-D manifold we also allow the network to directly optimize the high-D space: we attach an MLP to the high-D features and try to predict cluster assignments obtained from the 2D projection.

Hence, the main component of our model is a cross-dimension architecture, shown in Fig. 4; where from each dimension-projection we predict the clusters obtained from the other.
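[Figure 4: the cross-dimension architecture (image not preserved in this copy)]
In effect the model is trained from two directions: the cluster assignments from the high-dimensional features supervise the 2D projection, and the cluster assignments from the 2D projection supervise the high-dimensional features, so both representations receive a training signal.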

Paragraph 3 (clustering yields a classification rather than a position; this paragraph appears to explain how that is handled)

As the number of clusters, k, is decreased, the size of the neighbourhoods (points within a cluster) will increase and the approach will sacrifice local structure for global structure (since there is no constraint on how to position points within a cluster but only that they should be mapped to the same cluster).

For example, if we assume that we produce a very high number of clusters such that on average two points form a cluster in the high-D representation, then the 2D latent-space will try to preserve the nearest-neighbour. On the other hand, if we assume there are only several clusters, then the 2D latent-space will preserve a more global structure such as which rooms the samples originate from (but not their position in the room).

To enforce structure at multiple scales, instead of working with just one set of clusters (per dimension) we extract and predict multiple cluster assignments of different sizes concurrently.
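To make the multi-scale idea concrete, here is a small sketch that extracts cluster assignments for several values of k from the same points. It assumes plain k-means with a deterministic initialization, which is a stand-in chosen for illustration; the source does not say which clustering algorithm or which values of k the authors use, and the function names are hypothetical.

```python
def kmeans_labels(points, k, iters=20):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    # Deterministic init: pick k points spread evenly through the list.
    step = len(points) / k
    centers = [points[int(i * step)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def multi_scale_assignments(points, ks=(2, 4)):
    """One label list per cluster count k: small k is coarse, large k is fine."""
    return {k: kmeans_labels(points, k) for k in ks}
```

Each label list would then get its own prediction head, so that coarse clusters (for example rooms) and fine clusters (local neighbourhoods) constrain the latent space at the same time.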

Triplet-loss:

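Paragraph 1 (training with only the cluster-loss achieves separation but not topology, because the cross-entropy loss pushes the classes toward orthogonal directions)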

Training a model with only the cluster-loss would generate a 2D representation that has good separation (i.e. points that are far apart in true-space are unlikely to overlap). However, it is unlikely to be topologically representative of the true environment, since a cross-entropy loss trained for softmax classification will try to encode different classes as orthogonal rays when projected to 2D.

Paragraph 2 (details of the proposed loss function)

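[Equation: the triplet-loss (image not preserved in this copy)]
Conventional losses do not produce a usable map, so a different loss is needed. The authors use a loss that pulls together the 2D representations of two samples collected at nearby timestamps and pushes apart those collected at distant timestamps, which amounts to using temporal information as the training signal. Let samples i and j be close in collection time and samples i and k be far apart in collection time; xi, xj and xk are the RF observations at times i, j and k, and mt is a constant margin.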
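Since the equation image is missing, the following sketch shows the standard triplet-margin form that matches the description above. It is an assumption that the paper uses exactly this form with a Euclidean distance; the function name and the default margin are illustrative only.

```python
import math

def triplet_loss(z_i, z_j, z_k, margin=1.0):
    """max(d(z_i, z_j) - d(z_i, z_k) + margin, 0) with Euclidean d.

    z_i: 2D embedding of the anchor sample x_i
    z_j: embedding of a sample close to x_i in time (positive)
    z_k: embedding of a sample far from x_i in time (negative)
    """
    d_pos = math.dist(z_i, z_j)
    d_neg = math.dist(z_i, z_k)
    return max(d_pos - d_neg + margin, 0.0)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only triplets that violate the expected ordering contribute a gradient.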

Paragraph 3 (the assumption behind the loss: two samples collected close together in time must also be close together in space)

This loss incorporates the prior that within a small time-window (e.g. a couple of seconds) there exists a linear relationship between time and distance. For example, if we take one CSI sample, we would expect the distance to another CSI sample that is closer in time to also be closer in 2D-space. Assuming a small-enough window for this sampling, the distance assumption would hold even if the person does not walk in a straight-line (e.g. a circle).

Paragraph 4 (advantages of the triplet-loss)

The triplet-loss has two distinct advantages:
(i) it helps pull together the separation created by the cross-entropy loss into a path, thus we observe smooth changes in latent space position even when the signal-strength abruptly changes (i.e. the person suddenly walks behind a pillar).

(ii) The sampling strategy to determine positive and negative anchors uses the timestamp of the CSI sample, which introduces an extra source of information (the ordering of the packets) which is highly correlated with local spatial structure.
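A timestamp-based sampling strategy of the kind described in (ii) might look like the sketch below. The 2-second window follows the "couple of seconds" mentioned earlier, but the exact rule the authors use to pick positives and negatives is not given in this copy, so the function name, its parameters, and the rule itself are assumptions.

```python
import random

def sample_triplet(timestamps, window=2.0, rng=random):
    """Return indices (i, j, k): j within `window` seconds of i, k outside it.

    Returns None when the anchor that was drawn has no valid positive or negative.
    """
    i = rng.randrange(len(timestamps))
    near = [n for n, t in enumerate(timestamps)
            if n != i and abs(t - timestamps[i]) <= window]
    far = [n for n, t in enumerate(timestamps)
           if abs(t - timestamps[i]) > window]
    if not near or not far:
        return None
    return i, rng.choice(near), rng.choice(far)
```

No position label is involved at any point: the only supervision is the packet ordering, which is why this information comes for free with the CSI capture.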

B. Learning the 2D latent space (self-supervised): summary

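This part introduces the two loss functions used in the paper:

  • 1. The loss function for the clustering part.

The model is trained from two directions: the cluster assignments from the high-dimensional features supervise the 2D projection, and the cluster assignments from the 2D projection supervise the high-dimensional features, so both representations receive a training signal.

  • 2. The triplet-loss, which gives the 2D output a spatially meaningful layout. Samples collected at nearby timestamps should have nearby 2D representations, and samples collected at distant timestamps should be pushed apart, which amounts to using temporal information as the training signal. Let samples i and j be close in collection time and samples i and k be far apart in collection time; xi, xj and xk are the RF observations at times i, j and k, and mt is a constant margin.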

C. Learning the Cartesian map (weakly-supervised)

Paragraph 1 (the steps above only produce a trajectory-like latent space, not coordinates with real physical meaning)

Assuming we have access to a handful of zone-level labels per zone, the 2D latent space produced in a self-supervised manner using the above method can be used to accurately perform zone-level positioning. This is shown in Table I. By incorporating additional real-world information into our model, the 2D-latent space can be transported to the real-world space by introducing two weakly-supervised losses.

Zone-loss:

Paragraph 1 (the latent positions are corrected using the zone boxes from the floor plan)

We assume that a rough floor-plan is provided by the user where each zone is represented by a (bounding) box on a Cartesian map. Using a small-number of zone-level labels and the latent-space representation, we perform a KNN lookup to generate a predicted zone for every CSI sample.
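The KNN lookup can be sketched as a majority vote among the nearest labeled points in the latent space. The value of k, the Euclidean distance, the zone names, and the function name are all assumptions made for illustration; the source only states that a KNN lookup is performed.

```python
from collections import Counter
import math

def knn_zone(query, labeled_points, k=3):
    """Majority zone among the k labeled latent points nearest to `query`.

    labeled_points: list of ((x, y), zone_name) pairs, i.e. the few CSI
    samples that carry a zone-level label, embedded in the latent space.
    """
    nearest = sorted(labeled_points, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(zone for _, zone in nearest)
    return votes.most_common(1)[0][0]
```

For example, with three labeled points near the origin tagged "kitchen" and three near (10, 10) tagged "office", a query at (0.5, 0.5) is assigned to "kitchen".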

We then perform a key-value lookup on the floor-plan, which is defined as [Bzone] = ([x0, y0], [x1, y1]), i.e. a bounding-box per zone, to retrieve the bounding-box B for the predicted zone. The loss LZ then equates to the Manhattan distance dm(x, x′) between the point and the box, if the point is predicted to be outside, and zero otherwise.

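[Equation: zone-loss LZ (image not preserved in this copy)]

In short: the floor plan gives a bounding box per zone. If the point lies inside the box of its predicted zone the loss is 0; otherwise it is the Manhattan distance from the point to the box.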
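As a sketch of the zone-loss for a single point, assuming an axis-aligned box and taking x′ to be the closest point on the box (the source does not spell out how x′ is chosen, and the function name is hypothetical):

```python
def zone_loss(point, box):
    """Manhattan distance from `point` to bounding box `box`; 0 if inside.

    box = ((x0, y0), (x1, y1)) with x0 <= x1 and y0 <= y1.
    """
    (x0, y0), (x1, y1) = box
    x, y = point
    dx = max(x0 - x, 0.0, x - x1)  # horizontal overshoot, 0 when x0 <= x <= x1
    dy = max(y0 - y, 0.0, y - y1)  # vertical overshoot, 0 when y0 <= y <= y1
    return dx + dy
```

A point at (4, 5) with box ((0, 0), (2, 2)) overshoots by 2 horizontally and 3 vertically, giving a loss of 5, while any point inside the box gives 0.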

Access-Point Loss

Paragraph 1

It is well known that the power of a wireless signal decays exponentially as the distance from the source increases. We introduce this into our model by assuming that the precise location of the transmitter and receivers is provided by the user and create an access-point loss, LA, that operates similar to the triplet-margin loss.
[Equation: access-point loss LA (image not preserved in this copy)]
In short: the known access-point locations, together with the decay of signal power with distance, are used to constrain the predicted position.

D. Combined Loss

The neural-network is trained end-to-end and the final loss term L is a combination of the two unsupervised losses and two weakly-supervised losses:
[Equation: the combined loss L (image not preserved in this copy)]
The losses can be summed with equal weights, and we omit an ablation that shows they are robust to small deviations in weights for brevity. However, the loss LA is only used to initialize the model for several epochs before it is turned off.

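Combined loss: the four terms are simply added together with equal weights; the authors state that the result is robust to small changes in the weights but omit that ablation. The access-point loss LA is only active for the first several epochs.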
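The schedule described above can be sketched as follows. The argument names and the warm-up length of 5 epochs are assumptions; the source only says that LA is used for "several epochs" before being turned off.

```python
def combined_loss(l_cluster, l_triplet, l_zone, l_ap, epoch, ap_warmup_epochs=5):
    """Equal-weight sum of the four losses; the access-point term is
    dropped once `epoch` reaches `ap_warmup_epochs` (an assumed value)."""
    total = l_cluster + l_triplet + l_zone
    if epoch < ap_warmup_epochs:
        total += l_ap  # LA only helps initialize the model
    return total
```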

IV. IMPLEMENTATION DETAILS

a) Data collection: We generate the data-set by using multiple commercial IEEE 802.11 access points operating in the 5GHz band, and deploy them in test environments as shown in Fig. 5 and Fig. 6. Please note that zones are separated by drywall, glass, tall storage cabinets or concrete walls, the access points are in different zones, and several detection areas have no access points in line-of-sight.

Each of the three receivers uses 8 dipole antennas arranged as a uniform circular array of 4cm radius. The transmitter uses only one of the antennas for transmission. 80MHz BW is used for the transmissions. The CSI is estimated based on the reception of standard WiFi ACK packets, which the transmitter sends at periodic 10ms intervals. The CSI represents the channel between the transmitter antenna and each of its 8 receiver antennas, across 208 frequency tones that span the transmission bandwidth.

Hence, the CSI is represented as a tensor of complex numbers per each packet ∈ ℂ^(8×1×208). We process the CSI and use only the magnitude, so that the input to the model uses real numbers.
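The magnitude step can be illustrated on a nested list with the same 8×1×208 layout. Any further processing the authors apply (for example normalization) is not described in this copy and is not shown; the function name and the list-based layout are assumptions.

```python
def csi_magnitude(csi):
    """Element-wise magnitude of a nested list shaped
    [rx_antenna][tx_antenna][tone] holding complex channel estimates."""
    return [[[abs(v) for v in tones] for tones in tx] for tx in csi]
```

For one packet this turns 8×1×208 complex values into 8×1×208 real values, discarding the phase.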


Reposted from blog.csdn.net/qq_43210957/article/details/129356504