Hierarchical Federated Learning for Edge-Assisted UAV Networks

Summary

Federated learning (FL) allows unmanned aerial vehicles (UAVs) to collaboratively train a globally shared machine learning model while retaining their private data locally.
In recent years, the volume of heterogeneous data acquired by edge-assisted unmanned aerial vehicles (UAVs) has surged, and a global model needs to be trained while preserving data privacy.
A key issue is how to handle the non-independent and identically distributed (non-IID) nature of this heterogeneous data while ensuring convergence of the learning process.
To address this issue, a hierarchical federated learning algorithm for edge-assisted UAV networks is proposed,
which uses edge servers located at base stations as intermediate aggregators operating on common shared data.

Introduction

1. Motivation

The application of federated learning in UAV networks proceeds through the following four-step loop:
(i) each UAV locally trains the global model (yielding its local model) using its own data,
(ii) the trained UAV local models are reported to the centralized cloud server,
(iii) the local models are aggregated on the cloud server to update the global model, and
(iv) the updated global model is sent back to the UAVs for the next round of local training.

Most FL systems use the FedAvg algorithm, which averages the model updates weighted by each device's number of training samples.
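The FedAvg weighting can be sketched as follows; this is an illustrative snippet, not the paper's code, and the flattened-weight representation is an assumption for brevity:

```python
import numpy as np

def fedavg_aggregate(local_models, sample_counts):
    """Sample-count-weighted average of flattened local model weights."""
    n = sum(sample_counts)
    return sum((n_i / n) * m for n_i, m in zip(sample_counts, local_models))

# Example: three UAVs with dataset sizes 10, 30, and 60
models = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
counts = [10, 30, 60]
global_model = fedavg_aggregate(models, counts)  # larger datasets pull harder
```

Because the weights n_i / n sum to one, the aggregate stays inside the convex hull of the local models.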
Disadvantages:
Although FL avoids sending real-time data feeds to cloud servers and can thus reduce latency and bandwidth, the continuous growth of deep learning (DL) model sizes makes the transmission of model updates a system bottleneck.
A further challenge is how to handle the non-independent and identically distributed nature of the heterogeneous data acquired by different types of UAVs while ensuring convergence of global model training.

2. Related work

Intermediate edge servers located in base stations between UAVs and cloud servers can help significantly reduce the communication and computation costs required for FL.

Related work and purpose:
• Zhang et al. considered an FL system with a Ground Fusion Center (GFC) as the aggregator for a network of UAVs deployed at remote locations, in order to reduce communication complexity.
• UAVs have been used as a communication link between users and edge servers to alleviate network latency overhead.
• Several services have been jointly formulated: computation offloading, resource allocation, and optimal UAV placement in a mobile edge computing network, with UAVs acting as both communication and computing devices.

Important considerations in UAV networking:
data heterogeneity in the generated samples; in practice, UAVs are also limited by memory, communication, computation, and energy consumption.
An approach to address the system heterogeneity problem:
use distributed learning to execute the training process across multiple edge devices, selecting capable devices by predicting outage and resource information from critical infrastructure agents.

Developing robust, high-performance federated learning algorithms for non-IID data distributions in edge-assisted UAV networks is still an active research area.

3. Contribution

• Developed a hierarchical FL algorithm that works well in real-world scenarios with non-IID data (i.e., highly skewed feature and label distributions).
Innovation: public shared data on the edge servers is used to effectively mitigate the divergence caused by non-IID data distributions. The shared data is built offline on the edge servers by collecting exemplary data samples from the UAVs.
An efficient method is also proposed to hierarchically aggregate the local models of the UAVs and edge servers for the global model update.

System Model and Problem Description

Consider an edge-assisted UAV network consisting of a cloud server, L edge servers located at base stations, and N UAVs. The UAVs are divided into L groups, and the l-th group, denoted by C_l with cardinality |C_l| = C_l, is assigned to the l-th edge server.

Let n be the total number of data samples across all UAVs, where the i-th UAV has a dataset, denoted P_i, consisting of n_i data samples. The goal of FL is to minimize the following (global) loss function:

f(w) = (1/n) Σ_{i=1}^{N} Σ_{j ∈ P_i} f_j(w),

where f(w) is the loss function of the global model w, and f_j(w) is the loss function on the j-th data sample of the i-th UAV.

Training process:
(1) The central cloud server sends the global model w to each UAV.
(2) In each step t, the i-th UAV locally trains the global model on its private dataset P_i using gradient descent, producing a local model

w_i(t) = w(t) - η_t ∇f_i(w(t)),

where η_t denotes the step size.
(3) After the local models {w_i} are sent back to the central cloud server, the global model is updated by the aggregation

w(t+1) = Σ_{i=1}^{N} (n_i / n) w_i(t).

These steps are repeated until the desired accuracy is achieved.
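The training loop above can be sketched end to end. This is a toy sketch, not the paper's code: the per-UAV loss f_i(w) = 0.5 * ||w - c_i||^2 is an assumption chosen so that its gradient is simply (w - c_i); everything else follows the broadcast / local-train / aggregate steps:

```python
import numpy as np

def local_update(w, c_i, eta, k):
    """k gradient-descent steps on the toy loss f_i(w) = 0.5 * ||w - c_i||^2."""
    w = w.copy()
    for _ in range(k):
        w -= eta * (w - c_i)  # gradient of the toy loss is (w - c_i)
    return w

def fedavg_round(w_global, centers, counts, eta=0.1, k=5):
    """One round: broadcast, local training, sample-weighted aggregation."""
    n = sum(counts)
    local_models = [local_update(w_global, c, eta, k) for c in centers]
    return sum((n_i / n) * w_i for n_i, w_i in zip(counts, local_models))

w = np.zeros(2)
centers = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]  # each UAV's local optimum
for t in range(50):
    w = fedavg_round(w, centers, [50, 50])
# w approaches the weighted mean of the local optima, here [0.5, 0.5]
```

Each local update pulls the model toward that UAV's own optimum; the aggregation then averages these pulls, which is exactly why heterogeneous (non-IID) optima slow convergence.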

The heterogeneity of the UAVs leads to poor performance and convergence behavior in federated learning.
There are several ways in which the data across devices can deviate from the IID assumption:

• Feature distribution skew: the marginal distributions P_i(x) vary from device to device, i.e., the data features differ across devices. Example: pictures of the same object may vary in brightness, occlusion, camera sensor, etc.
• Label distribution skew: each device has access to only a small subset of all available labels. Example: each device holds images of only a few of the classes.
• Concept shift (same label, different features): the conditional distribution P_i(x|y) varies from device to device, so the same label y may correspond to different features x on different devices. Example: in digit recognition, digits may be written in different styles, resulting in different underlying features for the same digit.
• Concept shift (same features, different labels): the conditional distribution P_i(y|x) differs between devices, so similar features may receive different labels on different devices. Example: different digits written in a very similar way, such as 5 and 6, or 3 and 8.

In real-world scenarios, any of the above cases can occur in practice, and most datasets contain a mixture of them.
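As an illustration of label distribution skew, a partition that gives each device samples of only a few classes can be sketched as follows (a hypothetical helper, not taken from the paper):

```python
from collections import defaultdict

def label_skew_partition(labels, num_clients, classes_per_client=1):
    """Give each client sample indices from only a few classes (label skew)."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    classes = sorted(by_class)
    partition = {}
    for client in range(num_clients):
        chosen = [classes[(client * classes_per_client + j) % len(classes)]
                  for j in range(classes_per_client)]
        partition[client] = [i for c in chosen for i in by_class[c]]
    return partition

# Toy example: 3 clients, 3 classes -> every client sees exactly one class
labels = [0, 0, 1, 1, 2, 2]
part = label_skew_partition(labels, num_clients=3)
```

With `classes_per_client=1` this reproduces the extreme one-class-per-UAV setting used later in Scenario 1.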

Hierarchical FL algorithm

Key idea: Edge servers are used as intermediate aggregators with common shared data to improve learning performance even for non-IID data.
In hierarchical FL, public shared data is used to train local models on edge servers.
The local models of the UAVs and of the edge servers are aggregated hierarchically.
Algorithm 1 presents the proposed hierarchical FL algorithm for edge-assisted UAV networks, where T is the total number of aggregation steps. Furthermore, C denotes the proportion of UAVs participating in hierarchical FL, selected from the N UAVs in total.

Working principle:
(1) Initialize the local models of the UAVs and edge servers with random weights w_0, and assign each edge server a common shared dataset Q, equivalent to 5% of the total dataset.
(2) The UAVs and edge servers train their local models (i.e., the global model from the previous round) in parallel, using their private data and the public shared data, respectively.
(3) At each global aggregation step, the UAVs update their models with the global model w_t from the previous round.
(4) After k_1 local iterations, each UAV sends its local model w_i^l, trained on its private dataset P_i, to its edge server.
(5) After receiving the local models from its assigned UAVs, the edge server performs edge aggregation: its own local model w_e^l, trained on the shared dataset Q, is aggregated together with the UAV local models.
(6) After k_2 iterations of the edge aggregation process, each edge server sends its aggregated model w_l to the cloud server.
(7) At the cloud server, the global model w_{t+1} is obtained through global aggregation.
In general, the local update of the i-th UAV assigned to the l-th edge server takes the following form. When an intermediate aggregator cannot perform edge aggregation due to insufficient system resources, the whole process reduces to the FedAvg update.

w_i^l(t) = w_i^l(t-1) - η_t ∇f_i(w_i^l(t-1)),

with edge aggregation applied every k_1 local steps. In the FedAvg special case, k_1 = 1 and

w_l(t) = Σ_{i ∈ C_l} (n_i / n_l) w_i^l(t).
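The overall two-level scheme can be sketched as follows. This is a simplified simulation under assumed toy quadratic losses (as in the earlier FedAvg sketch) with equal-weight edge aggregation, whereas the paper's actual aggregation uses sample-count weights:

```python
import numpy as np

def gd_steps(w, c, eta, k):
    """k gradient steps on the toy loss 0.5 * ||w - c||^2."""
    w = w.copy()
    for _ in range(k):
        w -= eta * (w - c)
    return w

def hierarchical_fl_round(w_global, groups, shared_c, eta=0.1, k1=2, k2=3):
    """One cloud round of the hierarchical scheme: every edge server runs k2
    edge aggregations, each preceded by k1 local steps at its UAVs, while the
    edge server itself trains on the shared data Q (summarized by shared_c)."""
    edge_models = []
    for uav_centers in groups:  # one group of UAVs per edge server
        w_edge = w_global.copy()
        for _ in range(k2):
            uav_models = [gd_steps(w_edge, c, eta, k1) for c in uav_centers]
            shared_model = gd_steps(w_edge, shared_c, eta, k1)  # edge model on Q
            # edge aggregation (equal weights here; the paper weights by n_i/n_l)
            w_edge = sum(uav_models + [shared_model]) / (len(uav_models) + 1)
        edge_models.append(w_edge)
    return sum(edge_models) / len(edge_models)  # global aggregation at the cloud

w = np.zeros(1)
groups = [[np.array([0.0]), np.array([0.0])],  # edge 1: UAVs with one "class"
          [np.array([2.0]), np.array([2.0])]]  # edge 2: UAVs with another
shared = np.array([1.0])                       # shared data spans both groups
for _ in range(100):
    w = hierarchical_fl_round(w, groups, shared)
# the shared data anchors both skewed groups, so w converges to 1.0
```

Note how the shared-data model enters every edge aggregation: it keeps each edge server's model from drifting toward its group's skewed optimum, which is the stated role of Q.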

In any FL algorithm, the training accuracy of the machine learning model decreases relative to centralized learning because of weight divergence, which is mainly caused by the following two factors:

• Factor: the UAV models are initialized differently during training. Corresponding algorithm parameters: the number k_1 of local update iterations at the UAVs, and the number k_2 of aggregation steps at the edge server before the updated result is transferred to the global server. Effect: larger values of k_1 and k_2, i.e., more iteration steps between global aggregations, reduce the communication cost in practice.
• Factor: the non-IID nature of the underlying data distribution. Corresponding algorithm parameter: the percentage of shared data Q. Effect: edge servers act as aggregators that can independently tune the size of the shared dataset according to the data distribution of the UAVs assigned to them.

Complexity Analysis

The overall communication time complexity of each round of the algorithm is O(CN·t_e + L·t_c), where t_e denotes the UAV-to-edge communication time and t_c the edge-to-cloud communication time.
Since the edge server also acts as a base station relay between the UAVs and the central server, the communication time complexity of FedAvg is O(CN·(t_e + t_c)).
Since the number of active UAVs CN is larger than the number of edge servers L, the algorithm incurs lower communication complexity than FedAvg.
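The comparison can be checked numerically with illustrative (assumed) unit timing values for t_e and t_c:

```python
def hier_comm_time(C, N, L, t_e, t_c):
    """Per-round communication time of the hierarchical scheme:
    C*N UAV-to-edge transfers plus L edge-to-cloud transfers."""
    return C * N * t_e + L * t_c

def fedavg_comm_time(C, N, t_e, t_c):
    """FedAvg relays every participating UAV's update through the edge
    to the cloud, costing t_e + t_c per UAV."""
    return C * N * (t_e + t_c)

# Hypothetical values: 360 UAVs, 10% participation, 18 edge servers, unit times
C, N, L, t_e, t_c = 0.1, 360, 18, 1.0, 1.0
assert hier_comm_time(C, N, L, t_e, t_c) < fedavg_comm_time(C, N, t_e, t_c)
```

With these numbers, 36 UAV-to-edge transfers plus 18 edge-to-cloud transfers beat FedAvg's 36 full relays, exactly because CN > L.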

Numerical Results

Consider two scenarios with different degrees of non-iid data distribution.

Scenario 1:
Setup: the widely used MNIST dataset serves as the private datasets P_i at the UAVs and as the public shared dataset Q at the edge servers. To model an extreme non-IID distribution, 100 UAVs and 10 edge servers are used; each UAV is given data samples of only one class, and each edge server is assigned 10 UAVs covering a total of 2 different classes.
Purpose: captures label distribution skew, i.e., the case where each UAV has data samples of only one class and each edge server is assigned UAVs with the same labels.
Network: a convolutional neural network (CNN) with four layers: two convolutional layers with 10 and 20 filters respectively (kernel size 5), followed by two fully connected layers with 50 and 10 units.

Scenario 2:
Setup: the Federated Extended MNIST (FEMNIST) dataset, which contains 52 handwritten uppercase and lowercase letters in addition to 10 digits; the dataset is split by the writers of the characters, giving an unbalanced number of samples per UAV.
Purpose: study the effect of feature distribution skew on FL, where P_i differs among the UAVs.
Network: a similar CNN with two convolutional layers of 32 and 64 filters (kernel size 5) and two fully connected layers with 1024 and 62 units.

In the neural network model, each UAV updates its local model using stochastic gradient descent, with a batch size of 32 and a learning rate of 0.01 that is exponentially decayed by a factor of 0.995 after each global aggregation step.
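The stated schedule corresponds to a simple exponential decay of the learning rate per global aggregation round; a minimal sketch:

```python
def learning_rate(round_idx, base_lr=0.01, decay=0.995):
    """Learning rate after round_idx global aggregations (exponential decay)."""
    return base_lr * decay ** round_idx

lr_start = learning_rate(0)    # initial rate, 0.01
lr_late = learning_rate(100)   # noticeably smaller after 100 rounds
```

In a framework such as PyTorch this is what an exponential LR scheduler with gamma = 0.995 computes, stepped once per global aggregation.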

In total, 360 UAVs were randomly assigned to 18 edge servers. In Scenario 1 and Scenario 2, 5% of the dataset is selected as the shared dataset for edge servers.
The dynamic nature of UAV networks can cause some devices to become system bottlenecks (i.e., straggler effects).

Finally, experiments with a very low participation value of C = 0.008 demonstrate the robustness of the algorithm to high dropout rates or low participation caused by straggler effects.

Experimental metrics and results
. . . . . .

Conclusion

A hierarchical FL algorithm for edge-assisted UAV networks is proposed that utilizes shared data on edge servers.
The results show that the hierarchical FL algorithm outperforms existing FL algorithms in two real-world scenarios with non-IID data and a large number of edge servers.
This holds especially when UAVs with identically labeled data samples are assigned to each edge server, a case in which existing FL algorithms cannot reach the desired level of accuracy.

Literature source

Tursunboev, Jamshid, et al. "Hierarchical Federated Learning for Edge-Aided Unmanned Aerial Vehicle Networks." Applied Sciences, 2022, 12(2): 670.

Origin blog.csdn.net/m0_51928767/article/details/126001218