Hyperspectral Image Classification in the Presence of Noisy Labels

What sets this paper apart is that it addresses the noisy label problem, a different angle from other papers. (These notes are purely for study.)

Abstract

Label information plays an important role in a supervised hyperspectral image classification problem. However, current classification methods all ignore an important and inevitable problem: labels may be corrupted, and collecting clean labels for training samples is difficult and often impractical. Therefore, how to learn from a database with noisy labels is a problem of great practical importance. In this paper, we study the influence of label noise on hyperspectral image classification and develop a random label propagation algorithm (RLPA) to cleanse the label noise. The key idea of RLPA is to exploit knowledge (e.g., the superpixel-based spectral–spatial constraints) from the observed hyperspectral images and apply it to the process of label propagation. Specifically, the RLPA first constructs a spectral–spatial probability transform matrix (SSPTM) that simultaneously considers the spectral similarity and superpixel-based spatial information. It then randomly chooses some training samples as “clean” samples and sets the rest as unlabeled samples, and propagates the label information from the “clean” samples to the remaining unlabeled samples with the SSPTM. By repeating the random assignment (of “clean” labeled samples and unlabeled samples) and propagation, we can obtain multiple labels for each training sample. Therefore, the final propagated label can be calculated by a majority vote algorithm. Experimental studies show that the RLPA can reduce the level of label noise and demonstrate the advantages of our proposed method over four major classifiers by a significant margin: the gains in terms of the average overall accuracy, average accuracy, and kappa are impressive, e.g., 9.18%, 9.58%, and 0.1043. The MATLAB source code is available at https://github.com/junjun-jiang/RLPA.
Index Terms— Hyperspectral image classification, label propagation, noisy label, superpixel segmentation.

INTRODUCTION

In this paper, we propose to exploit the spectral–spatial constraint-based knowledge to guide the cleansing of noisy labels under the label propagation framework. In particular, we develop a random label propagation algorithm (RLPA).
As shown in Fig. 1, it includes two steps: 1) spectral–spatial probability transform matrix (SSPTM) generation and 2) random label propagation. In the first step, considering that spatial information is very important for the similarity measurement of different pixels [9], [10], [35], [36], we propose a novel affinity graph construction method, which simultaneously considers the spectral similarity and the superpixel segmentation-based spatial constraint. The SSPTM can be generated through the constructed affinity graph. In the second step, we randomly divide the training database into a labeled subset (with “clean” labels) and an unlabeled subset (without labels) and then perform the label propagation procedure on the affinity graph to propagate the label information from the labeled subset to the unlabeled subset. Since the process of random assignment (of clean labeled samples and unlabeled samples) and propagation can be executed multiple times, the unlabeled subset will receive multiple propagated labels. By fusing the multiple labels from many label propagation steps with a majority vote algorithm (MVA), the label information can be expected to be cleansed. The philosophy behind this is that the samples with real labels dominate all training classes, and we can gradually propagate the clean label information to the entire data set by random splitting and propagation. The proposed method is tested on three real hyperspectral image databases, namely, the Indian Pines, University of Pavia, and Salinas Scene, and compared with some existing approaches using overall accuracy (OA), average accuracy (AA), and the kappa metrics. It is shown that the proposed method outperforms these methods in terms of objective metrics and visual classification maps.
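The random split, propagate, and vote loop described above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the authors' MATLAB code: the update rule `F <- alpha * T F + (1 - alpha) * Y` is a commonly used label propagation iteration assumed here, and the names `propagate`, `rlpa_sketch`, `alpha`, `clean_frac`, and `rounds` are hypothetical. The transition matrix `T` is taken as given (any row-normalized matrix standing in for the SSPTM), and every sample casts a vote in every round, which is a simplification of this sketch.

```python
import random
from collections import Counter


def propagate(T, Y, alpha=0.9, iters=30):
    """Iterate F <- alpha * T F + (1 - alpha) * Y (assumed update rule)."""
    n, c = len(Y), len(Y[0])
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [[alpha * sum(T[i][k] * F[k][j] for k in range(n))
              + (1 - alpha) * Y[i][j]
              for j in range(c)]
             for i in range(n)]
    return F


def rlpa_sketch(T, noisy_labels, n_classes, clean_frac=0.7, rounds=25, seed=0):
    """Repeat random clean/unlabeled splits, propagate, then majority-vote."""
    rng = random.Random(seed)
    n = len(noisy_labels)
    n_clean = max(1, round(clean_frac * n))
    votes = [Counter() for _ in range(n)]
    for _ in range(rounds):
        # Randomly treat a subset as "clean"; the rest become unlabeled.
        clean = set(rng.sample(range(n), n_clean))
        # One-hot rows for "clean" samples, all-zero rows for unlabeled ones.
        Y = [[1.0 if (i in clean and noisy_labels[i] == j) else 0.0
              for j in range(n_classes)]
             for i in range(n)]
        F = propagate(T, Y)
        for i in range(n):
            best = max(range(n_classes), key=lambda j: F[i][j])
            if F[i][best] > 0:  # skip samples that no label reached
                votes[i][best] += 1
    # Fall back to the original label if a sample never received a vote.
    return [v.most_common(1)[0][0] if v else noisy_labels[i]
            for i, v in enumerate(votes)]


# Toy data: two groups of five samples, one flipped label in each group.
T_demo = [[0.2 if i // 5 == j // 5 else 0.0 for j in range(10)]
          for i in range(10)]
noisy_demo = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
cleaned = rlpa_sketch(T_demo, noisy_demo, n_classes=2)
```

In the toy data the correctly labeled samples outnumber the flipped one in each group, which mirrors the assumption stated above that samples with real labels dominate each class.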
The main contributions of this paper can be summarized as follows.

  1. We provide an effective solution for hyperspectral image classification in the presence of noisy labels. It is very general and can be seamlessly applied to the current classifiers.
  2. By exploiting the hyperspectral image prior, i.e., the superpixel-based spectral–spatial constraints, we propose a novel probability transfer matrix generation method, which ensures that label information propagates among samples of the same class and prevents label propagation between samples of different classes.
  3. The proposed RLPA method is very effective in cleansing the label noise. Used as a preprocessing step, RLPA can greatly improve the performance of the original classifiers, especially when the label noise level is very large.
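To make the second contribution more concrete, here is one way a superpixel-constrained transition matrix could be built. It is only a sketch under stated assumptions: the excerpt does not give the exact formula, so a Gaussian kernel on spectral distance is assumed, affinities between samples in different superpixels are set to zero, and rows are normalized to sum to one. The function name `ssptm_sketch`, the bandwidth `sigma`, and the toy data are hypothetical.

```python
import math


def ssptm_sketch(spectra, superpixel_ids, sigma=1.0):
    """Row-normalized affinity restricted to pairs in the same superpixel."""
    n = len(spectra)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Spatial constraint: only samples sharing a superpixel connect.
            if i != j and superpixel_ids[i] == superpixel_ids[j]:
                d2 = sum((a - b) ** 2 for a, b in zip(spectra[i], spectra[j]))
                # Spectral similarity: assumed Gaussian kernel.
                W[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    T = []
    for row in W:
        s = sum(row)
        # A sample alone in its superpixel keeps an all-zero row.
        T.append([w / s for w in row] if s > 0 else row[:])
    return T


# Toy data: five 2-band "pixels" spread over two superpixels.
spectra_demo = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [5.0, 5.0], [5.1, 5.0]]
ids_demo = [0, 0, 0, 1, 1]
T_ssptm = ssptm_sketch(spectra_demo, ids_demo)
```

Zeroing cross-superpixel entries is what keeps labels from leaking between regions that are likely to belong to different classes, while the kernel weights favor spectrally similar neighbors within a region.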

EXPERIMENTS

To demonstrate the effectiveness of the proposed method, we test our proposed framework with four widely used classifiers in the field of hyperspectral image classification, which are NN [51], SVM [16], RF [14], [15], and ELM [19], [20]. Since there is no specific noisy label classification algorithm for hyperspectral images, we carefully design and adjust some label noise-robust general classification methods to fit our framework. In particular, the four comparison methods used in our experiments are the following.
1) Noisy Label-Based Algorithm: We directly use the training samples and their corresponding noisy labels to train the classification models using the abovementioned four classifiers.
2) Bagging-Based Classification (Bagging) [52]: The approach of [52] first produces different training subsets by resampling (70% of training samples are selected each time) and then fuses the classification results of different training subsets.
3) Isolation Forest (iForest) [53]: This is an anomaly detection algorithm, and we apply it to detect the noisy label samples. In particular, in the training phase, it constructs many isolation trees using subsamples of the given training samples. In the evaluation phase, the isolation trees can be used to calculate the score for each sample to determine the anomaly points. Finally, samples whose anomaly scores exceed the predefined threshold are removed.
4) RLPA: The proposed random label propagation-based label noise cleansing method operates by repeating the random assignment and label propagation and fusing the label information from different iterations.
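As an illustration of the Bagging baseline in item 2, the sketch below trains a simple 1-nearest-neighbour classifier on repeated 70% subsets and fuses the predictions. Several details are assumptions not confirmed by the excerpt: subsets are drawn without replacement, fusion is done by majority vote, and the names `nn_predict`, `bagging_predict`, and `n_models` are hypothetical.

```python
import random
from collections import Counter


def nn_predict(train_X, train_y, x):
    """1-nearest-neighbour prediction with squared Euclidean distance."""
    best = min(range(len(train_X)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best]


def bagging_predict(train_X, train_y, test_X, n_models=15, frac=0.7, seed=0):
    """Train on repeated random subsets and fuse predictions by voting."""
    rng = random.Random(seed)
    n = len(train_X)
    k = max(1, round(frac * n))
    votes = [Counter() for _ in test_X]
    for _ in range(n_models):
        # Assumed: each subset holds 70% of the samples, without replacement.
        idx = rng.sample(range(n), k)
        sub_X = [train_X[i] for i in idx]
        sub_y = [train_y[i] for i in idx]
        for t, x in enumerate(test_X):
            votes[t][nn_predict(sub_X, sub_y, x)] += 1
    # Assumed fusion rule: majority vote over the subset classifiers.
    return [v.most_common(1)[0][0] for v in votes]


# Toy data: two well-separated 1-D classes.
train_X_demo = [[0.0], [0.5], [1.0], [1.5], [2.0],
                [8.0], [8.5], [9.0], [9.5], [10.0]]
train_y_demo = [0] * 5 + [1] * 5
bagged = bagging_predict(train_X_demo, train_y_demo, [[0.7], [9.2]])
```

The intuition is that a mislabeled sample is absent from some subsets, so its influence on the fused prediction is diluted rather than removed outright.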
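The experiments section is extensive, so only one experimental result is pasted here.
[Figures: experimental results; images not available]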
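The results in the paper are reported as OA, AA, and kappa. For readers unfamiliar with these metrics, the following sketch computes them from true and predicted labels using the standard confusion-matrix definitions. The function name `classification_metrics` and the toy labels are hypothetical and unrelated to the paper's data.

```python
def classification_metrics(y_true, y_pred, n_classes):
    """Return (overall accuracy, average accuracy, Cohen's kappa)."""
    n = len(y_true)
    # cm[t][p] counts samples of true class t predicted as class p.
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    # OA: fraction of all samples classified correctly.
    oa = sum(cm[c][c] for c in range(n_classes)) / n
    # AA: mean of per-class accuracies, skipping classes with no samples.
    per_class = [cm[c][c] / sum(cm[c])
                 for c in range(n_classes) if sum(cm[c]) > 0]
    aa = sum(per_class) / len(per_class)
    # Expected chance agreement from row and column marginals.
    pe = sum(sum(cm[c]) * sum(row[c] for row in cm)
             for c in range(n_classes)) / (n * n)
    # Kappa is undefined when pe == 1; this sketch returns 0.0 in that case.
    kappa = (oa - pe) / (1 - pe) if pe < 1 else 0.0
    return oa, aa, kappa


# Toy example: four samples of class 0 and two of class 1.
oa_demo, aa_demo, kappa_demo = classification_metrics(
    [0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 1], n_classes=2)
```

OA weights every sample equally, AA weights every class equally, and kappa discounts the agreement expected by chance, which is why the three are usually reported together for imbalanced hyperspectral scenes.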

Reposted from blog.csdn.net/weixin_44790486/article/details/89207103