Paper analysis: Membership Inference Attacks Against Machine Learning Models (understand at a glance, super detailed version)

Abstract: This paper explores how machine learning models can leak information about their training set, focusing on the basic membership inference attack: given a machine learning model and a record, determine whether that record was used as part of the model's training set.
We conduct empirical evaluations on classification models trained by "machine learning as a service" providers such as Google and Amazon. Using realistic data sets and classification tasks, including a hospital discharge data set whose membership is sensitive from a privacy perspective, we show that these models can be vulnerable to membership inference attacks. We then investigate the factors that influence this leakage and evaluate mitigation strategies.

Main problem

We study this problem in the most difficult setting, where the adversary's queries to the model are limited to supplying an input and receiving the model's output (the black-box setting). With the training set and the structure of the model unknown, the biggest problem is how to train the attack model, so this paper proposes shadow models, through which the attack model is trained.

Because the role of the shadow models can be confusing on a first read, I first summarize the paper's overall training idea and experimental process to make it easier to follow.

Summary of paper ideas

1. Train the shadow models on constructed training data so that they function similarly to the target model (the more shadow models, the more accurate the attack model).
2. Query the shadow models and use their outputs to train the attack model, which learns to distinguish whether a record was in a shadow model's training set (a code sketch of this pipeline follows the flow chart below).
3. Finally, apply the attack model to the target model's outputs to infer whether a given record was in the target model's training set.

Simple flow chart of the experiment process (figure in the original post)
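To make the pipeline concrete, here is a minimal sketch of the shadow-training idea in Python. It assumes scikit-learn MLPs as stand-ins for the black-box target and shadow models and numpy arrays of binary-feature records with integer class labels; these choices are illustrative assumptions, not the paper's exact setup. The function builds the attack model's training set from several shadow models.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def build_attack_dataset(shadow_data, shadow_labels, n_shadow=10, rng=None):
    """Train several shadow models; label their prediction vectors as
    'in' (member) or 'out' (non-member) to form the attack training set."""
    rng = rng or np.random.default_rng(0)
    attack_X, attack_y, attack_membership = [], [], []
    n = len(shadow_data)
    for _ in range(n_shadow):
        idx = rng.permutation(n)
        train_idx, test_idx = idx[: n // 2], idx[n // 2:]   # disjoint in/out halves
        shadow = MLPClassifier(hidden_layer_sizes=(128,), max_iter=200)
        shadow.fit(shadow_data[train_idx], shadow_labels[train_idx])
        for split_idx, member in ((train_idx, 1), (test_idx, 0)):
            probs = shadow.predict_proba(shadow_data[split_idx])  # prediction vectors
            attack_X.append(probs)
            attack_y.append(shadow_labels[split_idx])
            attack_membership.append(np.full(len(split_idx), member))
    return (np.vstack(attack_X), np.concatenate(attack_y),
            np.concatenate(attack_membership))
```

Each shadow model's training half provides "member" examples and its held-out half provides "non-member" examples, so the attack model later learns how prediction vectors differ between the two cases.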

Membership inference attack

**Inference basis:** Machine learning models behave differently on the data they were trained on than on data they "see" for the first time.

Membership inference attack in the black-box setting: the attacker queries the target model with a data record and obtains the model's prediction for that record. The prediction is a vector of probabilities, one per class, that the record belongs to each class. This prediction vector, together with the label of the target record, is passed to the attack model, which infers whether the record is in the target model's training data set.

Generating the training data set for the shadow models

To train the shadow models, the attacker needs training data similar to the target model's training data. The paper develops several methods to generate such data.

1. Model-based synthesis

**Basic idea:** Intuitively, records that the target model classifies with high confidence should be statistically similar to the target's training data and therefore provide good material for the shadow models.

The synthesis process has two phases: (1) use a hill-climbing algorithm to search the space of possible data records and find inputs that the target model classifies with high confidence; (2) sample synthetic data from these records. Once a record has been synthesized, the attacker repeats the process until the shadow models' training data set is full.

(Figure: the data synthesis process, see the original post)
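Below is a minimal sketch of this hill-climbing synthesis, assuming binary features and a black-box `query_target(x)` that returns the target model's vector of class probabilities. The parameter names and default values (`k_max`, `k_min`, `conf_min`, `rej_max`) are illustrative stand-ins for the paper's search parameters, not its exact settings.

```python
import random

def synthesize_record(query_target, n_features, target_class,
                      k_max=64, k_min=4, conf_min=0.8,
                      rej_max=10, max_iters=1000):
    """Hill-climbing search for a record the target model classifies as
    `target_class` with high confidence; returns it as a synthetic record."""
    x = [random.randint(0, 1) for _ in range(n_features)]   # random starting record
    x_best, y_best = x[:], 0.0
    k, rejections = k_max, 0

    for _ in range(max_iters):
        y = query_target(x)                   # vector of class probabilities
        y_c = y[target_class]
        if y_c >= y_best:
            # Candidate does not decrease confidence for the target class
            if (y_c > conf_min
                    and target_class == max(range(len(y)), key=lambda j: y[j])
                    and random.random() < y_c):
                return x                       # output as a synthetic record
            x_best, y_best = x[:], y_c
            rejections = 0
        else:
            rejections += 1
            if rejections > rej_max:           # too many rejections: shrink the step size
                k = max(k_min, k // 2)
                rejections = 0
        # Propose the next candidate by flipping k random features of the best record
        x = x_best[:]
        for i in random.sample(range(n_features), k):
            x[i] = 1 - x[i]
    return None                                # gave up within the query budget
```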
2. Statistics-based synthesis

The attacker may have some statistical information about the population from which the target model's training data was drawn. For example, the attacker may know the marginal distributions of the different features in advance. In the experiments, synthetic training records for the shadow models are generated by sampling independently from the marginal distribution of each feature. The resulting attack model is very effective.
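A minimal sketch of this statistics-based synthesis, assuming binary features whose marginals the attacker can estimate (here from an array `real_like_data` standing in for whatever population statistics are available):

```python
import numpy as np

def sample_from_marginals(real_like_data, n_samples, rng=None):
    """Sample each binary feature independently from its estimated marginal."""
    rng = rng or np.random.default_rng()
    marginals = real_like_data.mean(axis=0)          # P(feature = 1) per column
    return (rng.random((n_samples, marginals.shape[0])) < marginals).astype(int)
```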

3. Noisy real data

The attacker may have access to data that is similar to the target model's training data but can be regarded as a "noisy" version of it. In the experiments on the locations data set, this is simulated by flipping 10% or 20% of randomly selected feature values and then training the shadow models on the resulting noisy data set. This scenario models the case where the training data of the target and shadow models are not sampled from the same population, or are sampled in different ways.
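A minimal sketch of this "noisy real data" setting, assuming the records form a binary numpy array; the flip fraction corresponds to the 10% or 20% used in the paper:

```python
import numpy as np

def add_feature_noise(records, flip_fraction=0.1, rng=None):
    """Flip a fraction of randomly chosen binary feature values in each record."""
    rng = rng or np.random.default_rng()
    noisy = records.copy()
    n_records, n_features = noisy.shape
    n_flip = int(flip_fraction * n_features)
    for row in noisy:
        idx = rng.choice(n_features, size=n_flip, replace=False)
        row[idx] = 1 - row[idx]              # flip the selected binary features
    return noisy
```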

Shadow model training

(Figure: shadow model training, see the original post)

**Basic idea:** The main idea behind the shadow training technique is that similar models trained on relatively similar data records using the same service behave in similar ways.

For each output class of the target model, a separate attack model is trained; the more shadow models there are, the more accurately the attack models can be trained.

Training of the attack model

(Figure: training of the attack model, see the original post)

The key to the experiment is how to train the attack model, and this is why the paper proposes the shadow model technique. The shadow models have properties similar to the target model's: they are trained on constructed data sets, and the trained shadow models are then used to generate the training data for the attack model. (The specific process is shown in the figure above.)
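Continuing the earlier pipeline sketch, the snippet below trains one binary membership classifier per output class on the shadow models' prediction vectors and then applies it to the target model's output. The choice of a random forest here is an illustrative assumption, not the paper's exact attack-model architecture.

```python
from sklearn.ensemble import RandomForestClassifier

def train_attack_models(attack_X, attack_y, attack_membership):
    """Train one binary membership classifier per output class,
    using the prediction vectors produced by the shadow models."""
    attack_models = {}
    for c in set(attack_y):
        mask = attack_y == c
        clf = RandomForestClassifier(n_estimators=100)
        clf.fit(attack_X[mask], attack_membership[mask])
        attack_models[c] = clf
    return attack_models

def infer_membership(attack_models, target_prediction_vector, true_label):
    """Apply the attack: feed the target model's prediction vector to the
    attack model that corresponds to the record's label."""
    return attack_models[true_label].predict([target_prediction_vector])[0]
```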

Experimental part

Data sets

CIFAR: CIFAR-10 and CIFAR-100 are benchmark data sets used to evaluate image recognition algorithms. CIFAR-10 contains 32x32 color images in 10 categories, 6,000 images per category, for a total of 50,000 training images and 10,000 test images. CIFAR-100 has 100 categories with 600 images each, 500 for training and 100 for testing. In the experiments, the training set size is set to 2,500, 5,000, 10,000 and 15,000 for CIFAR-10, and to 4,600, 10,520, 19,920 and 29,540 for CIFAR-100.

Purchases: Shopping histories of several thousand people, provided by Kaggle. Each user's record covers their transactions over one year, including product name, shop, quantity and date. The paper uses a simplified version: 19,324 records, each consisting of 600 binary features, where each bit indicates whether a product was purchased. The records are clustered into multiple classes, each representing a shopping style. The experimental task is, given a 600-bit vector, to predict its shopping class. The total number of classes is set to 2, 10, 20, 50 and 100 in different experiments. In each experiment, 10,000 records are randomly selected as the target model's training data, and the rest are used as test data and as training data for the shadow models.

Locations: Foursquare check-in data, covering 11,592 users and 119,744 locations, for a total of 1,136,481 check-ins. After processing, each record consists of 446 binary features, and the data is clustered into 30 classes. The task is to predict the class of a given vector. 1,600 records are randomly selected as the target model's training data, and the rest are used as test data and as training data for the shadow models.

**Texas hospital stays:** Each record contains four main groups of attributes: the external cause of injury, the diagnosis, the procedures the patient underwent, and some generic information (such as gender, age and race).
The data used in the experiments comprises 67,330 records, each consisting of 6,170 binary features. 10,000 records are randomly selected as the target model's training data.

MNIST: 70,000 handwritten digit images, formatted as 32x32 in the paper; 10,000 records are randomly selected as the target model's training data.

UCI Adult (Census Income): 48,842 census records, each consisting of 14 attributes such as age, gender, education level and working hours. The task is to predict whether a person's annual income exceeds $50K. 10,000 records are randomly selected as the target model's training data.

Target model

Google Prediction API: There are no user-adjustable parameters; the user uploads a data set and then queries the trained model through the API.

Amazon ML: The model type cannot be selected, but some parameters can be adjusted. In the experiments, the maximum number of passes over the training data and the amount of L2 regularization are varied.

Neural networks: built locally using Torch7 and its nn package.

On the CIFAR data sets, a CNN with two convolution and max-pooling layers, a fully connected layer of size 128 and a softmax layer is trained. The activation function is Tanh, the learning rate is 0.001, the learning rate decay is 1e-07, and the maximum number of training epochs is 100.

On the purchase data set, a fully connected neural network with one hidden layer of size 128 and a softmax layer is trained. The activation function is Tanh, the learning rate is 0.001, the learning rate decay is 1e-07, and the maximum number of training epochs is 200.
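For reference, here is a hedged PyTorch re-creation of the described CIFAR architecture (the paper used Torch7, so this is only an approximation). The convolution filter sizes and channel counts are assumptions, since the text only fixes the layer types, the 128-unit fully connected layer, the Tanh activations, the learning rate of 0.001, and the decay of 1e-07.

```python
import torch
import torch.nn as nn

class TargetCNN(nn.Module):
    """Approximate version of the described CIFAR target model: two
    convolution + max-pooling layers, a 128-unit fully connected layer,
    Tanh activations, and a softmax output (applied via CrossEntropyLoss)."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, padding=2), nn.Tanh(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=5, padding=2), nn.Tanh(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, 128), nn.Tanh(),   # 32x32 input -> 8x8 after two poolings
            nn.Linear(128, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = TargetCNN()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
# Torch7-style learning-rate decay of 1e-07 approximated with a multiplicative schedule
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lambda t: 1.0 / (1.0 + 1e-07 * t))
```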

Experimental setup

The training sets and test sets of the target and shadow models are randomly selected from the data set, have the same size, and do not overlap (the data sets of different shadow models may overlap with one another).

The CIFAR data sets are used only with locally trained neural networks.

The purchase data set is used to train a local neural network as well as models on Google's and Amazon's machine learning services (the same training set is used for all three target models).

The other data sets are used only with Google's or Amazon's machine learning services.

Number of shadow models: 100 for CIFAR, 20 for purchase, 10 for Texas hospital stay, 60 for location, 50 for MNIST, and 20 for Adult.
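A minimal sketch of the data partitioning described above, assuming the records and labels are numpy arrays; the function and variable names are illustrative:

```python
import numpy as np

def disjoint_splits(data, train_size, rng=None):
    """The target model's training and test sets are equal in size and
    disjoint; the shadow models draw from the remaining pool, and their
    individual data sets may overlap with one another."""
    rng = rng or np.random.default_rng(0)
    idx = rng.permutation(len(data))
    target_train = idx[:train_size]
    target_test = idx[train_size: 2 * train_size]
    shadow_pool = idx[2 * train_size:]          # shadow models sample from this pool
    return target_train, target_test, shadow_pool
```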

Attack accuracy evaluation

The attacker's goal is to determine whether a given record is part of the target model's training data set.
The attack is evaluated on records randomly drawn from the target model's training and test data sets. In the evaluation, sets of equal size are used (i.e., the numbers of members and non-members are equal) to maximize the uncertainty of inference, so the baseline accuracy is 0.5.

Precision and recall are used to evaluate the attack. Most metrics are reported per class, because the attack's accuracy can vary greatly from class to class.
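A minimal sketch of this evaluation protocol, reusing the per-class attack models from the earlier sketch and assuming the target model exposes `predict_proba`; precision and recall are computed per class on a balanced set of members and non-members:

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

def evaluate_attack(attack_models, target_model, train_data, train_labels,
                    test_data, test_labels):
    """Evaluate with equal numbers of members and non-members
    (baseline accuracy 0.5); report precision/recall per class."""
    n = min(len(train_data), len(test_data))        # balance members and non-members
    records = np.vstack([train_data[:n], test_data[:n]])
    labels = np.concatenate([train_labels[:n], test_labels[:n]])
    truth = np.concatenate([np.ones(n), np.zeros(n)])   # 1 = member, 0 = non-member
    preds = np.array([
        attack_models[y].predict([target_model.predict_proba([x])[0]])[0]
        for x, y in zip(records, labels)
    ])
    results = {}
    for c in np.unique(labels):
        mask = labels == c
        results[c] = (precision_score(truth[mask], preds[mask]),
                      recall_score(truth[mask], preds[mask]))
    return results
```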

Experimental results

See the original paper for the detailed experimental results and parameter analysis. The influencing factors include the shadow models' training data, the number of classes and the amount of training data per class, and the degree of overfitting. The original analysis is very thorough and can be consulted in the paper.

Summary evaluation

The main novelty of this paper is the proposal and training of shadow models, which allow the attack model to be trained even when the target model's training data and architecture are unknown, so that the target model can ultimately be attacked.

I am a beginner and my knowledge is limited; if anything is unclear or wrong, feel free to discuss it, and thank you for your understanding.

Origin blog.csdn.net/ZXISABOY/article/details/109307172