PyTorch Deep Learning in Action (13) - Visualizing the Output of Intermediate Layers of a Neural Network

0. Preface

With the rapid development of deep learning, neural networks have become an important tool for solving complex tasks. However, their black-box nature keeps us from understanding their inner workings and learned representations. To better understand how neural networks work, researchers have proposed various visualization methods for exploring the outputs of intermediate layers. Feature learning is one of the most critical tasks of a neural network: through layer-by-layer transformation, it extracts high-level, abstract feature representations from raw data that capture important information. However, these intermediate outputs are high-dimensional, abstract vectors that are hard for humans to interpret directly.
By visualizing the results of feature learning, we can intuitively observe how the network transforms data as it processes it. Using visualization methods, we can explore the outputs of intermediate layers and understand how the network encodes and transforms the input. By observing feature maps, gradient distributions, and dimensionality-reduction visualizations, we can reveal useful patterns the network has learned, such as edge detection and color distribution. In this section, we explore what a neural network has actually learned by using a convolutional neural network (CNN, Convolutional Neural Network) to classify a dataset containing X and O images, and examining the activations output by its layers.

1. Visualize the results of feature learning

Visualizing the results of feature learning has many applications. First, visualization helps us evaluate and adjust the design of the neural network: by observing feature maps and gradient distributions, we can judge whether the network has learned effective feature representations, and optimize the network structure and parameter settings accordingly. Second, visualization helps us interpret the network: by observing the outputs of intermediate layers, we can understand how the network responds to different categories or input samples, and explain the basis of its predictions. Finally, by observing the feature representations learned by the network, we can borrow its ideas to design better hand-crafted features or feature-extraction algorithms.

(1) To visualize the results of feature learning, we use a dataset containing X and O images. The dataset can be downloaded from the GitCode link. After the download completes, decompress it; the images in the folder look as follows:

Image file list
The category of an image can be obtained from its filename: the first character of the name specifies the category to which the image belongs (for example, a file whose name starts with x belongs to class X).

(2) Import the required library:

import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader
from torch.optim import SGD, Adam
device = 'cuda' if torch.cuda.is_available() else 'cpu'
import numpy as np, cv2
import matplotlib.pyplot as plt
from glob import glob
from imgaug import augmenters as iaa

(3) Define the dataset class, ensuring that images are resized to 28 x 28 and that the target class is converted into numerical form.

Define the image augmentation method to resize images to 28 x 28:

tfm = iaa.Sequential(iaa.Resize(28))
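
As a quick check, the augmenter resizes inputs of any size to 28 x 28 (a minimal sketch using a hypothetical random array):

dummy = np.random.randint(0, 255, (64, 48), dtype=np.uint8)  # hypothetical 64 x 48 grayscale image
print(tfm.augment_image(dummy).shape)
# (28, 28)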

Define a class that takes a folder path as input and iterates over the files in that path in the __init__ method:

class XO(Dataset):
    def __init__(self, folder):
        self.files = glob(folder)

Define the __len__ method to return the length of the dataset:

    def __len__(self):
        return len(self.files)

Define the __getitem__ method, which takes an index, fetches the file at that index, reads the image, and performs augmentation on it. No custom collate_fn is used here, because the dataset is small and per-sample processing does not significantly affect training time:

    def __getitem__(self, ix):
        f = self.files[ix]
        im = tfm.augment_image(cv2.imread(f)[:,:,0])

Create the channel dimension in front of the 28 x 28 image (so that each image's shape becomes 1 x 28 x 28):

        im = im[None]

Determine the category of each image from the characters between the "/" and "@" characters in the filename:

        cl = f.split('/')[-1].split('@')[0] == 'x'

Finally, return the image and its corresponding category:

        return torch.tensor(1 - im/255).to(device).float(), torch.tensor([cl]).float().to(device)
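
To see the labeling logic in isolation, a quick sketch with hypothetical filenames:

print('images/x@1.png'.split('/')[-1].split('@')[0] == 'x')
# True, i.e., class X
print('images/o@1.png'.split('/')[-1].split('@')[0] == 'x')
# False, i.e., class O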

(4) Display image samples, extracting images and their corresponding classes through the class defined above:

data = XO('images/*')

Plot an image sample from the obtained dataset:

R, C = 7,7
fig, ax = plt.subplots(R, C, figsize=(5,5))
for label_class, plot_row in enumerate(ax):
    for plot_cell in plot_row:
        plot_cell.grid(False); plot_cell.axis('off')
        ix = np.random.choice(len(data))  # sample a random index from the full dataset
        im, label = data[ix]
        plot_cell.imshow(im[0].cpu(), cmap='gray')
plt.tight_layout()
plt.show()

Sample visualization

(5) Define the model architecture, loss function and optimizer:

def get_model():
    model = nn.Sequential(
        nn.Conv2d(1, 64, kernel_size=3),
        nn.MaxPool2d(2),
        nn.ReLU(),
        nn.Conv2d(64, 128, kernel_size=3),
        nn.MaxPool2d(2),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(3200, 256),
        nn.ReLU(),
        nn.Linear(256, 1),
        nn.Sigmoid()
    ).to(device)

    loss_fn = nn.BCELoss()
    optimizer = Adam(model.parameters(), lr=1e-3)
    return model, loss_fn, optimizer

Since this is a binary classification problem, the binary cross-entropy loss (nn.BCELoss()) is used. Note that the second convolutional block outputs 128 feature maps of shape 5 x 5, so the Flatten layer produces 128 x 5 x 5 = 3200 values, which is why the first linear layer is nn.Linear(3200, 256). Print the model summary:

from torchsummary import summary
model, loss_fn, optimizer = get_model()
summary(model, input_size=(1,28,28))
"""
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 64, 26, 26]             640
         MaxPool2d-2           [-1, 64, 13, 13]               0
              ReLU-3           [-1, 64, 13, 13]               0
            Conv2d-4          [-1, 128, 11, 11]          73,856
         MaxPool2d-5            [-1, 128, 5, 5]               0
              ReLU-6            [-1, 128, 5, 5]               0
           Flatten-7                 [-1, 3200]               0
            Linear-8                  [-1, 256]         819,456
              ReLU-9                  [-1, 256]               0
           Linear-10                    [-1, 1]             257
          Sigmoid-11                    [-1, 1]               0
================================================================
Total params: 894,209
Trainable params: 894,209
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.69
Params size (MB): 3.41
Estimated Total Size (MB): 4.10
"""

(6) Define a function for batch training that takes images and their classes as input and returns the loss after performing backpropagation on the given batch, together with helper functions to compute accuracy and validation loss:

def train_batch(x, y, model, optimizer, loss_fn):
    prediction = model(x)
    batch_loss = loss_fn(prediction, y)
    batch_loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return batch_loss.item()

def accuracy(x, y, model):
    with torch.no_grad():
        prediction = model(x)
    # threshold the sigmoid output at 0.5 to obtain the predicted class
    is_correct = (prediction > 0.5).float() == y
    return is_correct.cpu().numpy().tolist()

@torch.no_grad()
def val_loss(x, y, model, loss_fn):
    prediction = model(x)
    val_loss = loss_fn(prediction, y)
    return val_loss.item()

(7) Define the DataLoader, whose input is an instance of the Dataset class defined above:

trn_dl = DataLoader(data, batch_size=32, drop_last=True)

(8) Initialize and train the model:

model, loss_fn, optimizer = get_model()

for epoch in range(10):
    for ix, batch in enumerate(iter(trn_dl)):
        x, y = batch
        batch_loss = train_batch(x, y, model, optimizer, loss_fn)
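
After training, the accuracy helper defined earlier can be used to check how well the model fits the training data (a minimal sketch; this section does not use a separate validation split):

correct = []
for x, y in trn_dl:
    correct += accuracy(x, y, model)
print(f'training accuracy: {np.mean(correct):.4f}')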

(9) Fetch an image to examine what the filters have learned from it:

im, c = trn_dl.dataset[2]
plt.imshow(im[0].cpu())
plt.show()

Image used to inspect what the filters have learned

2. Visualize the output of the first convolutional layer

(1) Pass the image through the trained model and get the output of the first layer, storing it in the intermediate_output variable. First, inspect the model's layers:

print(list(model.children()))
# [Conv2d(1, 64, kernel_size=(3, 3), stride=(1, 1)), MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False), ReLU(), Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1)), MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False), ReLU(), Flatten(start_dim=1, end_dim=-1), Linear(in_features=3200, out_features=256, bias=True), ReLU(), Linear(in_features=256, out_features=1, bias=True), Sigmoid()]

first_layer = nn.Sequential(*list(model.children())[:1])
intermediate_output = first_layer(im[None])[0].detach()
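
An alternative way to capture the same activation without rebuilding a sub-network is to register a forward hook on the first module (a sketch; store_activation and the activations dictionary are illustrative names):

activations = {}
def store_activation(module, inputs, output):
    # save the module's output when the forward pass runs
    activations['conv1'] = output.detach()

handle = model[0].register_forward_hook(store_activation)
model(im[None])                     # the forward pass triggers the hook
handle.remove()
print(activations['conv1'].shape)
# torch.Size([1, 64, 26, 26])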

(2) Plot the outputs of the 64 filters; each element of intermediate_output corresponds to the convolution output of one filter:

n = 8
fig, ax = plt.subplots(n, n, figsize=(10,10))
for ix, axis in enumerate(ax.flat):
    axis.set_title('Filter: '+str(ix))
    axis.imshow(intermediate_output[ix].cpu())
plt.tight_layout()
plt.show()

Convolution output
In the above output, it can be seen that some filters, such as filters 0, 4, 6, and 7, have learned the edges present in the image.
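
The activations show what each filter responds to; the learned weights can also be viewed directly. A short sketch (assuming, as above, that the first module of model is the Conv2d layer) plots its 64 kernels of shape 3 x 3:

weights = model[0].weight.detach().cpu()   # shape: [64, 1, 3, 3]
fig, ax = plt.subplots(8, 8, figsize=(8, 8))
for ix, axis in enumerate(ax.flat):
    axis.imshow(weights[ix, 0], cmap='gray')
    axis.axis('off')
plt.show()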

(3) Feed multiple O images through the network and use filter 4 to perform convolution, observing the output.

Get multiple O images from the dataset:

x, y = next(iter(trn_dl))
x2 = x[y==0]
print(len(x2))
# 15

Adjust the shape of x2 so that it can be used as input to the convolutional neural network, i.e., batch size x channels x height x width:

x2 = x2.view(-1,1,28,28)

Define the sub-network consisting of the model's first layer:

first_layer = nn.Sequential(*list(model.children())[:1])

Extract the output of the O images (x2) after the first layer (first_layer):

first_layer_output = first_layer(x2).detach()

(4) Draw the output after the image passes through the first layer:

n = 4
fig, ax = plt.subplots(n, n, figsize=(10,10))
for ix, axis in enumerate(ax.flat):
    if ix < n**2-1:
        axis.imshow(first_layer_output[ix,4,:,:].cpu())
        axis.set_title(str(ix))
plt.tight_layout()
plt.show()

Filter feature extraction effect
It can be seen that the behavior of a given filter is consistent across different images.
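
The next section repeats this slicing for deeper layers, so a small helper keeps it tidy (a sketch; get_activation is a hypothetical function that assumes model is an nn.Sequential):

def get_activation(model, x, k):
    # run x through the first k modules of the Sequential model
    return nn.Sequential(*list(model.children())[:k])(x).detach()

print(get_activation(model, x2, 1).shape)   # equivalent to first_layer_output above
# torch.Size([15, 64, 26, 26])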

3. Visualize feature maps of different network layers

(1) Extract the output of the original O image from the input layer through the second convolutional layer, and plot the output produced when the second layer's filters are convolved with the input image.

Plot the convolution outputs of the filters for the given image:

print(im.shape)
# torch.Size([1, 28, 28])
second_layer = nn.Sequential(*list(model.children())[:4])
second_intermediate_output = second_layer(im[None])[0].detach()

print(second_intermediate_output.shape)
# torch.Size([128, 11, 11])
n = 11
fig, ax = plt.subplots(n, n, figsize=(10,10))
for ix, axis in enumerate(ax.flat):
    axis.imshow(second_intermediate_output[ix].cpu())
    axis.set_title(str(ix))
plt.tight_layout()
plt.show()

Convolution output
Taking the output of filter 34 as an example, similar activations can be seen when we pass multiple O images through the filter:

second_layer = nn.Sequential(*list(model.children())[:4])
second_intermediate_output = second_layer(x2).detach()

print(second_intermediate_output.shape)
# torch.Size([15, 128, 11, 11])
n = 4
fig, ax = plt.subplots(n, n, figsize=(10,10))
for ix, axis in enumerate(ax.flat):
    if ix < n**2-1:
        axis.imshow(second_intermediate_output[ix,34,:,:].cpu())
        axis.set_title(str(ix))
plt.tight_layout()
plt.show()

Filter 34 convolution outputs across O images
(2) Plot the activations feeding the fully connected layer.

First, check the dataset size and create a DataLoader that loads the entire dataset in a single batch:

print(len(data))
# 2498
custom_dl = DataLoader(data, batch_size=2498, drop_last=True)

Next, select the O images from the dataset, reshape them, and pass them as input to the CNN model:

x, y = next(iter(custom_dl))
x2 = x[y==0]
print(len(x2))
# 1245
x2 = x2.view(len(x2),1,28,28)

Pass these images through the CNN up to the Flatten layer to obtain the flattened output that feeds the fully connected layer:

flatten_layer = nn.Sequential(*list(model.children())[:7])
flatten_layer_output = flatten_layer(x2).detach()

print(flatten_layer_output.shape)
# torch.Size([1245, 3200])

Plot the flattened output:

plt.figure(figsize=(100,10))
plt.imshow(flatten_layer_output.cpu())
plt.show()

Flatten layer output
The shape of the output is 1245 x 3200 because there are 1,245 O images in the dataset, and the flattened output for each image is 3,200-dimensional. When the input is an O, activation values greater than zero are highlighted, shown as white pixels in the image. It can be seen that the model has learned the structural information in the images, even though there are large style differences between input images belonging to the same category.
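
To condense these 3,200-dimensional activations into a single picture, one of the dimensionality-reduction visualizations mentioned in the preface can be applied. The following sketch assumes scikit-learn is available; it projects the flattened features of all images onto two principal components and colors the points by class:

from sklearn.decomposition import PCA

x_all, y_all = next(iter(custom_dl))
feats = flatten_layer(x_all).detach().cpu().numpy()    # [2498, 3200]
coords = PCA(n_components=2).fit_transform(feats)

plt.scatter(coords[:, 0], coords[:, 1], c=y_all.cpu().numpy().ravel(), cmap='coolwarm', s=5)
plt.xlabel('PC 1')
plt.ylabel('PC 2')
plt.show()

If the X and O points form separate clusters in this projection, the flattened features are class-discriminative.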

Summary

Visualization plays an important role in exploring the feature learning process inside neural networks: it provides an intuitive and interpretable way to understand how a network operates and what feature representations it has learned. By visualizing the results of feature learning, we can gain insight into neural networks and inform their optimization, interpretation, and improvement. In this section, we deepened our understanding of network behavior and learned feature representations by showing how to visualize the outputs of intermediate layers of a neural network.

Series links

PyTorch Deep Learning in Action (1) - Neural Networks and the Model Training Process in Detail
PyTorch Deep Learning in Action (2) - PyTorch Fundamentals
PyTorch Deep Learning in Action (3) - Building a Neural Network with PyTorch
PyTorch Deep Learning in Action (4) - Common Activation Functions and Loss Functions in Detail
PyTorch Deep Learning in Action (5) - Computer Vision Fundamentals
PyTorch Deep Learning in Action (6) - Neural Network Performance Optimization Techniques
PyTorch Deep Learning in Action (7) - The Impact of Batch Size on Neural Network Training
PyTorch Deep Learning in Action (8) - Batch Normalization
PyTorch Deep Learning in Action (9) - Learning Rate Optimization
PyTorch Deep Learning in Action (10) - Overfitting and Its Solutions
PyTorch Deep Learning in Action (11) - Convolutional Neural Networks
PyTorch Deep Learning in Action (12) - Data Augmentation
