Let's implement the "layers" that build up a neural network as classes. A "layer" here is a functional unit of the neural network.
Let's start with some simple layers.
Implementation of the multiplication layer
Every layer implements two common methods (its interface): forward() and backward().
forward() performs forward propagation.
backward() performs backpropagation.
Now let's implement the multiplication layer. Look at the code below:
class MulLayer:
    def __init__(self):
        self.x = None
        self.y = None

    def forward(self, x, y):
        self.x = x
        self.y = y
        out = x * y
        return out

    def backward(self, dout):
        dx = dout * self.y  # swap x and y
        dy = dout * self.x
        return dx, dy
backward() multiplies the derivative passed from upstream (dout) by the "swapped" input value of the forward propagation (dx uses y, dy uses x), and passes the results downstream.
The following is an example of using MulLayer to implement the earlier purchase (2 apples plus consumption tax). Look at the picture:
Through the multiplication layer, the forward propagation in the figure above can be implemented as follows:
apple = 100
apple_num = 2
tax = 1.1

# layer
mul_apple_layer = MulLayer()
mul_tax_layer = MulLayer()

# forward
apple_price = mul_apple_layer.forward(apple, apple_num)
price = mul_tax_layer.forward(apple_price, tax)
print(price)  # 220.00000000000003 (220 up to floating-point error)
The derivative of each variable can then be obtained by calling backward():
# backward
dprice = 1
dapple_price, dtax = mul_tax_layer.backward(dprice)
dapple, dapple_num = mul_apple_layer.backward(dapple_price)
print(dapple, dapple_num, dtax)
In addition, note that the argument passed to backward() is the derivative with respect to that layer's forward-propagation output.
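As a quick sanity check (an addition, not from the text), the derivative returned by backward() can be compared against a central-difference numerical derivative. The helper name numerical_diff is hypothetical, chosen here only for illustration:

```python
import numpy as np

class MulLayer:
    def __init__(self):
        self.x = None
        self.y = None

    def forward(self, x, y):
        self.x = x
        self.y = y
        return x * y

    def backward(self, dout):
        return dout * self.y, dout * self.x

# Hypothetical helper: central-difference approximation of df/dx.
def numerical_diff(f, x, eps=1e-4):
    return (f(x + eps) - f(x - eps)) / (2 * eps)

apple, apple_num, tax = 100.0, 2, 1.1

# Analytic derivative of the total price with respect to the apple price.
mul_apple_layer = MulLayer()
mul_tax_layer = MulLayer()
price = mul_tax_layer.forward(mul_apple_layer.forward(apple, apple_num), tax)
dapple_price, _ = mul_tax_layer.backward(1)
dapple, _ = mul_apple_layer.backward(dapple_price)

# Numerical derivative of the same quantity: d/da (a * apple_num * tax).
num_dapple = numerical_diff(lambda a: a * apple_num * tax, apple)

print(np.isclose(dapple, num_dapple))  # True
```

Both approaches give 2.2 (= apple_num × tax), which is exactly what the "swap x and y" rule of the multiplication node predicts.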
Implementation of Addition Layer
Let's implement the addition layer, which corresponds to an addition node:
class AddLayer:
    def __init__(self):
        pass

    def forward(self, x, y):
        out = x + y
        return out

    def backward(self, dout):
        dx = dout * 1
        dy = dout * 1
        return dx, dy
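A minimal usage sketch of AddLayer (the numeric values are illustrative): an addition node simply passes the upstream derivative through to both inputs unchanged.

```python
class AddLayer:
    def __init__(self):
        pass

    def forward(self, x, y):
        return x + y

    def backward(self, dout):
        # An addition node copies the upstream derivative to both inputs.
        return dout * 1, dout * 1

add_layer = AddLayer()
out = add_layer.forward(3.0, 4.0)
dx, dy = add_layer.backward(2.0)
print(out, dx, dy)  # 7.0 2.0 2.0
```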
The following uses the addition layer and the multiplication layer together to implement the example in the figure below. The implementation code is as follows:
apple = 100
apple_num = 2
orange = 150
orange_num = 3
tax = 1.1
# layer
mul_apple_layer = MulLayer()
mul_orange_layer = MulLayer()
add_apple_orange_layer = AddLayer()
mul_tax_layer = MulLayer()
# forward
apple_price = mul_apple_layer.forward(apple, apple_num)
orange_price = mul_orange_layer.forward(orange, orange_num)
all_price = add_apple_orange_layer.forward(apple_price, orange_price)
price = mul_tax_layer.forward(all_price, tax)
# backward
dprice = 1
dall_price, dtax = mul_tax_layer.backward(dprice)
dapple_price, dorange_price = add_apple_orange_layer.backward(dall_price)
dorange, dorange_num = mul_orange_layer.backward(dorange_price)
dapple, dapple_num = mul_apple_layer.backward(dapple_price)
print(price)
print(dapple_num, dapple, dorange, dorange_num, dtax)
In summary, the implementation of layers in a computational graph is very simple, and complex derivative calculations can be carried out by combining these layers.
Implementation of the activation function layer
Here, the layers that make up a neural network are implemented as classes. First, we implement the ReLU layer and the Sigmoid layer for the activation functions.
Now let's implement the ReLU layer. When implementing the layers of a neural network, it is generally assumed that the arguments of forward() and backward() are NumPy arrays. The code is shown below:
class Relu:
    def __init__(self):
        self.mask = None

    def forward(self, x):
        self.mask = (x <= 0)
        out = x.copy()
        out[self.mask] = 0
        return out

    def backward(self, dout):
        dout[self.mask] = 0
        dx = dout
        return dx
The Relu class has an instance variable mask, a NumPy array of True/False values: during forward propagation it stores True for the elements of the input x that are less than or equal to 0, and False everywhere else (for elements greater than 0). As the example below shows, the mask variable holds a NumPy array of True/False values:
>>> x = np.array([[1.0, -0.5], [-2.0, 3.0]])
>>> print(x)
[[ 1.  -0.5]
 [-2.   3. ]]
>>> mask = (x <= 0)
>>> print(mask)
[[False  True]
 [ True False]]
If an input value during forward propagation was less than or equal to 0, the corresponding backpropagated value is 0. Backpropagation therefore reuses the mask saved during forward propagation: the elements of the upstream dout at the positions where mask is True are set to 0 (this is the key idea).
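Using the same array as in the console example above, a short sketch of the Relu layer in action (the dout of all ones is illustrative; the copy in backward() is a small safety tweak so the caller's array is not modified in place):

```python
import numpy as np

class Relu:
    def __init__(self):
        self.mask = None

    def forward(self, x):
        self.mask = (x <= 0)   # True where the input is <= 0
        out = x.copy()
        out[self.mask] = 0
        return out

    def backward(self, dout):
        dout = dout.copy()     # avoid mutating the caller's array (illustrative tweak)
        dout[self.mask] = 0    # block the gradient where the input was <= 0
        return dout

x = np.array([[1.0, -0.5], [-2.0, 3.0]])
relu = Relu()
print(relu.forward(x))                  # elements <= 0 become 0: [[1. 0.], [0. 3.]]
print(relu.backward(np.ones_like(x)))   # gradient blocked at the same positions
```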
Sigmoid layer
The calculation graph is expressed as follows:
The calculation graph of the Sigmoid layer including backpropagation is as follows:
The above can be simplified as follows.
By grouping the nodes in this way, you don't need to care about the internal details of the Sigmoid layer, and can focus only on its input and output.
Alternatively, this result can be further organized as follows:
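The simplification can also be written out directly. With y = 1/(1 + e^{-x}), the local derivative of the sigmoid works out to y(1 − y):

```latex
\frac{\partial y}{\partial x}
  = \frac{e^{-x}}{(1 + e^{-x})^{2}}
  = \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}}
  = y(1 - y)
```

So backpropagation through the Sigmoid layer needs only its forward output y: it multiplies dout by y(1 − y).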
The following uses Python to implement the Sigmoid layer, the code is as follows:
class Sigmoid:
    def __init__(self):
        self.out = None

    def forward(self, x):
        out = 1 / (1 + np.exp(-x))
        self.out = out
        return out

    def backward(self, dout):
        dx = dout * (1.0 - self.out) * self.out
        return dx
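As a final sanity check (an addition, not from the text), the analytic backward pass can be compared against a central-difference numerical derivative of the sigmoid; the sample inputs are illustrative:

```python
import numpy as np

class Sigmoid:
    def __init__(self):
        self.out = None

    def forward(self, x):
        self.out = 1 / (1 + np.exp(-x))
        return self.out

    def backward(self, dout):
        # Uses only the cached forward output: dout * y * (1 - y)
        return dout * (1.0 - self.out) * self.out

sigmoid = lambda x: 1 / (1 + np.exp(-x))

x = np.array([-1.0, 0.0, 2.0])
layer = Sigmoid()
layer.forward(x)
analytic = layer.backward(np.ones_like(x))

# Central-difference numerical derivative at the same points.
eps = 1e-5
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)

print(np.allclose(analytic, numeric))  # True
```

At x = 0, for example, y = 0.5 and the derivative is 0.5 × (1 − 0.5) = 0.25 by both methods.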