U2-Net Salient Object Detection: Source Code Walkthrough

        As shown in the figure, U2-Net is a two-level nested U-structure. Its top level is a big U-structure composed of 11 stages (the cubes in Fig. 5), and each stage is filled with a well-configured Residual U-block (RSU), the bottom-level U-structure. This nested design lets the network extract multi-scale features within each stage and aggregate multi-level features across stages more effectively.

 1. Encoder

        RSU structure (using RSU-7 as the example):

        Downsampling path:

        The input feature map x first passes through a convolution block made of a 3×3 convolution, BatchNormalization, and ReLU to extract features (the green part). It then goes through five modules, each consisting of a 3×3 conv + BatchNorm + ReLU followed by a max-pool layer, for further feature extraction and downsampling (the blue modules). The feature map shrinks from 288×288 down to 9×9. At that point the map is too small to pool further, so a dilated convolution is used instead (the white module).
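The conv + BN + ReLU block described above appears throughout the source as `REBNCONV` (the names `conv_s1`, `bn_s1`, `relu_s1` match the printed module below). A minimal sketch, with `dirate` controlling the dilation rate:

```python
import torch
import torch.nn as nn

class REBNCONV(nn.Module):
    """3x3 Conv + BatchNorm + ReLU. Padding equals the dilation rate,
    so the spatial size of the feature map is always preserved."""
    def __init__(self, in_ch=3, out_ch=3, dirate=1):
        super().__init__()
        self.conv_s1 = nn.Conv2d(in_ch, out_ch, 3, padding=dirate, dilation=dirate)
        self.bn_s1 = nn.BatchNorm2d(out_ch)
        self.relu_s1 = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu_s1(self.bn_s1(self.conv_s1(x)))

x = torch.randn(1, 3, 288, 288)
y = REBNCONV(3, 64, dirate=1)(x)   # spatial size unchanged: (1, 64, 288, 288)
```

With `dirate=2` (the white module) the receptive field grows without any downsampling, which is why it replaces pooling once the map reaches 9×9.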

        Upsampling path:

        As in U-Net, each level of the upsampling path fuses the same-level feature from the left side with the upsampled output of the level below on the right side, then passes the result through a 3×3 convolution. The output is finally added back to the block's input (the residual connection that gives the RSU its name), so after a whole RSU module the feature map's spatial size is unchanged.

RSU7(
  (rebnconvin): REBNCONV(
    (conv_s1): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (bn_s1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu_s1): ReLU(inplace=True)
  )
  (rebnconv1): REBNCONV(
    (conv_s1): Conv2d(64, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (bn_s1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu_s1): ReLU(inplace=True)
  )
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=True)
  (rebnconv2): REBNCONV(
    (conv_s1): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (bn_s1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu_s1): ReLU(inplace=True)
  )
  (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=True)
  (rebnconv3): REBNCONV(
    (conv_s1): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (bn_s1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu_s1): ReLU(inplace=True)
  )
  (pool3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=True)
  (rebnconv4): REBNCONV(
    (conv_s1): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (bn_s1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu_s1): ReLU(inplace=True)
  )
  (pool4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=True)
  (rebnconv5): REBNCONV(
    (conv_s1): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (bn_s1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu_s1): ReLU(inplace=True)
  )
  (pool5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=True)
  (rebnconv6): REBNCONV(
    (conv_s1): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (bn_s1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu_s1): ReLU(inplace=True)
  )
  (rebnconv7): REBNCONV(
    (conv_s1): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2))
    (bn_s1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu_s1): ReLU(inplace=True)
  )
  (rebnconv6d): REBNCONV(
    (conv_s1): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (bn_s1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu_s1): ReLU(inplace=True)
  )
  (rebnconv5d): REBNCONV(
    (conv_s1): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (bn_s1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu_s1): ReLU(inplace=True)
  )
  (rebnconv4d): REBNCONV(
    (conv_s1): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (bn_s1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu_s1): ReLU(inplace=True)
  )
  (rebnconv3d): REBNCONV(
    (conv_s1): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (bn_s1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu_s1): ReLU(inplace=True)
  )
  (rebnconv2d): REBNCONV(
    (conv_s1): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (bn_s1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu_s1): ReLU(inplace=True)
  )
  (rebnconv1d): REBNCONV(
    (conv_s1): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (bn_s1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu_s1): ReLU(inplace=True)
  )
)

        Encoder:

        From top to bottom: after each RSU module there is a max-pool layer that halves the feature map, so the outer structure is itself a U-Net. Because the overall feature map keeps shrinking, the number of downsampling steps inside the RSU decreases stage by stage (RSU-7, RSU-6, RSU-5, RSU-4). In En_5 and En_6 the maps are too small to downsample further, so dilated convolutions with larger dilation rates are used instead (the RSU-4F variant).
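The outer encoder loop can be sketched like this. A single 3×3 conv stands in for each RSU block (the channel widths below are those of the full U2-Net model); the point is how the pool between stages halves the map from 288 down to 9:

```python
import torch
import torch.nn as nn

# Stand-in for the six encoder stages En_1..En_6; a plain conv replaces
# each RSU block so only the outer-U spatial bookkeeping is shown.
pool = nn.MaxPool2d(2, stride=2, ceil_mode=True)
chans = [3, 64, 128, 256, 512, 512, 512]           # full-model channel widths
stages = nn.ModuleList(nn.Conv2d(c_in, c_out, 3, padding=1)
                       for c_in, c_out in zip(chans[:-1], chans[1:]))

h = torch.randn(1, 3, 288, 288)
sizes = []
for i, stage in enumerate(stages):
    h = stage(h)
    sizes.append(h.shape[-1])
    if i < len(stages) - 1:                        # pool between stages
        h = pool(h)
print(sizes)   # [288, 144, 72, 36, 18, 9]
```

At 18×18 and 9×9 (En_5 and En_6), the real model switches to the dilation-only RSU-4F, since another pooling step would destroy the spatial structure.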

    Decoder:

        As in U-Net, the output of the previous (lower) level is upsampled, concatenated with the same-level encoder output along the U, and fed into an RSU module for feature extraction; the decoder's RSU modules mirror those on the left (encoder) side. In addition, the output of every decoder level is kept, upsampled to 288×288, and concatenated with the last level's output; the final prediction is made on this fused feature map.
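One decoder step and its side output can be sketched as below. The channel widths are illustrative, and a plain conv again stands in for the RSU block:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

d_prev = torch.randn(1, 128, 72, 72)      # output of the decoder level below
e_skip = torch.randn(1, 128, 144, 144)    # same-level encoder output

# Upsample the lower decoder output to match the skip connection,
# concatenate on the channel axis, and run the stage block (stand-in conv).
d_up = F.interpolate(d_prev, size=e_skip.shape[2:], mode='bilinear', align_corners=False)
d = nn.Conv2d(256, 64, 3, padding=1)(torch.cat([d_up, e_skip], dim=1))

# Each decoder level also emits a 1-channel side map, upsampled to the
# 288x288 input resolution; the final prediction fuses all side maps.
side = nn.Conv2d(64, 1, 3, padding=1)(d)
side_full = F.interpolate(side, size=(288, 288), mode='bilinear', align_corners=False)
```

In the released source the side maps are produced by dedicated 3×3 conv heads, concatenated, and reduced by a 1×1 convolution to give the fused output.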

        Deep supervision

        Each side output, upsampled to 288×288, is passed through a sigmoid, and a cross-entropy loss is computed on every one of them (deep supervision: the fused output and all six side outputs contribute loss terms).
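A sketch of this deep-supervision loss (the source's `muti_bce_loss_fusion` sums BCE over seven outputs d0..d6; the function below is a simplified stand-in, not the original):

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()

def deep_supervision_loss(side_outputs, fused, label):
    # side_outputs: list of (N,1,288,288) sigmoid probability maps
    # fused: the (N,1,288,288) fused prediction; label: binary mask
    loss = bce(fused, label)
    for s in side_outputs:
        loss = loss + bce(s, label)      # one BCE term per side output
    return loss

label = torch.randint(0, 2, (2, 1, 288, 288)).float()
sides = [torch.sigmoid(torch.randn(2, 1, 288, 288)) for _ in range(6)]
fused = torch.sigmoid(torch.randn(2, 1, 288, 288))
total = deep_supervision_loss(sides, fused, label)   # scalar loss tensor
```

Because sigmoid is applied before `BCELoss`, every term operates on probabilities in (0, 1); all terms are weighted equally.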



Reposted from blog.csdn.net/qq_52053775/article/details/126854648