Reading the GaitEdge code

First, I rebuilt the environment to drop distributed training (I had switched to a junk laptop anyway, so starting over from scratch made sense):
cuda11.1
python3.9
torch1.9.1
torchvision0.10.1
torchaudio0.9.1

I had asked ChatGPT, which said torch had to be reinstalled; in fact all that is needed is to replace the nccl line with the one below (the root cause is that Windows does not support NCCL):

torch.distributed.init_process_group("gloo")
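For reference, a small platform guard (my own sketch, not from the repo) avoids hard-coding either backend:

import platform
import torch.distributed as dist

# NCCL is unavailable on Windows, so fall back to gloo there
backend = "gloo" if platform.system() == "Windows" else "nccl"
dist.init_process_group(backend)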

Then set dataset_root in baseline.yaml to the output directory produced by preprocessing the dataset with pretreatment, usually named <dataset-name>-pkl.

trainer_cfg:

'enable_float16': True, 
'with_test': True, 
'fix_BN': False, 
# whether to freeze the BN layers
'log_iter': 100, 
'restore_ckpt_strict': True, 
# whether to enforce a strict check when restoring a model; this guarantees the restored model is consistent with the trained one, at the cost of extra restore time and memory
'optimizer_reset': False, 
'scheduler_reset': False, 
'restore_hint': 0, 
# the step hint used when restoring a model; it can speed up restoring, but the right step count has to be set by hand
'save_iter': 10000, 
'save_name': 'Baseline', 
'sync_BN': True, 
# whether to use synchronized BN layers
'total_iter': 60000, 
'sampler': {
	'batch_shuffle': True, 
	'batch_size': [2, 4], 
	'frames_num_fixed': 30, 
	'frames_num_max': 50, 
	'frames_num_min': 25, 
	'sample_type': 'fixed_unordered', 
		# all_ordered: test on the whole sequence, feeding frames in their natural order
		# fixed_unordered: test on a fixed number of frames, randomly shuffled
	# frames_all_limit=720 caps the number of sampled frames to avoid running out of memory
	# metric is either euc (Euclidean distance) or cos (cosine similarity)
	'type': 'TripletSampler'},  # P×K sampling; see the sketch right after this config
	'transform': [
		{'type': 'BaseSilCuttingTransform', 
		 'img_w': 64}]}
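The TripletSampler draws P×K batches: batch_size [2, 4] means 2 identities with 4 sequences each. A minimal sketch of that sampling logic (my own illustration, not OpenGait's implementation):

import random

def pk_batch(label_to_indices, p=2, k=4):
    # pick p identities, then k sequence indices per identity
    ids = random.sample(list(label_to_indices), p)
    batch = []
    for i in ids:
        pool = label_to_indices[i]
        # fall back to sampling with replacement if an identity has fewer than k sequences
        picks = random.sample(pool, k) if len(pool) >= k else random.choices(pool, k=k)
        batch.extend(picks)
    return batch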

loss_cfg:

- loss_term_weight: 1.0
  # weight of this loss term, i.e. how much it contributes to the total loss
  margin: 0.2
  # margin of the triplet loss: the required gap between the anchor-negative and anchor-positive distances; the loss is 0 once the gap reaches the margin, otherwise it is margin minus the gap (a sketch follows below)
  type: TripletLoss
  log_prefix: triplet
- loss_term_weight: 0.1
  scale: 16
  # scale factor applied to the cross-entropy logits
  type: CrossEntropyLoss
  log_prefix: softmax
  log_accuracy: true
  # whether to log the training accuracy
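To make the margin concrete, here is a minimal batch-all triplet loss sketch (my own, ignoring the 31-part dimension that OpenGait's TripletLoss actually loops over):

import torch
import torch.nn.functional as F

def triplet_loss(embeddings, labels, margin=0.2):
    # embeddings [n, d], labels [n]
    dist = torch.cdist(embeddings, embeddings)       # pairwise Euclidean distances
    pos = labels.unsqueeze(0) == labels.unsqueeze(1)
    pos.fill_diagonal_(False)                        # an anchor is not its own positive
    neg = labels.unsqueeze(0) != labels.unsqueeze(1)
    # loss[a, p, n] = relu(d_ap - d_an + margin): zero once d_an - d_ap >= margin
    loss = F.relu(dist.unsqueeze(2) - dist.unsqueeze(1) + margin)
    valid = pos.unsqueeze(2) & neg.unsqueeze(1)      # all valid (anchor, pos, neg) triplets
    return loss[valid].mean()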

model_cfg:

'model': 'Baseline', 
'backbone_cfg': {
	'in_channels': 1, 
	'layers_cfg': ['BC-64', 'BC-64', 'M', 'BC-128', 'BC-128', 'M', 'BC-256', 'BC-256'], 'type': 'Plain'}, 
'SeparateFCs': {
	'in_channels': 256, 'out_channels': 256, 'parts_num': 31}, 
'SeparateBNNecks': {
	'class_num': 74, 'in_channels': 256, 'parts_num': 31}, 
'bin_num': [16, 8, 4, 2, 1]}
# bin_num configures the Horizontal Pooling Pyramid after the backbone: the final feature map is split into 16, 8, 4, 2 and 1 horizontal strips, each strip is pooled into one part vector, and the strips are concatenated, giving 16+8+4+2+1 = 31 parts in total (matching parts_num above).

data_cfg:

{'dataset_name': 'CASIA-B', 
'dataset_root': '../datasets/CASIA-B-pkl', 
'num_workers': 1, 
'dataset_partition': '../datasets/CASIA-B/CASIA-B.json', 
# subjects 001-074 form the training set, 075-124 the test set
'remove_no_gallery': False, 
# whether to drop probe samples that have no matching gallery sample, so the test set only contains meaningful samples
'cache': False, 
'test_dataset_name': 'CASIA-B'}


Train Pid List --------
[001, 002, ..., 074]
Test Pid List --------
[075, 076, ..., 124]
'lr': 0.1, 'momentum': 0.9, 'solver': 'SGD', 'weight_decay': 0.0005
'gamma': 0.1, 'milestones': [20000, 40000], 'scheduler': 'MultiStepLR'}
# the steps at which the learning rate is updated during training
# the initial learning rate holds until the first milestone (20000 steps), where it is reduced, then reduced again at the second milestone (40000 steps); only two milestones are used here, but more can be added as needed
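The optimizer/scheduler pair above in runnable form (a minimal sketch with a stand-in model):

import torch

model = torch.nn.Linear(8, 8)                      # stand-in for the gait model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=0.0005)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[20000, 40000], gamma=0.1)

for it in range(60000):
    # ... forward / backward / optimizer.step() ...
    scheduler.step()   # lr: 0.1 until 20k, 0.01 until 40k, 0.001 afterwards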
Parameters Count: 3.77914M

My junk laptop kept running out of VRAM, so I cut the training batch_size to [2, 4] (identities per batch, samples per identity), and then the gradients exploded. Great.

baseline.yaml

main.py
First, the cfg configuration is initialized

msg_mgr = get_msg_mgr()

This creates the object used for emitting log output.

__init__: initializes the MessageManager instance: an ordered dict info_dict, a list writer_hparams of the summary types to record, and a time variable.
init_manager: creates a logger and a TensorBoard writer, attaches them, and also writes the log to a file; it stores iteration, log_iter and save_path as instance attributes.
init_logger: initializes the logger, optionally writing the log to a file.
append: appends new information to info_dict.
flush: flushes info_dict and the TensorBoard writer's cache.
write_to_tensorboard: writes summaries to TensorBoard.
log_training_info: prints training information and statistics.
reset_time: resets the timer.
train_step: runs one training step.
log_debug: logs a debug-level message.
log_info: logs an info-level message.
log_warning: logs a warning-level message.
Model = getattr(models, model_cfg['model'])
model = Model(cfgs, training)
# getattr() fetches the matching model class from the models module; the parsed model config is then passed in to instantiate a model object, which is returned. That is the whole model-initialization step.
if training and cfgs['trainer_cfg']['sync_BN']:
    model = nn.SyncBatchNorm.convert_sync_batchnorm(model)
    # convert BN layers to synchronized BN across GPUs
if cfgs['trainer_cfg']['fix_BN']:
    model.fix_BN()
    # freeze BN
model = get_ddp_module(model)
# wrap the model as a distributed (DDP) module
msg_mgr.log_info(params_count(model))
msg_mgr.log_info("Model Initialization Finished!")


Each iteration, the training loader yields one batch (shown as a screenshot in the original post; it unpacks into five parts, as the model walkthrough below spells out).

ipts = model.inputs_pretreament(inputs)

which runs the inputs through a pretreatment step (also shown as a screenshot in the original post).

The model:

Baseline(
  # the batch unpacks into 5 parts:
  # ipts is a list wrapping [4, 30, 64, 44] -> sils [4, 1, 30, 64, 44]
  # labs is a 1-D tensor = [63, 61, 63, 61] (batch_size long)
  # seqL is None
  # the input x [4, 1, 30, 64, 44] is reshaped to [120, 1, 64, 44] and fed into Plain's feature
  (Backbone): SetBlockWrapper(
    (forward_block): Plain(
      (feature): Sequential(
        (0): BasicConv2d(
          (conv): Conv2d(1, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
        )
        # -> [120, 64, 64, 44]
        (1): LeakyReLU(negative_slope=0.01, inplace=True)
        (2): BasicConv2d(
          (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        )
        (3): LeakyReLU(negative_slope=0.01, inplace=True)
        (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
        # torch.Size([120, 64, 32, 22])
        (5): BasicConv2d(
          (conv): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        )
        # torch.Size([120, 128, 32, 22])
        (6): LeakyReLU(negative_slope=0.01, inplace=True)
        (7): BasicConv2d(
          (conv): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        )
        (8): LeakyReLU(negative_slope=0.01, inplace=True)
        (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
        # torch.Size([120, 128, 16, 11])
        (10): BasicConv2d(
          (conv): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        )
        # torch.Size([120, 256, 16, 11])
        (11): LeakyReLU(negative_slope=0.01, inplace=True)
        (12): BasicConv2d(
          (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        )
        (13): LeakyReLU(negative_slope=0.01, inplace=True)
      )
    )
  )
  # output out [120, 256, 16, 11], reshaped back to [4, 256, 30, 16, 11]
  (TP): PackSequenceWrapper()
  # temporal max pooling gives x [4, 256, 16, 11]
  (HPP): HorizontalPoolingPyramid()
  		# 1
  		# view x as torch.Size([4, 256, 16, 11])
  		# max pool + mean pool -> [4, 256, 16]
  		# 2
  		# view x as torch.Size([4, 256, 8, 22])
  		# max pool + mean pool -> [4, 256, 8]
  		# 3
  		# view x as torch.Size([4, 256, 4, 44])
  		# max pool + mean pool -> [4, 256, 4]
  		# 4
  		# view x as torch.Size([4, 256, 2, 88])
  		# max pool + mean pool -> [4, 256, 2]
  		# 5
  		# view x as torch.Size([4, 256, 1, 176])
  		# max pool + mean pool -> [4, 256, 1]
  # output feat [4, 256, 31] (a sketch of HPP follows this printout)
  (FCs): SeparateFCs()
  		# permute to [31, 4, 256], multiply by a learned parameter [31, 256, 256] -> [31, 4, 256], permute back to [4, 256, 31]
  (BNNecks): SeparateBNNecks(
    # reshape to [4, 7936]
    (bn1d): BatchNorm1d(7936, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    # reshape to [4, 256, 31], permute to feature [31, 4, 256], then multiply by a learned parameter [31, 256, 74] -> logits [31, 4, 74]
    # finally permute back: feature [4, 256, 31], logits [4, 74, 31]
  )
  (loss_aggregator): LossAggregator(
    (losses): ModuleDict(
      (triplet): TripletLoss()
      # the 31-part loss comes out of the loss function
      # together with an info dict:
      # Odict([('loss', 31 parts), 
      #        ('hard_loss', 31 parts), 
      #        ('loss_num', 31 parts), 
      #        ('mean_dist', 31 parts)])
      (softmax): CrossEntropyLoss()
      loss = loss.mean() * loss_func.loss_term_weight
      # the weight: 0.1 for cross-entropy, 1.0 for triplet
      loss_sum += loss
    )
  )
)
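The HPP sketch referenced above. It mirrors the walkthrough: view the feature map as bin_num horizontal strips, max-pool plus mean-pool each strip, and concatenate to 31 parts:

import torch

class HorizontalPoolingPyramid(torch.nn.Module):
    def __init__(self, bin_num=(16, 8, 4, 2, 1)):
        super().__init__()
        self.bin_num = bin_num

    def forward(self, x):                    # x: [n, c, h, w]
        n, c = x.size()[:2]
        feats = []
        for b in self.bin_num:
            z = x.view(n, c, b, -1)          # b horizontal strips, each flattened
            z = z.max(-1)[0] + z.mean(-1)    # max pool + mean pool per strip
            feats.append(z)                  # [n, c, b]
        return torch.cat(feats, dim=-1)      # [n, c, 16+8+4+2+1] = [n, c, 31]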

The model's forward pass returns a retval:

{'training_feat': 
	{'triplet': 
		{'embeddings': [4, 256, 31], 
		 'labels': tensor([63, 61, 63, 61])}, 
	 'softmax': 
	 	{'logits': [4, 74, 31], 
		 'labels': tensor([63, 61, 63, 61])}},
 'visual_summary': 
 	{'image/sils': [120, 1, 64, 44]},
 'inference_feat': 
 	{'embeddings': [4, 256, 31]}}
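training_feat is what LossAggregator consumes: each key is routed to the loss module with the same name, and the weighted means are summed. A sketch of that aggregation step (assuming each loss returns a (loss, info) pair, as the Odict above suggests):

def aggregate(losses, training_feat):
    # losses: ModuleDict {'triplet': TripletLoss(), 'softmax': CrossEntropyLoss()}
    loss_sum, info = 0.0, {}
    for name, feat in training_feat.items():
        loss_func = losses[name]
        loss, loss_info = loss_func(**feat)          # e.g. embeddings=..., labels=...
        loss_sum = loss_sum + loss.mean() * loss_func.loss_term_weight  # 1.0 / 0.1
        info.update({f'{name}/{k}': v for k, v in loss_info.items()})
    return loss_sum, info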

phase1_rec.yaml

DDPPassthrough(
  # the batch unpacks into 5 parts:
  # ipts is a list wrapping [4, 30, 64, 64] -> sils [4, 1, 30, 64, 64]
  # labs is a 1-D tensor = [30, 30, 62, 62]
  # seqL is None
  # input x [4, 1, 30, 64, 64]
  (module): GaitGL(
    # input sils [4, 1, 30, 64, 64]
    (conv3d): Sequential(
      (0): BasicConv3d(
        (conv3d): Conv3d(1, 32, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
      # output [4, 32, 30, 64, 64]
      (1): LeakyReLU(negative_slope=0.01, inplace=True)
    )
    (LTA): Sequential(
      (0): BasicConv3d(
        (conv3d): Conv3d(32, 32, kernel_size=(3, 1, 1), stride=(3, 1, 1), bias=False)
      )
      # output x [4, 32, 10, 64, 64]
      (1): LeakyReLU(negative_slope=0.01, inplace=True)
    )
    (GLConvA0): GLConv(
      (global_conv3d): BasicConv3d(
        (conv3d): Conv3d(32, 64, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
      # output gob_feat [4, 64, 10, 64, 64]
      if self.halving == 0:  # halving is 3 here, so this branch is not taken
	      (local_conv3d): BasicConv3d(
	        (conv3d): Conv3d(32, 64, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
	      )
	  else:
	  	  # split the x [4, 32, 10, 64, 64] from above into lcl_feat,
	  	  # a tuple of 8 tensors of [4, 32, 10, 8, 64]
	  	  # 8 = int(x.size(3) // 2**self.halving)
	  	  # each element goes through local_conv3d -> [4, 64, 10, 8, 64], then they are concatenated into lcl_feat [4, 64, 10, 64, 64]
	  	  feat = F.leaky_relu(gob_feat) + F.leaky_relu(lcl_feat)
	  	  # i.e. [4, 64, 10, 64, 64]
	  # (a compact GLConv sketch follows this printout)
    )
    (MaxPool0): MaxPool3d(kernel_size=(1, 2, 2), stride=(1, 2, 2), padding=0, dilation=1, ceil_mode=False)
    # -> [4, 64, 10, 32, 32]
    (GLConvA1): GLConv(
      (global_conv3d): BasicConv3d(
        (conv3d): Conv3d(64, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
      # -> [4, 128, 10, 32, 32]
      # as above: split first, then feed each strip to the local conv
      (local_conv3d): BasicConv3d(
        (conv3d): Conv3d(64, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
      # stacked the same way, passed through the activation and summed, giving [4, 128, 10, 32, 32]
    )
    (GLConvB2): GLConv(
      (global_conv3d): BasicConv3d(
        (conv3d): Conv3d(128, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
      (local_conv3d): BasicConv3d(
        (conv3d): Conv3d(128, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
      # the same pattern again, giving [4, 128, 10, 64, 32]
    )
    (TP): PackSequenceWrapper()
    # temporal max pooling gives x [4, 128, 64, 32]
    (HPP): GeMHPP()
    # average pooling over width gives [4, 128, 64, 1], squeezed to x [4, 128, 64]
    (Head0): SeparateFCs()
    # x permuted to [64, 4, 128], multiplied by a learned parameter [64, 128, 128] -> [64, 4, 128], permuted to gait [4, 128, 64]
    (Bn): SyncBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    # batch norm on gait gives embed = bnft [4, 128, 64]
    (Head1): SeparateFCs()
    # the same Head0-style operation on bnft, with a learned parameter [64, 128, 74], gives logi [4, 74, 64]
    (loss_aggregator): LossAggregator(
      (losses): ModuleDict(
        (triplet): TripletLoss()
        (softmax): CrossEntropyLoss()
      )
    )
  )
)
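The GLConv pattern shared by GLConvA0/A1 and GLConvB2, condensed into a sketch of the additive variant walked through above (note the height doubling at GLConvB2's output, which is consistent with a GLConv variant that concatenates the global and local features along the height axis instead of adding them):

import torch
import torch.nn.functional as F

class GLConvSketch(torch.nn.Module):
    def __init__(self, in_c, out_c, halving=3):
        super().__init__()
        self.halving = halving
        self.global_conv3d = torch.nn.Conv3d(in_c, out_c, 3, padding=1, bias=False)
        self.local_conv3d = torch.nn.Conv3d(in_c, out_c, 3, padding=1, bias=False)

    def forward(self, x):                            # x: [n, c, s, h, w]
        gob_feat = self.global_conv3d(x)             # global branch over the whole map
        split_size = int(x.size(3) // 2 ** self.halving)
        lcl_feat = torch.cat([self.local_conv3d(p)   # local branch per horizontal strip
                              for p in x.split(split_size, dim=3)], dim=3)
        return F.leaky_relu(gob_feat) + F.leaky_relu(lcl_feat)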
{'training_feat': 
	{'triplet': 
		{'embeddings': embed [4, 128, 64], 
		 'labels': tensor([30, 30, 62, 62])}, 
	 'softmax': 
	 	{'logits': logi [4, 74, 64], 
	 	 'labels': tensor([30, 30, 62, 62])}}, 
 'visual_summary': 
 	{'image/sils': sils reshaped to [120, 1, 64, 64]},
 'inference_feat': 
 	{'embeddings': embed [4, 128, 64]}}

phase1_seg.yaml

This config does not run as-is: batches can pick up empty samples, which leads to out-of-range indexing and similar failures.
Changes (presumably all of them necessary):
data_in_use: [false, false, true, true] → data_in_use: [true, false, false, false]
sample_type: fixed_unordered → sample_type: fixed_ordered
add frames_skip_num: 0, which was missing
make trainer_cfg and evaluator_cfg use the same transform
dataset_name cannot be CASIA-B*, because * cannot appear in a Windows path

DDPPassthrough(
  (module): Segmentation(
  # 5 inputs
  # ipts is a list wrapping two tensors:
  	# [8, 30, 3, 128, 128] -> rgbs [240, 3, 128, 128]
  	# [8, 30, 128, 128] -> sils [240, 1, 128, 128]
  # labs is a 1-D tensor of length 128
  # typs is a list holding 'nm-06', 'bg-01', 'cl-01', etc., length 128
  # vies is a list holding '072', '090', '180', '108', etc., length 128
  # seqL is None
    (Backbone): U_Net(
    # input x [240, 3, 128, 128] (rgbs)
    # up to here the model can be frozen, computing no gradients
      (Conv1): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )# x1[240, 16, 128, 128]
      (Maxpool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      # [240, 16, 64, 64]   
      (Conv2): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )# x2[240, 32, 64, 64]
      (Maxpool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      # [240, 32, 32, 32]   
      (Conv3): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )# x3[240, 64, 32, 32]
      (Maxpool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      # [240, 64, 16, 16]   
      (Conv4): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )# x4[240, 128, 16, 16]
      
      # end of the optionally-frozen part; everything below always computes gradients (a freezing sketch follows this printout)
      (Up4): UpConv(
        (up): Sequential(
          (0): Upsample(scale_factor=2.0, mode=nearest)
          # [240, 128, 32, 32]
          (1): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (3): ReLU(inplace=True)
        )
      )# d4[240, 64, 32, 32]
      d4 = torch.cat((x3, d4), dim=1) 
      # d4[240, 128, 32, 32]
      (Up_conv4): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )# d4[240, 64, 32, 32]
      (Up3): UpConv(
        (up): Sequential(
          (0): Upsample(scale_factor=2.0, mode=nearest)
          (1): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (2): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (3): ReLU(inplace=True)
        )
      )# d3[240, 32, 64, 64]
      d3 = torch.cat((x2, d3), dim=1)
      # d3[240, 64, 64, 64]
      (Up_conv3): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )# d3[240, 32, 64, 64]
      (Up2): UpConv(
        (up): Sequential(
          (0): Upsample(scale_factor=2.0, mode=nearest)
          (1): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (2): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (3): ReLU(inplace=True)
        )
      )# d2[240, 16, 128, 128]
      d2 = torch.cat((x1, d2), dim=1)
      # d2[240, 32, 128, 128]
      (Up_conv2): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )# d2[240, 16, 128, 128]
      (Conv_1x1): Conv2d(16, 1, kernel_size=(1, 1), stride=(1, 1))
    )# d1[240, 1, 128, 128]
    (loss_aggregator): LossAggregator(
      (losses): ModuleDict(
        (bce): BinaryCrossEntropyLoss()
      )
    )
  )
)
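A sketch of the freezing behavior noted in the comments: the encoder runs under no_grad when frozen, while the decoder always trains. The flag name is my own, not necessarily the repo's:

import torch

def unet_forward_sketch(net, x, freeze_encoder=True):
    ctx = torch.no_grad() if freeze_encoder else torch.enable_grad()
    with ctx:
        x1 = net.Conv1(x)
        x2 = net.Conv2(net.Maxpool(x1))
        x3 = net.Conv3(net.Maxpool(x2))
        x4 = net.Conv4(net.Maxpool(x3))
    d4 = net.Up_conv4(torch.cat((x3, net.Up4(x4)), dim=1))
    d3 = net.Up_conv3(torch.cat((x2, net.Up3(d4)), dim=1))
    d2 = net.Up_conv2(torch.cat((x1, net.Up2(d3)), dim=1))
    return net.Conv_1x1(d2)        # logits [n, 1, 128, 128]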
{'training_feat': {
	'bce': {
		'logits': [240, 1, 128, 128], 
		'labels': [240, 1, 128, 128]}}, 
 'visual_summary': {
 	'image/sils': [240, 1, 128, 128], 
 	'image/logits': [240, 1, 128, 128],
 	'image/pred': [240, 1, 128, 128]}, 
 'inference_feat': {
 	'pred': [240, 1, 128, 128],
 	'mask': [240, 1, 128, 128]}}

phase2_e2e.yaml

model_cfg is missing a kernel_size; I added kernel_size: 3, copied from phase2_gaitedge.yaml.
Since the segmentation model had not been trained yet, I temporarily set restore_hint in trainer_cfg to 0.

DDPPassthrough(
  # inputs
  # ipts is a list wrapping 3 things:
  	# ratios [16, 30]
  	# [16, 30, 3, 128, 128] -> rgbs [480, 3, 128, 128]
  	# [16, 30, 128, 128] -> sils [480, 1, 128, 128]
  # labs is a 1-D tensor of length 16
  # seqL is None
  (module): GaitEdge(
    (Backbone): U_Net(
    # input [480, 3, 128, 128]
      (Conv1): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): SyncBatchNorm(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): SyncBatchNorm(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
      (Maxpool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (Conv2): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): SyncBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): SyncBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
      (Maxpool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (Conv3): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
      (Maxpool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (Conv4): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): SyncBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): SyncBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
      (Up4): UpConv(
        (up): Sequential(
          (0): Upsample(scale_factor=2.0, mode=nearest)
          (1): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (2): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (3): ReLU(inplace=True)
        )
      )
      (Up_conv4): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
      (Up3): UpConv(
        (up): Sequential(
          (0): Upsample(scale_factor=2.0, mode=nearest)
          (1): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (2): SyncBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (3): ReLU(inplace=True)
        )
      )
      (Up_conv3): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): SyncBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): SyncBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
      (Up2): UpConv(
        (up): Sequential(
          (0): Upsample(scale_factor=2.0, mode=nearest)
          (1): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (2): SyncBatchNorm(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (3): ReLU(inplace=True)
        )
      )
      (Up_conv2): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): SyncBatchNorm(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): SyncBatchNorm(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
      (Conv_1x1): Conv2d(16, 1, kernel_size=(1, 1), stride=(1, 1))
    )
    # outputs logis [480, 1, 128, 128]
    # a sigmoid gives logits, and rounding gives the mask (the earlier configs have this step too)
    self.is_edge is false here, so check self.align, which is true:
	    (gait_align): GaitAlign(
	      # inputs: logits, mask, ratios -> w_h_ratio [480, 1]
	      # row-wise sum of mask gives h_sum [480, 1, 128]
	      _ = (h_sum >= 1).float().cumsum(axis=-1)  # [480, 1, 128]
          h_top = (_ == 0).float().sum(-1)  # [480, 1]
          h_bot = (_ != torch.max(_, dim=-1, keepdim=True)
                 [0]).float().sum(-1) + 1.  # [480, 1]
          # column-wise sum of mask gives w_sum [480, 1, 128]
          w_cumsum = w_sum.cumsum(axis=-1)  # [480, 1, 128]
          w_h_sum = w_sum.sum(-1).unsqueeze(-1)  # [480, 1, 1]
          w_center = (w_cumsum < w_h_sum / 2.).float().sum(-1)  # [480, 1]

          p1 = self.W - self.H * w_h_ratio 
          # self.W = 44, self.H = 64
          p1 = p1 / 2.
          p1 = torch.clamp(p1, min=0)  # [n, c]
          t_w = w_h_ratio * self.H / w
          p2 = p1 / t_w  # [n, c]
          
          
	      (Pad): ZeroPad2d(padding=(22, 22, 0, 0), value=0.0)
	      # logits [480, 1, 128, 128] padded into feature_map [480, 1, 128, 172]
	      w_left = w_center - width / 2 - p2  # [n, c]
          w_right = w_center + width / 2 + p2  # [n, c]

          w_left = torch.clamp(w_left, min=0., max=w+2*width_p)
          w_right = torch.clamp(w_right, min=0., max=w+2*width_p)

          boxes = torch.cat([w_left, h_top, w_right, h_bot], dim=-1)
          # index of bbox in batch
          box_index = torch.arange(n, device=feature_map.device)
          rois = torch.cat([box_index.view(-1, 1), boxes], -1)	# [480, 5]
	      (RoiPool): RoIAlign(output_size=(64, 44), spatial_scale=1, sampling_ratio=-1, aligned=False)
	      # inputs feature_map, rois -> [480, 1, 64, 44] -> cropped_logits [16, 30, 64, 44]
	      # which, together with labs, feeds the GaitGL below (a condensed runnable version of this whole alignment follows this section)
	    )
    
    (conv3d): Sequential(
      # cropped_logits -> sils [16, 1, 30, 64, 44]
      (0): BasicConv3d(
        (conv3d): Conv3d(1, 32, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
      (1): LeakyReLU(negative_slope=0.01, inplace=True)
    )# [16, 32, 30, 64, 44]
    (LTA): Sequential(
      (0): BasicConv3d(
        (conv3d): Conv3d(32, 32, kernel_size=(3, 1, 1), stride=(3, 1, 1), bias=False)
      )
      (1): LeakyReLU(negative_slope=0.01, inplace=True)
    )# x[16, 32, 10, 64, 44]
    (GLConvA0): GLConv(
      (global_conv3d): BasicConv3d(
        (conv3d): Conv3d(32, 64, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
      # gob_feat[16, 64, 10, 64, 44]
	      h = x.size(3)
	      split_size = int(h // 2**self.halving)
	      lcl_feat = x.split(split_size, 3)	# 8 chunks of [16, 32, 10, 8, 44]
		      (local_conv3d): BasicConv3d(
		        (conv3d): Conv3d(32, 64, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
		      )# each gives [16, 64, 10, 8, 44], then all are concatenated
	   # concatenation gives lcl_feat [16, 64, 10, 64, 44]
	   feat = F.leaky_relu(gob_feat) + F.leaky_relu(lcl_feat)
    )# [16, 64, 10, 64, 44]
    (MaxPool0): MaxPool3d(kernel_size=(1, 2, 2), stride=(1, 2, 2), padding=0, dilation=1, ceil_mode=False)
    # [16, 64, 10, 32, 22]
    (GLConvA1): GLConv(
      (global_conv3d): BasicConv3d(
        (conv3d): Conv3d(64, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
      (local_conv3d): BasicConv3d(
        (conv3d): Conv3d(64, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
    )# [16, 128, 10, 32, 22]
    (GLConvB2): GLConv(
      (global_conv3d): BasicConv3d(
        (conv3d): Conv3d(128, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
      (local_conv3d): BasicConv3d(
        (conv3d): Conv3d(128, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
    )# [16, 128, 10, 64, 22]
    (TP): PackSequenceWrapper()
    # temporal max pooling gives [16, 128, 64, 22]
    (HPP): GeMHPP()
    # average pooling gives [16, 128, 64]
    (Head0): SeparateFCs()
    	# -> [64, 16, 128]
    	# multiplied by a learned tensor [64, 128, 128] -> [64, 16, 128] -> gait [16, 128, 64]
    (Bn): SyncBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (Head1): SeparateFCs()
    # gives logi [16, 74, 64]



    (loss_aggregator): LossAggregator(
      (losses): ModuleDict(
        (triplet): TripletLoss()
        (bce): BinaryCrossEntropyLoss()
        (softmax): CrossEntropyLoss()
      )
    )
  )
)
{'training_feat': {
	'triplet': {
		'embeddings': [16, 128, 64], 
		'labels': tensor([24, 24, 24, 17, 17, 17,  2,  2, 24, 43, 43, 17,  2, 43,  2, 43])}, 
	'softmax': {
		'logits': [16, 74, 64], 
		'labels': tensor([24, 24, 24, 17, 17, 17,  2,  2, 24, 43, 43, 17,  2, 43,  2, 43])}, 
	'bce': {
		'logits': [480, 1, 128, 128], 
		'labels': [480, 1, 128, 128]}},
 'visual_summary': {
 	'image/sils': [480, 1, 64, 44],
 	'image/roi': [480, 1, 64, 44]}, 
 'inference_feat': {
 	'embeddings': [16, 128, 64]}}
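Condensing the GaitAlign walkthrough into one runnable function (my own condensation of the snippets above, not the verbatim OpenGait code, so details such as the exact box coordinates on the padded map may differ):

import torch
from torchvision.ops import RoIAlign

def gait_align_sketch(logits, mask, w_h_ratio, H=64, W=44):
    n, c, h, w = logits.size()                       # [480, 1, 128, 128]
    h_sum = mask.sum(-1)                             # per-row foreground count [n, 1, h]
    cum = (h_sum >= 1).float().cumsum(-1)
    h_top = (cum == 0).float().sum(-1)               # first foreground row
    h_bot = (cum != cum.max(-1, keepdim=True)[0]).float().sum(-1) + 1.
    w_sum = mask.sum(-2)                             # per-column count [n, 1, w]
    w_center = (w_sum.cumsum(-1) < w_sum.sum(-1, keepdim=True) / 2.).float().sum(-1)
    p1 = torch.clamp((W - H * w_h_ratio) / 2., min=0)
    p2 = p1 / (w_h_ratio * H / w)                    # padding in source pixels
    height = h_bot - h_top
    width = height * w / h
    width_p = W // 2                                 # 22
    feature_map = torch.nn.ZeroPad2d((width_p, width_p, 0, 0))(logits)
    w_left = torch.clamp(w_center - width / 2 - p2, 0., w + 2 * width_p)
    w_right = torch.clamp(w_center + width / 2 + p2, 0., w + 2 * width_p)
    boxes = torch.cat([w_left, h_top, w_right, h_bot], dim=-1)
    box_index = torch.arange(n, device=logits.device).view(-1, 1).float()
    rois = torch.cat([box_index, boxes], -1)         # [n, 5]
    roi_pool = RoIAlign((H, W), spatial_scale=1, sampling_ratio=-1)
    return roi_pool(feature_map, rois)               # [n, 1, 64, 44]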

phase2_gaitedge.yaml

This is just phase2_e2e.yaml with edge in model_cfg set to true.

DDPPassthrough(
  # inputs
  # ipts is a list wrapping 3 things:
  	# ratios [16, 30]
  	# [16, 30, 3, 128, 128] -> rgbs [480, 3, 128, 128]
  	# [16, 30, 128, 128] -> sils [480, 1, 128, 128]
  # labs is a 1-D tensor of length 16
  # seqL is None
  (module): GaitEdge(
    (Backbone): U_Net(
    # input [480, 3, 128, 128]
      (Conv1): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): SyncBatchNorm(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): SyncBatchNorm(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
      (Maxpool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (Conv2): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): SyncBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): SyncBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
      (Maxpool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (Conv3): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
      (Maxpool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (Conv4): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): SyncBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): SyncBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
      (Up4): UpConv(
        (up): Sequential(
          (0): Upsample(scale_factor=2.0, mode=nearest)
          (1): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (2): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (3): ReLU(inplace=True)
        )
      )
      (Up_conv4): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): SyncBatchNorm(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
      (Up3): UpConv(
        (up): Sequential(
          (0): Upsample(scale_factor=2.0, mode=nearest)
          (1): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (2): SyncBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (3): ReLU(inplace=True)
        )
      )
      (Up_conv3): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): SyncBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): SyncBatchNorm(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
      (Up2): UpConv(
        (up): Sequential(
          (0): Upsample(scale_factor=2.0, mode=nearest)
          (1): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (2): SyncBatchNorm(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (3): ReLU(inplace=True)
        )
      )
      (Up_conv2): ConvBlock(
        (conv): Sequential(
          (0): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (1): SyncBatchNorm(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (2): ReLU(inplace=True)
          (3): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
          (4): SyncBatchNorm(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
          (5): ReLU(inplace=True)
        )
      )
      (Conv_1x1): Conv2d(16, 1, kernel_size=(1, 1), stride=(1, 1))
    )
    # outputs logis [480, 1, 128, 128]
    # a sigmoid gives logits, and rounding gives the mask (as in the earlier configs)
    self.is_edge is true here, so sils is preprocessed as follows (a self-contained sketch of this fusion comes at the end of this section):
    	dilated_mask = (morph.dilation(sils, self.kernel.to(sils.device)).detach()) > 0.5  
    	# morphological dilation [480, 1, 128, 128]
        eroded_mask = (morph.erosion(sils, self.kernel.to(sils.device)).detach()) > 0.5   
        # morphological erosion [480, 1, 128, 128]
        edge_mask = dilated_mask ^ eroded_mask
        # [480, 1, 128, 128]


        new_logits = edge_mask*logits+eroded_mask*sils
        
        
	    then, since self.align is true:
		    (gait_align): GaitAlign(
		      # inputs new_logits, sils, ratios -> w_h_ratio [480, 1]
		      # row-wise sum of mask gives h_sum [480, 1, 128]
		      _ = (h_sum >= 1).float().cumsum(axis=-1)  # [480, 1, 128]
		      '''first, (h_sum >= 1).float() marks the rows containing foreground:
		      [[0., 0., 0.,  ..., 0., 0., 0.]],
		        [[0., 0., 0.,  ..., 0., 0., 0.]],
		        [[0., 0., 0.,  ..., 0., 0., 0.]],
		        ...,
		        [[0., 0., 0.,  ..., 0., 0., 0.]],
		        [[0., 0., 0.,  ..., 1., 0., 0.]],
		        [[0., 0., 0.,  ..., 1., 1., 1.]]], device='cuda:0')
		        then the cumulative sum along the row axis: entry i is the number of foreground rows up to row i
		        [[  0.,   0.,   0.,  ..., 101., 101., 101.]],
		        [[  0.,   0.,   0.,  ..., 101., 101., 101.]],
		        [[  0.,   0.,   0.,  ..., 105., 105., 105.]],
		        ...,
		        [[  0.,   0.,   0.,  ..., 114., 114., 114.]],
		        [[  0.,   0.,   0.,  ..., 119., 119., 119.]],
		        [[  0.,   0.,   0.,  ..., 119., 120., 121.]]])
		        '''

	          h_top = (_ == 0).float().sum(-1)  # [480, 1]
	          h_bot = (_ != torch.max(_, dim=-1, keepdim=True)
	                 [0]).float().sum(-1) + 1.  # [480, 1]
	          # column-wise sum of mask gives w_sum [480, 1, 128]
	          w_cumsum = w_sum.cumsum(axis=-1)  # [480, 1, 128]
	          w_h_sum = w_sum.sum(-1).unsqueeze(-1)  # [480, 1, 1]
	          w_center = (w_cumsum < w_h_sum / 2.).float().sum(-1)  # [480, 1]
	
	          p1 = self.W - self.H * w_h_ratio 
	          # self.W = 44, self.H = 64
	          p1 = p1 / 2.
	          p1 = torch.clamp(p1, min=0)  # [n, c]
	          # clamp clips the value to the given bounds
	          t_w = w_h_ratio * self.H / w
	          p2 = p1 / t_w  # [n, c]
	          
	          height = h_bot - h_top  # [n, c]
		      width = height * w / h  # [n, c]
		      width_p = int(self.W / 2)# 22
	          
		      (Pad): ZeroPad2d(padding=(22, 22, 0, 0), value=0.0)
		      # logits [480, 1, 128, 128] padded into feature_map [480, 1, 128, 172]
		      # visualized, this looks as if the silhouette were squeezed horizontally rather than padded
		      w_left = w_center - width / 2 - p2  # [n, c]
	          w_right = w_center + width / 2 + p2  # [n, c]
	
	          w_left = torch.clamp(w_left, min=0., max=w+2*width_p)
	          w_right = torch.clamp(w_right, min=0., max=w+2*width_p)
	
	          boxes = torch.cat([w_left, h_top, w_right, h_bot], dim=-1)
	          # index of bbox in batch
	          box_index = torch.arange(n, device=feature_map.device)
	          rois = torch.cat([box_index.view(-1, 1), boxes], -1)	# [480, 5]
		      (RoiPool): RoIAlign(output_size=(64, 44), spatial_scale=1, sampling_ratio=-1, aligned=False)
		      # inputs feature_map, rois -> [480, 1, 64, 44] -> cropped_logits [16, 30, 64, 44]
		      # which, together with labs, feeds the GaitGL below
		    )
    
    (conv3d): Sequential(
      # cropped_logits -> sils [16, 1, 30, 64, 44]
      (0): BasicConv3d(
        (conv3d): Conv3d(1, 32, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
      (1): LeakyReLU(negative_slope=0.01, inplace=True)
    )# [16, 32, 30, 64, 44]
    (LTA): Sequential(
      (0): BasicConv3d(
        (conv3d): Conv3d(32, 32, kernel_size=(3, 1, 1), stride=(3, 1, 1), bias=False)
      )
      (1): LeakyReLU(negative_slope=0.01, inplace=True)
    )# x[16, 32, 10, 64, 44]
    (GLConvA0): GLConv(
      (global_conv3d): BasicConv3d(
        (conv3d): Conv3d(32, 64, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
      # gob_feat[16, 64, 10, 64, 44]
	      h = x.size(3)
	      split_size = int(h // 2**self.halving)
	      lcl_feat = x.split(split_size, 3)	# 8 chunks of [16, 32, 10, 8, 44]
		      (local_conv3d): BasicConv3d(
		        (conv3d): Conv3d(32, 64, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
		      )# each gives [16, 64, 10, 8, 44], then all are concatenated
	   # concatenation gives lcl_feat [16, 64, 10, 64, 44]
	   feat = F.leaky_relu(gob_feat) + F.leaky_relu(lcl_feat)
    )# [16, 64, 10, 64, 44]
    (MaxPool0): MaxPool3d(kernel_size=(1, 2, 2), stride=(1, 2, 2), padding=0, dilation=1, ceil_mode=False)
    # [16, 64, 10, 32, 22]
    (GLConvA1): GLConv(
      (global_conv3d): BasicConv3d(
        (conv3d): Conv3d(64, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
      (local_conv3d): BasicConv3d(
        (conv3d): Conv3d(64, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
    )# [16, 128, 10, 32, 22]
    (GLConvB2): GLConv(
      (global_conv3d): BasicConv3d(
        (conv3d): Conv3d(128, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
      (local_conv3d): BasicConv3d(
        (conv3d): Conv3d(128, 128, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1), bias=False)
      )
    )# [16, 128, 10, 64, 22]
    (TP): PackSequenceWrapper()
    # temporal max pooling gives [16, 128, 64, 22]
    (HPP): GeMHPP()
    # average pooling gives [16, 128, 64]
    (Head0): SeparateFCs()
    	# -> [64, 16, 128]
    	# multiplied by a learned tensor [64, 128, 128] -> [64, 16, 128] -> gait [16, 128, 64]
    (Bn): SyncBatchNorm(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (Head1): SeparateFCs()
    # gives logi [16, 74, 64]



    (loss_aggregator): LossAggregator(
      (losses): ModuleDict(
        (triplet): TripletLoss()
        (bce): BinaryCrossEntropyLoss()
        (softmax): CrossEntropyLoss()
      )
    )
  )
)
{'training_feat': {
	'triplet': {
		'embeddings': [16, 128, 64], 
		'labels': tensor([24, 24, 24, 17, 17, 17,  2,  2, 24, 43, 43, 17,  2, 43,  2, 43])}, 
	'softmax': {
		'logits': [16, 74, 64], 
		'labels': tensor([24, 24, 24, 17, 17, 17,  2,  2, 24, 43, 43, 17,  2, 43,  2, 43])}, 
	'bce': {
		'logits': [480, 1, 128, 128], 
		'labels': [480, 1, 128, 128]}},
 'visual_summary': {
 	'image/sils': [480, 1, 64, 44],
 	'image/roi': [480, 1, 64, 44]}, 
 'inference_feat': {
 	'embeddings': [16, 128, 64]}}
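The edge-fusion step referenced above, in isolation. A minimal sketch, assuming morph is kornia.morphology (which matches the dilation/erosion calls above) and kernel_size: 3 from the config added earlier:

import torch
from kornia import morphology as morph

def edge_fusion(logits, sils, kernel_size=3):
    # sils: coarse binary silhouettes [n, 1, h, w]; logits: segmentation output
    kernel = torch.ones(kernel_size, kernel_size, device=sils.device)
    dilated_mask = (morph.dilation(sils, kernel).detach()) > 0.5
    eroded_mask = (morph.erosion(sils, kernel).detach()) > 0.5
    edge_mask = dilated_mask ^ eroded_mask   # the uncertain boundary band
    # gradients flow only through the boundary; the interior is fixed by sils
    return edge_mask * logits + eroded_mask * sils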

Reposted from blog.csdn.net/weixin_40459958/article/details/129700445