How to fix "UnboundLocalError: local variable 'loss_dict' referenced before assignment"

Traceback (most recent call last):
  File "src/main.py", line 442, in <module>
    main(args)
  File "src/main.py", line 404, in main
    args.clip_max_norm, args)
  File "/home/wsx/0A_DATA/HFPN/src/engine.py", line 52, in train_one_epoch
    losses = sum(loss_dict[k] * weight_dict[k] for k in loss_dict.keys() if k in weight_dict)


UnboundLocalError: local variable 'loss_dict' referenced before assignment
Killing subprocess 21108


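Cause: several distributed training jobs were running at the same time, so GPU memory ran out. The out-of-memory failure presumably kept loss_dict from being assigned before line 52 of engine.py tried to use it.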
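
One plausible way an out-of-memory failure ends up as this UnboundLocalError is sketched below. This is an assumption, since the original engine.py is not shown: the names train_step, compute_losses, and out_of_memory are hypothetical, and a plain RuntimeError stands in for the real CUDA error. The point is only that if an except branch swallows the error, the code after the try block runs with the variable still unbound.

```python
def train_step(compute_losses):
    """Hypothetical training step whose except branch swallows the error."""
    try:
        loss_dict = compute_losses()
    except RuntimeError as e:
        # The error is only logged, so execution continues past the try block.
        print(f"skipping batch: {e}")
    # If compute_losses() raised, loss_dict was never bound, so this line fails.
    return sum(loss_dict.values())


def out_of_memory():
    # Stand-in for a forward pass that runs out of GPU memory.
    raise RuntimeError("CUDA out of memory")


try:
    train_step(out_of_memory)
except UnboundLocalError as e:
    message = str(e)
    print(type(e).__name__ + ":", message)
```

The exact wording of the message depends on the Python version, but in every case it names loss_dict, which hides the real problem (the memory error) behind a secondary one.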

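Fix: reduce the batch size, change the DDP (distributed data parallel) setup, lower GPU memory usage, and adjust how the data is passed in.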

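In addition, the failing statement can be wrapped in the following check, so that an out-of-memory error is reported explicitly.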

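        # requires `import sys` at the top of the file
        try:
            ......
            ......

        except RuntimeError as e:
            # exit with a clear message on a GPU out-of-memory error
            if "out of memory" in str(e):
                sys.exit('Out Of Memory')
            else:
                # re-raise anything else with its original traceback
                raise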
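
A self-contained version of the same guard might look like the sketch below. The helper guarded_step and the stand-in fake_oom_step are hypothetical names introduced only for illustration; in real training code the guarded call would be the forward pass and loss computation.

```python
import sys


def guarded_step(step_fn):
    """Run one step; exit with a clear message on an out-of-memory error."""
    try:
        return step_fn()
    except RuntimeError as e:
        if "out of memory" in str(e):
            # sys.exit raises SystemExit carrying this message.
            sys.exit("Out Of Memory")
        # Any other RuntimeError is re-raised unchanged.
        raise


def fake_oom_step():
    # Stand-in for a step that exhausts GPU memory.
    raise RuntimeError("CUDA out of memory")
```

Calling guarded_step(fake_oom_step) stops the process with "Out Of Memory" instead of letting later code fail on an unassigned variable, while a step that succeeds simply returns its result.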


Reposted from blog.csdn.net/qq_35831906/article/details/124563790