I have been running into problems during training lately. Following a suggestion, I added TensorBoard to my PyTorch environment.
1. Install tensorboardX
tensorboardX is used together with tensorboard, so tensorboard needs to be installed first:
pip install tensorboard
pip install tensorboardX
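To confirm that both packages ended up in the interpreter used for training, a quick check along these lines may help. This is only a sketch built on the standard library; the helper name `installed_versions` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("tensorboard", "tensorboardX")):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

If either entry shows "not installed", pip most likely installed into a different environment than the one running this script.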
2. Basic usage
The most common use for us here is plotting the loss convergence curve.
2.1 Import the corresponding module for initialization
# Import the visualization module (added by ll)
from tensorboardX import SummaryWriter
writer = SummaryWriter('./result_tensorboard')
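TensorBoard shows each subdirectory of the log directory as a separate run, so giving every training run its own folder keeps old and new curves apart. A minimal sketch, assuming a timestamp is an acceptable run name; `make_run_dir` and the `tag` parameter are hypothetical names, not part of tensorboardX.

```python
import os
from datetime import datetime


def make_run_dir(base="./result_tensorboard", tag="run"):
    """Return a per-run log directory path such as base/run_20240101-120000."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    return os.path.join(base, f"{tag}_{stamp}")


run_dir = make_run_dir()
# Assumed usage with the import shown above:
# writer = SummaryWriter(run_dir)
```

Pointing `--logdir` at the base directory then lists every run side by side.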
2.2 Add the logging calls to the training method
if i % 10 == 0:
    print(
        f"Train epoch {epoch}: ["
        f"{i * len(d)}/{len(train_dataloader.dataset)}"
        f" ({100. * i / len(train_dataloader):.0f}%)]"
        f'\tLoss: {out_criterion["loss"].item():.3f} |'
        f'\tMSE loss: {255*255*out_criterion["mse_loss"].item():.3f} |'
        f'\tBpp loss: {out_criterion["bpp_loss"].item():.2f} |'
        f"\tAux loss: {aux_loss.item():.2f}"
    )
    # Use a per-batch step so that points logged within one epoch
    # do not all share the same x value (assumes epoch counts from 0).
    step = epoch * len(train_dataloader) + i
    writer.add_scalar('Loss', out_criterion["loss"].item(), step)
    writer.add_scalar('MSE loss', 255*255*out_criterion["mse_loss"].item(), step)
    writer.add_scalar('Bpp loss', out_criterion["bpp_loss"].item(), step)
    writer.add_scalar('Aux loss', aux_loss.item(), step)
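The third argument of `add_scalar` is the x-axis value of the point, so it is worth computing it in one place. The sketch below wraps that computation and the repeated calls in two small helpers; `global_step` and `log_losses` are hypothetical names, and the code assumes epochs are counted from 0 and that `writer` is any object with an `add_scalar(tag, value, step)` method, such as the `SummaryWriter` created earlier.

```python
def global_step(epoch, batch_idx, batches_per_epoch):
    """Number of batches processed so far, used as the x-axis value."""
    return epoch * batches_per_epoch + batch_idx


def log_losses(writer, losses, step):
    """Write every entry of `losses` (tag -> float) as a scalar at `step`."""
    for tag, value in losses.items():
        writer.add_scalar(tag, value, step)


# Assumed usage inside the training loop:
# step = global_step(epoch, i, len(train_dataloader))
# log_losses(writer, {"Loss": out_criterion["loss"].item(),
#                     "Aux loss": aux_loss.item()}, step)
```

Keeping the tags in one dictionary also makes it harder for the printed values and the logged values to drift apart.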
2.3 Start TensorBoard to view the page
tensorboard --logdir=/xxxx/xxx
Open the address it prints in a browser (http://localhost:6006 by default) and you can watch the corresponding curves as training progresses.
3. Other uses
In addition, tensorboardX can also visualize features, the network model, and more. I am still exploring these.
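As a starting point for that exploration, the sketch below uses two other writer methods, `add_graph` and `add_histogram`. It is an untested outline rather than a recipe: the helper names are made up, `model` is assumed to be a `torch.nn.Module`, `example_input` is assumed to be a tensor the model accepts, and graph tracing may fail for models with custom operations.

```python
def log_graph_once(writer, model, example_input):
    """Record the network structure; intended to be called a single time."""
    writer.add_graph(model, example_input)


def log_param_histograms(writer, model, step):
    """Record one histogram per parameter tensor at the given step."""
    for name, param in model.named_parameters():
        # Detach from autograd and move to CPU before handing values to the writer.
        writer.add_histogram(name, param.detach().cpu().numpy(), step)


# Assumed usage:
# log_graph_once(writer, net, d)            # before or at the start of training
# log_param_histograms(writer, net, epoch)  # once per epoch
```

The graph appears under the Graphs tab and the histograms under the Histograms tab of the TensorBoard page.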
4. Problems encountered
4.1 tensorboard command not found
The shell reports "tensorboard: command not found".
After the first installation, TensorBoard started fine with the steps above. However, once that job had finished and I changed the code and ran it again, starting TensorBoard reported this error.
Solution
pip install tensorflow
TensorBoard is TensorFlow's visualization toolkit, and installing tensorflow pulls in tensorboard together with its command, which resolved the error in my case.
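"Command not found" comes from the shell, which means it could not find a `tensorboard` script on PATH. One possible cause is that a different environment is active than the one pip installed into. The sketch below checks for that using only the standard library and builds a fallback command that starts TensorBoard through the current interpreter instead; the function names are hypothetical.

```python
import importlib.util
import shutil
import sys


def diagnose_tensorboard():
    """Report whether the tensorboard package and its launcher script are visible."""
    package_found = importlib.util.find_spec("tensorboard") is not None
    launcher = shutil.which("tensorboard")  # None when the script is not on PATH
    return {"package_found": package_found, "launcher": launcher, "python": sys.executable}


def fallback_command(logdir, port=6006):
    """Build a command that starts TensorBoard via `python -m tensorboard.main`."""
    return [sys.executable, "-m", "tensorboard.main", f"--logdir={logdir}", f"--port={port}"]


if __name__ == "__main__":
    print(diagnose_tensorboard())
```

If `package_found` is true but `launcher` is `None`, the package is installed and only the script is missing from PATH, so running the list returned by `fallback_command` (for example with `subprocess.run`) may be enough without reinstalling anything.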
P.S. The startup command can also take a custom port:
tensorboard --logdir=/xxx --port 6007