YOLO Getting-Started Collection (YOLO Study Notes)

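I. Getting to know the official YOLO site

II. Modes

Overview of the modes

1. Train (training)

Official definition

Example code

Train Settings (training-related arguments)

2. Val (validation)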

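I. Getting to know the official YOLO site

Link to the official YOLO site

Browse it yourself if you are interested; the main parts are Modes and Datasets. (I do not recommend the site's built-in translation, which is very muddled. Use a separate translation tool such as Baidu Translate instead.)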

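II. Modes

Overview of the modes

Click the Modes heading to open that page, then scroll down to the overview. The English paragraph there and the list in the left sidebar show YOLO's main functional modes.

Train mode: fine-tune your model on custom or preloaded datasets. (Training a model)
Val mode: a post-training check that validates model performance. (Validating a model)
Predict mode: apply your model's predictive power to real-world data. (Using a model to make predictions)
Export mode: make your model deployment-ready in various formats. (Converting the model file format)
Track mode: extend your object detection model into real-time tracking applications. (Following objects across frames, mostly used for frame-by-frame video analysis)
Benchmark mode: analyze the speed and accuracy of your model in different deployment environments. (As the name suggests)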
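
To make the list above concrete, here is a minimal sketch of how each mode lines up with a method on an Ultralytics `YOLO` model object. The helper `run_mode` and the stand-in class `EchoModel` are hypothetical names introduced only for illustration; with the real library you would pass a `YOLO(...)` instance instead of the stand-in.

```python
# Each mode name corresponds to a method of the same name on the model object.
MODE_TO_METHOD = {
    "train": "train",          # fit the model on a dataset
    "val": "val",              # evaluate a trained model
    "predict": "predict",      # run inference on new data
    "export": "export",        # convert the model to another format
    "track": "track",          # follow objects across video frames
    "benchmark": "benchmark",  # measure speed and accuracy
}


def run_mode(model, mode, **kwargs):
    """Call the model method that belongs to `mode` (hypothetical helper)."""
    if mode not in MODE_TO_METHOD:
        raise ValueError(f"unknown mode: {mode!r}")
    return getattr(model, MODE_TO_METHOD[mode])(**kwargs)


class EchoModel:
    """Stand-in for a YOLO model so the sketch runs without the library."""

    def val(self, **kwargs):
        return ("val", kwargs)


demo_result = run_mode(EchoModel(), "val", data="coco8.yaml")
```

With the real library the call would look like `run_mode(YOLO("yolov8n.pt"), "val", data="coco8.yaml")`, which is the same as calling `model.val(data="coco8.yaml")` directly.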

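1. Train (training)

Official definition

Training a deep learning model involves feeding it data and adjusting its parameters so that it can make accurate predictions. Train mode in Ultralytics YOLOv8 is designed for effective and efficient training of object detection models, making full use of modern hardware. This guide aims to cover all the details you need to start training your own models with YOLOv8's feature set.

Put simply, this mode is used to train your own model file.

Before going through the code, you first need to know what YOLO's .pt files (model weights) and .yaml files (configuration) are, including their formats and contents, and how a dataset is created and laid out.

For details, see my other article: YOLO Dataset Creation Tutorial, Including Data Annotation (YOLO Study Notes)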
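
As a rough illustration of what a dataset `.yaml` file holds, the sketch below writes one by hand. The keys `path`, `train`, `val`, and `names` follow the layout used by the `coco8.yaml` example; the dataset root `datasets/my_dataset`, the two class names, and the helper `write_dataset_yaml` are made-up placeholders, not values from this article.

```python
import tempfile
from pathlib import Path


def write_dataset_yaml(root, class_names, out_file):
    """Write a small dataset config file (hypothetical helper)."""
    lines = [
        f"path: {root}",        # dataset root directory
        "train: images/train",  # training images, relative to path
        "val: images/val",      # validation images, relative to path
        "names:",               # class index -> class name
    ]
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    out = Path(out_file)
    out.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out


# Write the file into a temporary directory so the sketch leaves nothing behind.
demo_dir = Path(tempfile.mkdtemp())
yaml_path = write_dataset_yaml(
    "datasets/my_dataset", ["cat", "dog"], demo_dir / "my_dataset.yaml"
)
yaml_text = yaml_path.read_text(encoding="utf-8")
```

The resulting file path is what you would pass as the `data` argument when training.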

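Example code

Scroll down to Usage Examples; the example code starts here.

The snippet below is the simplest example (epochs=100 trains for 100 epochs; imgsz=640 sets the target image size to 640, so images are resized to 640*640).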

from ultralytics import YOLO

# Load a model (three alternative ways; in practice keep only one of these lines)
model = YOLO("yolov8n.yaml")  # build a new model from YAML
model = YOLO("yolov8n.pt")  # load a pretrained model (recommended for training)
model = YOLO("yolov8n.yaml").load("yolov8n.pt")  # build from YAML and transfer weights

# Train the model
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)

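Further down, the page also has example code for multi-GPU training and for training on Apple M1/M2 chips; I will not go into detail on those here.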
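
For orientation, multi-GPU and Apple-silicon training are both selected through the `device` argument of `model.train()`. The sketch below shows one way to choose that value; `pick_device` is a hypothetical helper, and the GPU count and MPS flag are passed in by hand rather than detected, so the example runs without PyTorch installed.

```python
def pick_device(cuda_gpus=0, has_mps=False):
    """Return a value suitable for the `device` training argument."""
    if cuda_gpus > 1:
        return list(range(cuda_gpus))  # several GPUs, e.g. [0, 1]
    if cuda_gpus == 1:
        return 0                       # a single GPU
    if has_mps:
        return "mps"                   # Apple silicon (M1/M2)
    return "cpu"                       # fallback


multi_gpu = pick_device(cuda_gpus=2)
apple = pick_device(has_mps=True)
fallback = pick_device()
```

The result would then be passed along, for example `model.train(data="coco8.yaml", epochs=100, imgsz=640, device=multi_gpu)`.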

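Keep scrolling to Resuming Interrupted Trainings.

This part explains that after training is interrupted, you can continue from the last saved checkpoint (last.pt) instead of starting over. The prerequisite is that the interrupted run saved its checkpoints (`save=True`, which is the default), since the checkpoint is what stores the model weights, optimizer state, and epoch count.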

from ultralytics import YOLO

# Load a model
model = YOLO("path/to/last.pt")  # load a partially trained model

# Resume training
results = model.train(resume=True)
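
A small sketch of the same idea with a safety check: confirm that a checkpoint exists before trying to resume. It assumes the run directory keeps its checkpoints under `weights/last.pt`; both that layout and the helper names `find_last_checkpoint` and `resume_training` are assumptions made for illustration.

```python
from pathlib import Path


def find_last_checkpoint(run_dir):
    """Return the path to weights/last.pt inside run_dir, or None if absent."""
    candidate = Path(run_dir) / "weights" / "last.pt"
    return candidate if candidate.is_file() else None


def resume_training(run_dir):
    """Resume an interrupted run from its last checkpoint, if there is one."""
    checkpoint = find_last_checkpoint(run_dir)
    if checkpoint is None:
        raise FileNotFoundError(f"no last.pt found under {run_dir}")
    # Imported lazily so the helper above works without the library installed.
    from ultralytics import YOLO

    model = YOLO(str(checkpoint))    # load the partially trained model
    return model.train(resume=True)  # continue where training stopped


missing = find_last_checkpoint("path/that/does/not/exist")
```

Calling `resume_training(...)` on a real run directory then does the same thing as the two-line official example above, but fails early with a clear message when no checkpoint was saved.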

Train Settings (training-related arguments)
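
The arguments in this section are passed to `model.train()` as keyword arguments, and anything you leave out keeps its default. The sketch below collects a few overrides in a dictionary first; `build_train_args` and `train_with` are hypothetical helpers, and only a handful of defaults are copied in.

```python
def build_train_args(data, **overrides):
    """Start from a few documented defaults and apply the caller's overrides."""
    args = {
        "data": data,     # dataset configuration file
        "epochs": 100,    # total training epochs
        "batch": 16,      # batch size (-1 selects auto mode)
        "imgsz": 640,     # target image size
        "patience": 100,  # early-stopping patience in epochs
    }
    args.update(overrides)
    return args


def train_with(weights, train_args):
    """Train a model with the given arguments."""
    # Imported lazily so build_train_args can be used without the library.
    from ultralytics import YOLO

    return YOLO(weights).train(**train_args)


train_args = build_train_args("coco8.yaml", epochs=50, batch=-1, cos_lr=True)
```

`train_with("yolov8n.pt", train_args)` would then be equivalent to `YOLO("yolov8n.pt").train(data="coco8.yaml", epochs=50, batch=-1, imgsz=640, patience=100, cos_lr=True)`.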

Argument	Default	Description
`model`	`None`	Specifies the model file for training. Accepts a path to either a `.pt` pretrained model or a `.yaml` configuration file. Essential for defining the model structure or initializing weights.
`data`	`None`	Path to the dataset configuration file (e.g., `coco8.yaml`). This file contains dataset-specific parameters, including paths to training and validation data, class names, and number of classes.
`epochs`	`100`	Total number of training epochs. Each epoch represents a full pass over the entire dataset. Adjusting this value can affect training duration and model performance.
`time`	`None`	Maximum training time in hours. If set, this overrides the `epochs` argument, allowing training to automatically stop after the specified duration. Useful for time-constrained training scenarios.
`patience`	`100`	Number of epochs to wait without improvement in validation metrics before early stopping the training. Helps prevent overfitting by stopping training when performance plateaus.
`batch`	`16`	Batch size, with three modes: set as an integer (e.g., `batch=16`), auto mode for 60% GPU memory utilization (`batch=-1`), or auto mode with specified utilization fraction (`batch=0.70`).
`imgsz`	`640`	Target image size for training. All images are resized to this dimension before being fed into the model. Affects model accuracy and computational complexity.
`save`	`True`	Enables saving of training checkpoints and final model weights. Useful for resuming training or model deployment.
`save_period`	`-1`	Frequency of saving model checkpoints, specified in epochs. A value of -1 disables this feature. Useful for saving interim models during long training sessions.
`cache`	`False`	Enables caching of dataset images in memory (`True`/`ram`), on disk (`disk`), or disables it (`False`). Improves training speed by reducing disk I/O at the cost of increased memory usage.
`device`	`None`	Specifies the computational device(s) for training: a single GPU (`device=0`), multiple GPUs (`device=0,1`), CPU (`device=cpu`), or MPS for Apple silicon (`device=mps`).
`workers`	`8`	Number of worker threads for data loading (per `RANK` if Multi-GPU training). Influences the speed of data preprocessing and feeding into the model, especially useful in multi-GPU setups.
`project`	`None`	Name of the project directory where training outputs are saved. Allows for organized storage of different experiments.
`name`	`None`	Name of the training run. Used for creating a subdirectory within the project folder, where training logs and outputs are stored.
`exist_ok`	`False`	If True, allows overwriting of an existing project/name directory. Useful for iterative experimentation without needing to manually clear previous outputs.
`pretrained`	`True`	Determines whether to start training from a pretrained model. Can be a boolean value or a string path to a specific model from which to load weights. Enhances training efficiency and model performance.
`optimizer`	`'auto'`	Choice of optimizer for training. Options include `SGD`, `Adam`, `AdamW`, `NAdam`, `RAdam`, `RMSProp` etc., or `auto` for automatic selection based on model configuration. Affects convergence speed and stability.
`verbose`	`False`	Enables verbose output during training, providing detailed logs and progress updates. Useful for debugging and closely monitoring the training process.
`seed`	`0`	Sets the random seed for training, ensuring reproducibility of results across runs with the same configurations.
`deterministic`	`True`	Forces deterministic algorithm use, ensuring reproducibility but may affect performance and speed due to the restriction on non-deterministic algorithms.
`single_cls`	`False`	Treats all classes in multi-class datasets as a single class during training. Useful for binary classification tasks or when focusing on object presence rather than classification.
`rect`	`False`	Enables rectangular training, optimizing batch composition for minimal padding. Can improve efficiency and speed but may affect model accuracy.
`cos_lr`	`False`	Utilizes a cosine learning rate scheduler, adjusting the learning rate following a cosine curve over epochs. Helps in managing learning rate for better convergence.
`close_mosaic`	`10`	Disables mosaic data augmentation in the last N epochs to stabilize training before completion. Setting to 0 disables this feature.
`resume`	`False`	Resumes training from the last saved checkpoint. Automatically loads model weights, optimizer state, and epoch count, continuing training seamlessly.
`amp`	`True`	Enables Automatic Mixed Precision (AMP) training, reducing memory usage and possibly speeding up training with minimal impact on accuracy.
`fraction`	`1.0`	Specifies the fraction of the dataset to use for training. Allows for training on a subset of the full dataset, useful for experiments or when resources are limited.
`profile`	`False`	Enables profiling of ONNX and TensorRT speeds during training, useful for optimizing model deployment.
`freeze`	`None`	Freezes the first N layers of the model or specified layers by index, reducing the number of trainable parameters. Useful for fine-tuning or transfer learning.
`lr0`	`0.01`	Initial learning rate (e.g. `SGD=1E-2`, `Adam=1E-3`). Adjusting this value is crucial for the optimization process, influencing how rapidly model weights are updated.
`lrf`	`0.01`	Final learning rate as a fraction of the initial rate = (`lr0 * lrf`), used in conjunction with schedulers to adjust the learning rate over time.
`momentum`	`0.937`	Momentum factor for SGD or beta1 for Adam optimizers, influencing the incorporation of past gradients in the current update.
`weight_decay`	`0.0005`	L2 regularization term, penalizing large weights to prevent overfitting.
`warmup_epochs`	`3.0`	Number of epochs for learning rate warmup, gradually increasing the learning rate from a low value to the initial learning rate to stabilize training early on.
`warmup_momentum`	`0.8`	Initial momentum for warmup phase, gradually adjusting to the set momentum over the warmup period.
`warmup_bias_lr`	`0.1`	Learning rate for bias parameters during the warmup phase, helping stabilize model training in the initial epochs.
`box`	`7.5`	Weight of the box loss component in the loss function, influencing how much emphasis is placed on accurately predicting bounding box coordinates.
`cls`	`0.5`	Weight of the classification loss in the total loss function, affecting the importance of correct class prediction relative to other components.
`dfl`	`1.5`	Weight of the distribution focal loss, used in certain YOLO versions to refine bounding box regression.
`pose`	`12.0`	Weight of the pose loss in models trained for pose estimation, influencing the emphasis on accurately predicting pose keypoints.
`kobj`	`2.0`	Weight of the keypoint objectness loss in pose estimation models, balancing detection confidence with pose accuracy.
`label_smoothing`	`0.0`	Applies label smoothing, softening hard labels to a mix of the target label and a uniform distribution over labels, can improve generalization.
`nbs`	`64`	Nominal batch size for normalization of loss.
`overlap_mask`	`True`	Determines whether segmentation masks should overlap during training, applicable in instance segmentation tasks.
`mask_ratio`	`4`	Downsample ratio for segmentation masks, affecting the resolution of masks used during training.
`dropout`	`0.0`	Dropout rate for regularization in classification tasks, preventing overfitting by randomly omitting units during training.
`val`	`True`	Enables validation during training, allowing for periodic evaluation of model performance on a separate dataset.
`plots`	`False`	Generates and saves plots of training and validation metrics, as well as prediction examples, providing visual insights into model performance and learning progression.
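
To show how `lr0`, `lrf`, and `cos_lr` from the table fit together, here is a simplified sketch of a learning rate that starts at `lr0` and ends at `lr0 * lrf`, either linearly or along a cosine curve. It is an illustration of the idea under my own assumptions, not the library's exact scheduler, and it ignores the warmup settings; `lr_at_epoch` is a hypothetical helper.

```python
import math


def lr_at_epoch(epoch, epochs, lr0=0.01, lrf=0.01, cos_lr=False):
    """Learning rate at `epoch`, decaying from lr0 to lr0 * lrf."""
    progress = min(max(epoch / epochs, 0.0), 1.0)  # 0 at start, 1 at end
    if cos_lr:
        decay = (1 - math.cos(math.pi * progress)) / 2  # slow, fast, slow
    else:
        decay = progress                                # straight line
    factor = 1.0 - decay * (1.0 - lrf)                  # goes from 1 to lrf
    return lr0 * factor


linear_schedule = [lr_at_epoch(e, 100) for e in (0, 25, 100)]
cosine_schedule = [lr_at_epoch(e, 100, cos_lr=True) for e in (0, 25, 100)]
```

Both schedules share the same start and end points; the cosine version keeps the learning rate higher early on and drops it faster in the middle of training.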

2. Val (validation)

To be continued...