YOLOv5 Hyperparameters and Optimization Strategies - 猿码集

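1. YOLOv5 Hyperparameters

In deep learning training, hyperparameter settings matter a great deal. YOLOv5 ships with default settings, but tuning the hyperparameters to your own data and requirements can noticeably improve model accuracy.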

1.1 Model

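Model size refers to the size of the network. This article covers three sizes: small, medium, and large. The size is controlled by two multipliers, depth_multiple and width_multiple. In the official repository they are set in the model configuration files (for example models/yolov5s.yaml) and applied when models/yolo.py builds the network. The schematic code below shows the multipliers for the three versions, each with an illustrative backbone whose block count differs: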


import torch.nn as nn  # SPP below stands for a spatial pyramid pooling block; its arguments are omitted in this schematic

# small
depth_multiple = 0.33  # 1.0 for the large model; small is about 1/3 of large
width_multiple = 0.50
backbone = [  # the backbone must be written like this for the blocks to take effect
    nn.Conv2d(3, 32, 3, 1, 1),  # 0
    nn.Conv2d(32, 64, 3, 2, 1),  # 1-P1/2
    nn.CELU(),  # 2
    SPP(),  # 3
    nn.Conv2d(320, 64, 1, 1),  # 4
    nn.Conv2d(64, 64, 3, 1, 1),  # 5
    nn.CELU(),  # 6
    nn.Conv2d(64, 32, 1, 1),  # 7
]  # 8 blocks
# medium
depth_multiple = 0.67  # 1.0 for the large model; medium is about 2/3 of large
width_multiple = 0.75
backbone = [  # the backbone must be written like this for the blocks to take effect
    nn.Conv2d(3, 32, 3, 1, 1),  # 0
    nn.Conv2d(32, 64, 3, 2, 1),  # 1-P1/2
    nn.CELU(),  # 2
    nn.Conv2d(64, 64, 3, 2, 1),  # 3-P2/4
    nn.CELU(),  # 4
    SPP(),  # 5
    nn.Conv2d(768, 128, 1, 1),  # 6
    nn.CELU(),  # 7
    nn.Conv2d(128, 64, 1, 1),  # 8
]  # 9 blocks
# large
depth_multiple = 1.0  # 1.0 for the large model
width_multiple = 1.0
backbone = [  # the backbone must be written like this for the blocks to take effect
    nn.Conv2d(3, 32, 3, 1, 1),  # 0
    nn.Conv2d(32, 64, 3, 2, 1),  # 1-P1/2
    nn.CELU(),  # 2
    nn.Conv2d(64, 64, 3, 2, 1),  # 3-P2/4
    nn.CELU(),  # 4
    nn.Conv2d(64, 128, 3, 2, 1),  # 5-P3/8
    nn.CELU(),  # 6
    SPP(),  # 7
    nn.Conv2d(1536, 256, 1, 1),  # 8
    nn.CELU(),  # 9
    nn.Conv2d(256, 128, 1, 1),  # 10
]  # 11 blocks

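Here, depth_multiple is the scaling factor for the number of layers: it changes how many blocks are stacked, and therefore the overall depth of the network, without changing the input resolution. width_multiple is the scaling factor for the number of channels: it changes the channel count of each layer while the depth of the model stays the same.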
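To make the effect of the two multipliers concrete, here is a minimal sketch of how a repeat count and a channel count could be scaled. The helper names scale_depth and scale_width are hypothetical, and the rounding rules (at least one repeat, channels rounded up to a multiple of 8) are an assumption modelled on how YOLOv5 builds its models; check models/yolo.py in your version before relying on them.

```python
import math


def scale_depth(n, depth_multiple):
    """Scale a block's repeat count; a block with a single repeat is left alone."""
    return max(round(n * depth_multiple), 1) if n > 1 else n


def scale_width(channels, width_multiple, divisor=8):
    """Scale a channel count and round it up to a multiple of `divisor`."""
    return math.ceil(channels * width_multiple / divisor) * divisor


# (depth_multiple, width_multiple) pairs taken from the three versions above
variants = {"small": (0.33, 0.50), "medium": (0.67, 0.75), "large": (1.0, 1.0)}

for name, (dm, wm) in variants.items():
    # A block nominally repeated 9 times with 1024 output channels
    print(name, scale_depth(9, dm), scale_width(1024, wm))
```

With these rules the same nominal layer definition yields a shallower and narrower network for the small version and the full-size network for the large one.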

1.2 Train

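Batch_size is the number of samples fed to the network in each iteration. For a first run you can set it to 2 and adjust it from there. If training runs out of memory, reduce the batch size so that training can proceed. A larger batch size gives less noisy gradient estimates and may also improve accuracy, provided it fits in memory.

Image_size is the image size used for training. YOLOv5's train.py defaults to 640 x 640, and smaller sizes such as 416 x 416 are a common choice when resources are limited. If you run into out-of-memory (OOM) errors, reduce the image size.

Epochs is the number of complete passes over the training set. The YOLOv5 default has been 300 in many releases (check train.py in your version), and the right value depends on the size of the dataset and of the model. Reducing the number of epochs, or starting from pretrained weights, speeds up training.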
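As an illustration of how these three settings are passed to training, the sketch below assembles a train.py command line. The helper build_train_command is hypothetical, the dataset and weight file names are placeholders, and the flags (--img, --batch-size, --epochs, --data, --weights) are the ones train.py accepts in recent YOLOv5 releases as far as I know. Rounding the image size up to a multiple of 32 reflects the assumption that the model's largest stride is 32.

```python
import math


def build_train_command(img_size=640, batch_size=16, epochs=300,
                        data="coco128.yaml", weights="yolov5s.pt", stride=32):
    """Return the argument list for a YOLOv5 training run (hypothetical helper)."""
    # The image size should be a multiple of the model stride, so round up
    img_size = math.ceil(img_size / stride) * stride
    return [
        "python", "train.py",
        "--img", str(img_size),
        "--batch-size", str(batch_size),
        "--epochs", str(epochs),
        "--data", data,
        "--weights", weights,
    ]


# A memory-friendly first run: small images, batch size 2, fewer epochs
print(" ".join(build_train_command(img_size=416, batch_size=2, epochs=100)))
```

The returned list can be handed to subprocess.run from the root of a YOLOv5 checkout.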

1.3 Test

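iou_threshold is the IoU threshold used by non-maximum suppression (NMS). When the IoU of two boxes exceeds this value, they are treated as covering the same object and the box with the lower score is removed. The default of 0.45 is usually fine.

Confidence_threshold is the confidence threshold. The higher the value, the fewer boxes are kept, so precision tends to rise. At the same time recall falls, because correct but low-scoring detections are discarded as well. The default of 0.25 is usually fine.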
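The interplay of the two thresholds can be sketched in a few lines of plain Python. This is a minimal greedy NMS written for illustration, not YOLOv5's own implementation; the function names iou and nms and the (x1, y1, x2, y2) box format are assumptions.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, conf_threshold=0.25, iou_threshold=0.45):
    """Return the indices of the boxes that are kept, highest score first."""
    # Step 1: discard boxes whose score is not above the confidence threshold
    candidates = [i for i, s in enumerate(scores) if s > conf_threshold]
    # Step 2: visit the rest from highest to lowest score
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep = []
    for i in candidates:
        # Keep a box only if it does not overlap a kept box too much
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7, 0.1]
print(nms(boxes, scores))                     # default thresholds
print(nms(boxes, scores, iou_threshold=0.7))  # a looser IoU threshold keeps more boxes
```

Raising conf_threshold shrinks the candidate list before suppression starts, while raising iou_threshold lets more overlapping boxes survive.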

That covers the commonly used YOLOv5 hyperparameters.

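2. Optimization Strategies

Building on the hyperparameters introduced above, the following strategies can help reach better model accuracy.

2.1 Learning Rate

The learning rate is the step size of each gradient-descent update during training. It is usually a small decimal. YOLOv5's default hyperparameters start from 0.01 for SGD (0.001 is a common starting point for Adam), and the value can be adjusted as needed, typically within the range 0.0001 to 0.1.

Use a learning rate scheduler so that different stages of training use different learning rates. The code is as follows: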


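from torch.optim import lr_scheduler
import val  # YOLOv5's validation script

# Assumes hyp, epochs, start_epoch, optimizer (created with lr=hyp['lr0']), model
# and the validation arguments are already defined by the training script
hyp['lr0'] = 0.01  # initial learning rate for the first training run, 60 epochs
hyp['lrf'] = 1e-4  # final learning rate (the rate decays down to lrf)
hyp['momentum'] = 0.937  # SGD momentum
hyp['weight_decay'] = 0.0005  # optimizer weight decay; too large a value causes underfitting
break_epochs = 0  # epochs of break (early stopping): end training early
warmup_epochs = min(round(float(hyp['warmup_epochs']) * epochs), max(3, round(0.05 * epochs)))  # warmup over at most 3-5% of the epochs, during which lr ramps up

def lr_lambda(epoch):
    # GPTLR in the original listing is not a PyTorch or YOLOv5 class;
    # LambdaLR with linear warmup and linear decay is used here as a stand-in
    final_factor = hyp['lrf'] / hyp['lr0']  # LambdaLR multiplies lr0 by the returned factor
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, epochs - warmup_epochs)
    return 1.0 - progress * (1.0 - final_factor)

scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_lambda)
optimizer.zero_grad()
for epoch in range(start_epoch, epochs):  # loop over the epochs
    # ... one epoch of training goes here ...
    scheduler.step()  # update the learning rate once per epoch
    # Print mAP at the chosen interval
    if epoch == start_epoch or epoch % print_interval == 0:
        # Keyword names follow val.run in recent YOLOv5 releases; check your version
        results, maps, _ = val.run(data_dict, batch_size=batch_size,
                                   imgsz=imgsz_test,
                                   model=model,
                                   single_cls=single_cls,
                                   dataloader=val_loader,
                                   save_dir=save_dir,
                                   plots=plot_imgs and epoch == epochs - 1,
                                   conf_thres=conf_thres,
                                   iou_thres=iou_thres,
                                   save_json=save_json,
                                   verbose=verbose and not plot_imgs)
        # Write epoch results
        with open(results_file, 'a') as f:
            f.write(' '.join(f'{x:.4f}' for x in results) + '\n')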
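To see what a schedule does to the learning rate without running a training job, the sketch below evaluates two multiplier shapes in plain Python: a linear decay and a cosine decay. Both are modelled on the lambda functions YOLOv5 passes to LambdaLR, to the best of my knowledge. The names linear_lf and cosine_lf are illustrative, and lrf is treated here as a fraction of lr0 (the convention in YOLOv5's hyperparameter files), which is an assumption of this example.

```python
import math


def linear_lf(epoch, epochs, lrf):
    """Multiplier that falls linearly from 1.0 at epoch 0 to lrf at the last epoch."""
    return (1 - epoch / epochs) * (1.0 - lrf) + lrf


def cosine_lf(epoch, epochs, lrf):
    """Multiplier that follows half a cosine wave from 1.0 down to lrf."""
    return ((1 - math.cos(epoch * math.pi / epochs)) / 2) * (lrf - 1.0) + 1.0


lr0, lrf, epochs = 0.01, 0.01, 300

for epoch in (0, 75, 150, 225, 300):
    linear_lr = lr0 * linear_lf(epoch, epochs, lrf)
    cosine_lr = lr0 * cosine_lf(epoch, epochs, lrf)
    print(f"epoch {epoch:3d}  linear {linear_lr:.6f}  cosine {cosine_lr:.6f}")
```

Both schedules start at lr0 and end at lr0 * lrf. The cosine version stays higher early on and drops faster in the second half of training.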

2.2 CutMix

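CutMix cuts patches out of other images and pastes them into an image to synthesize new training data. The position of each patch is determined by randomly generated coordinates.

Adding this augmentation during training can improve the model's robustness and how efficiently it learns. The CutMix code is as follows: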


#----------------------------------------------------------#
#   In util.py
#----------------------------------------------------------#
import numpy as np
import torch

class CutMix(object):
    def __init__(self, cutmix_prob=1., cutmix_alpha=1., cutmix_beta=1.):
        self.cutmix_prob = cutmix_prob
        self.cutmix_alpha = cutmix_alpha
        self.cutmix_beta = cutmix_beta
    def forward(self, x, y):
        # x: batch of images (N, C, H, W); y: one target per image
        r = np.random.rand(1)
        if self.cutmix_prob > 0 and r < self.cutmix_prob:
            if self.cutmix_beta > 0:
                lam = np.random.beta(self.cutmix_alpha, self.cutmix_beta)
            else:
                lam = self.cutmix_alpha # default cutmix
            # Shuffle on the same device as x instead of assuming CUDA is available
            rand_index = torch.randperm(x.size()[0], device=x.device)
            target_a = y
            target_b = y[rand_index]
            bbx1, bby1, bbx2, bby2 = rand_bbox(x.size(), lam)
            x[:, :, bbx1:bbx2, bby1:bby2] = x[rand_index, :, bbx1:bbx2, bby1:bby2]
            # recompute lam from the area that was actually pasted
            lam = 1 - ((bbx2 - bbx1) * (bby2 - bby1) / (x.size()[-1] * x.size()[-2]))
            return x, target_a, target_b, lam
        else:
            return x, y, y, 1.
    def backward(self, criterion, pred, y, target_a, target_b, lam):
        # Mix the two losses in the same proportion as the two image areas
        if self.cutmix_prob > 0 and lam < 1:
            return lam * criterion(pred, target_a) + (1 - lam) * criterion(pred, target_b)
        else:
            return criterion(pred, y)
def rand_bbox(size, lam):
    # CutMix box sampling: a random box covering about (1 - lam) of the image
    W = size[2]
    H = size[3]
    cut_rat = np.sqrt(1. - lam)
    cut_w = int(W * cut_rat)  # np.int was removed from NumPy; use the built-in int
    cut_h = int(H * cut_rat)
    # uniform
    cx = np.random.randint(W)
    cy = np.random.randint(H)
    bbx1 = np.clip(cx - cut_w // 2, 0, W)
    bby1 = np.clip(cy - cut_h // 2, 0, H)
    bbx2 = np.clip(cx + cut_w // 2, 0, W)
    bby2 = np.clip(cy + cut_h // 2, 0, H)
    
    return bbx1, bby1, bbx2, bby2

2.3 MixUp

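MixUp blends different images at a given ratio during training to generate new training data, which can improve model accuracy.

MixUp is handled the same way as CutMix: the training code checks whether it is enabled, and if MixUp or CutMix is needed, the data is preprocessed accordingly.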
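The blending step described above can be sketched with NumPy as follows. This is an illustration rather than the code of any particular repository: the function name mixup, the label layout (one row per box), and the Beta(32, 32) distribution for the blend ratio are all assumptions. For detection, the labels of both images are kept, because the objects of both remain visible in the blended image.

```python
import numpy as np


def mixup(im1, labels1, im2, labels2, ratio=None, rng=None):
    """Blend two images of identical shape and keep the labels of both.

    im1, im2: uint8 arrays of shape (H, W, C).
    labels1, labels2: arrays with one row per box, e.g. (class, x, y, w, h).
    ratio: weight of im1; drawn from Beta(32, 32) when not given.
    """
    if ratio is None:
        rng = rng or np.random.default_rng()
        ratio = rng.beta(32.0, 32.0)  # concentrated around 0.5
    mixed = im1.astype(np.float32) * ratio + im2.astype(np.float32) * (1 - ratio)
    labels = np.concatenate((labels1, labels2), axis=0)
    return mixed.astype(np.uint8), labels


# Two dummy 4 x 4 RGB images and their box labels
im_a = np.full((4, 4, 3), 100, dtype=np.uint8)
im_b = np.full((4, 4, 3), 200, dtype=np.uint8)
labels_a = np.array([[0, 0.5, 0.5, 0.2, 0.2]])
labels_b = np.array([[1, 0.3, 0.3, 0.1, 0.1], [2, 0.7, 0.7, 0.4, 0.4]])

mixed, labels = mixup(im_a, labels_a, im_b, labels_b, ratio=0.5)
print(mixed[0, 0], labels.shape)
```

Passing an explicit ratio makes the result reproducible. Leaving it out draws a fresh ratio for every call, which is the usual behaviour during training.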

2.4 Soft NMS

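Soft NMS is an improvement on NMS. NMS is widely used in object detection because the same object is often detected several times. Standard NMS computes the IoU (the degree of overlap) between two boxes, and if the IoU is high it deletes the box with the lower score. Soft NMS instead lowers the score of an overlapping box according to how much it overlaps, rather than deleting it outright. A box is dropped only when its lowered score falls below the confidence threshold. More boxes are therefore kept, so that smaller objects are more likely to be detected as well.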
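The score-decay idea can be illustrated with a short, self-contained sketch of the Gaussian variant of Soft NMS in plain Python. It is not taken from YOLOv5; the function names, the (x1, y1, x2, y2) box format, and the default values of sigma and score_threshold are assumptions chosen for the example.

```python
import math


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.25):
    """Gaussian Soft NMS: return (index, decayed score) pairs, best first."""
    scores = list(scores)  # work on a copy so the caller's list is untouched
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        if scores[best] < score_threshold:
            break  # every remaining box has decayed below the threshold
        remaining.remove(best)
        kept.append((best, scores[best]))
        # Decay the scores of the other boxes instead of deleting them
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
for index, score in soft_nms(boxes, scores):
    print(index, round(score, 4))
```

In this example the second box overlaps the first heavily, so its score is reduced but it still survives, whereas standard NMS with an IoU threshold of 0.45 would have removed it.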

The NMS module used with yolov5 is implemented as follows (the listing stops after the IoU sorting step):


import torch
import torch.nn as nn

class NMS(nn.Module):
    def __init__(self, anchors, num_classes, conf_thres=0.1, nms_thres=0.6, stride=1):
        super(NMS, self).__init__()
        self.stride = stride  # assumed parameter: the original read self.stride before it was set
        self.anchors = torch.Tensor(anchors) / self.stride  # scaling anchors
        self.num_anchors = self.anchors.shape[0]  # number of anchors
        self.num_classes = num_classes  # number of classes
        self.conf_thres = conf_thres  # read by forward()
        self.ignore_thres = conf_thres
        self.obj_thresh = conf_thres
        self.nms_thresh = nms_thres
        self.label_smooth_eps = 0.15  # label smoothing
        self._build_extensions()
    def forward(self, p, img_size, augment=False):
        # non_max_suppression is assumed to be a custom helper that accepts these arguments
        return non_max_suppression(p, self.conf_thres, self.nms_thresh, self.num_classes, self.anchors.to(p.device),
                                   img_size, augment=augment, label_smoothing=self.label_smooth_eps)
    def _build_extensions(self):
        # ----------- NMS ----------
        def xx(ltrb, conf, min_wh=None, multi_label=True, classes=None):  # NMS globally across classes
            default_anchors = [12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192,
                               243, 459, 401]  # P3-P7
            conf = conf.sigmoid().squeeze() if conf is not None else None
            # if multi_label and conf is not None and conf.ndim > 1:
            #     pred, conf = pred[conf > self.conf_thres], conf[conf > self.conf_thres]
            #     if not len(conf):  # no boxes
            #         return []
            l, t, r, b = ltrb.unbind(1)
            areas = (r - l) * (b - t)
            if min_wh is not None:  # limit anchor boxes to ignore areas with < min_wh pixels
                mask = areas >= min_wh
                l, t, r, b = l[mask], t[mask], r[mask], b[mask]
                if conf is not None:
                    conf = conf[mask]
                # if multi_label:
                #     pred = pred[mask]
                areas = areas[mask]
            if not len(areas):  # no boxes
                return []
            if conf is None:
                conf = torch.ones((len(l),), device=l.device)
            # compute sort order and IoU
            order = torch.argsort(conf)  # order index by objectness
            l = l[order]
            t = t[order]
            r = r[order]
            b = b[order]
            areas = areas[order]
            # if multi_label:
            #     pred = pred[order]
            if classes is not None:  # filter by class
                class_indices = (classes[order] == classes[:, None]).max(-1)[0]  # binary mask
                l = l[class_indices]
                t = t[class_indices]
                r = r[class_indices]
                b = b[class_indices]
                # if multi_label:
                #     pred = pred[class_indices]
                areas = areas[class_indices]
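            boxes = torch.stack((l, t, r, b), dim=1)  # (n, 4) tensor in x1, y1, x2, y2 order
            iou = box_iou(boxes, boxes).tril_(diagonal=-1)  # box_iou takes (n, 4) tensors, not tuples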
            iou, indices = iou.sort(descending=True)
