Modular Code for Object Detection [Continuously Updated] - "Deep Learning for Image Processing"

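1. Bbox area computation
2. Bbox center computation
3. IoU (intersection over union) between bboxes
4. Converting an image to the model input format
5. Filtering detection results
6. Multi-process inference
7. Smoothing detection results over multiple frames
0. Miscellaneous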

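During project development, some pieces of code turn out to be called repeatedly. They can be collected in a single utils file to reduce redundancy in the rest of the codebase.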

1. Bbox area computation

import numpy as np

def compute_area(boxes):
    '''
    Args
        boxes: (N, 4) ndarray of float, each row y1, x1, y2, x2
    Returns
        (N,) ndarray of float64 with the area of each box
    '''
    areas = np.zeros(len(boxes), dtype=np.float64)
    for i in range(len(boxes)):
        # height * width = (y2 - y1) * (x2 - x1)
        areas[i] = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
    return areas
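
The loop above can also be written as a single vectorized NumPy expression, which avoids the Python-level loop for large N. The sketch below is an assumed alternative rather than part of the original code; the name `compute_area_vectorized` is hypothetical, and the input is assumed to be a non-empty (N, 4) array.

```python
import numpy as np

def compute_area_vectorized(boxes):
    """Area of each box; boxes is (N, 4) with rows y1, x1, y2, x2.

    Hypothetical helper for illustration.
    """
    boxes = np.asarray(boxes, dtype=np.float64)
    heights = boxes[:, 2] - boxes[:, 0]
    widths = boxes[:, 3] - boxes[:, 1]
    return heights * widths

boxes = np.array([[0, 0, 2, 3], [1, 1, 4, 5]], dtype=np.float64)
print(compute_area_vectorized(boxes))  # areas 6 and 12
```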

2. Bbox center computation

def get_center(boxes):
    '''
    Args
        boxes: (N, 4) ndarray of float, each row x1, y1, x2, y2
    Returns
        list of N (cx, cy) tuples
    '''
    # Midpoint of each box along x and y
    centers = [((box[0] + box[2]) / 2, (box[1] + box[3]) / 2) for box in boxes]
    return centers

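3. IoU (intersection over union) between bboxes

import numpy as np

def compute_overlap(boxes1, boxes2):
    """
    Args
        boxes1: (N, 4) ndarray of float
        boxes2: (K, 4) ndarray of float
    Returns
        overlaps: (N, K) ndarray of IoU between boxes1 and boxes2
    """
    N = boxes1.shape[0]
    K = boxes2.shape[0]
    # Pairs that do not intersect keep an IoU of 0
    overlaps = np.zeros((N, K), dtype=np.float64)
    for k in range(K):
        box_area = (
            (boxes2[k, 2] - boxes2[k, 0]) *
            (boxes2[k, 3] - boxes2[k, 1])
        )
        for n in range(N):
            # Intersection extent along the first axis
            iw = (
                min(boxes1[n, 2], boxes2[k, 2]) -
                max(boxes1[n, 0], boxes2[k, 0])
            )
            if iw > 0:
                # Intersection extent along the second axis
                ih = (
                    min(boxes1[n, 3], boxes2[k, 3]) -
                    max(boxes1[n, 1], boxes2[k, 1])
                )
                if ih > 0:
                    # Union area = area1 + area2 - intersection
                    ua = np.float64(
                        (boxes1[n, 2] - boxes1[n, 0]) *
                        (boxes1[n, 3] - boxes1[n, 1]) +
                        box_area - iw * ih
                    )
                    overlaps[n, k] = iw * ih / ua
    return overlaps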

def compute_iou(boxes_a, boxes_b):
    """
    Compute IoU with numpy broadcasting.
    Rows are y1, x1, y2, x2; the result is identical for x1, y1, x2, y2
    (only the meaning of the h/w variable names is swapped).
    :param boxes_a: [N,4]
    :param boxes_b: [M,4]
    :return:  IoU [N,M]
    """
    import numpy as np
    # Add axes so that the two arrays broadcast against each other
    boxes_a = np.expand_dims(boxes_a, axis=1)  # (N,1,4)
    boxes_b = np.expand_dims(boxes_b, axis=0)  # (1,M,4)
    # Intersection extent along height and width, clipped at 0 for disjoint boxes
    overlap_h = np.maximum(0.0, np.minimum(boxes_a[..., 2], boxes_b[..., 2]) - np.maximum(boxes_a[..., 0], boxes_b[..., 0]))  # (N,M)
    overlap_w = np.maximum(0.0, np.minimum(boxes_a[..., 3], boxes_b[..., 3]) - np.maximum(boxes_a[..., 1], boxes_b[..., 1]))  # (N,M)
    # Element-wise product gives the intersection area
    overlap = overlap_w * overlap_h
    # Area of each box
    area_a = (boxes_a[..., 2] - boxes_a[..., 0]) * (boxes_a[..., 3] - boxes_a[..., 1])
    area_b = (boxes_b[..., 2] - boxes_b[..., 0]) * (boxes_b[..., 3] - boxes_b[..., 1])
    # IoU = intersection / union
    iou = overlap / (area_a + area_b - overlap)
    return iou
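
As a quick sanity check for either implementation, a single pair of boxes can be worked by hand. The sketch below uses a hypothetical pure-Python helper, `iou_pair`, which also returns 0 when the union is empty, a guard that `compute_iou` does not include.

```python
def iou_pair(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2). Hypothetical helper."""
    # Intersection width and height, clipped at 0 for disjoint boxes
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Two zero-area boxes have an empty union; report 0 instead of dividing
    return inter / union if union > 0 else 0.0

# Two 2x2 boxes offset by 1: intersection 1, union 4 + 4 - 1 = 7
print(iou_pair((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7, about 0.143
```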

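4. Converting an image to the model input format

def load_image_into_numpy_array(image_path):
    """
    :param image_path: path to the image file
    :return: (300, 300, 3) uint8 ndarray in RGB channel order
    """
    import cv2
    import numpy as np
    from PIL import Image
    image = Image.open(image_path).convert('RGB')
    # Note: cv2 uses BGR channel order, while TensorFlow and PyTorch models usually expect RGB.
    # The image is loaded with PIL here, so it is already RGB; cv2.resize does not reorder channels.
    image = cv2.resize(np.array(image), (300, 300)).astype(np.uint8)
    return image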
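
If the image is read with `cv2.imread` instead of PIL, the array arrives in BGR order and has to be converted before it is passed to an RGB model. Below is a minimal sketch of that conversion on a plain NumPy array, assuming an (H, W, 3) layout. The name `bgr_to_rgb` is hypothetical; `cv2.cvtColor(image, cv2.COLOR_BGR2RGB)` is the OpenCV equivalent.

```python
import numpy as np

def bgr_to_rgb(image):
    """Reverse the channel axis of an (H, W, 3) array. Hypothetical helper."""
    # image[..., ::-1] is a negative-stride view; ascontiguousarray copies it
    # into an ordinary contiguous array
    return np.ascontiguousarray(image[..., ::-1])

bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255          # pure blue in BGR order
rgb = bgr_to_rgb(bgr)
print(rgb[0, 0])           # blue is now the last channel
```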

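5. Filtering detection results

def filter_detect_boxes(boxes, scores, classes, min_score, max_boxes):
    """
    :param boxes: (N, 4) ndarray of bboxes
    :param scores: (N,) ndarray of confidence scores
    :param classes: (N,) ndarray of class ids
    :param min_score: minimum score threshold
    :param max_boxes: maximum number of bboxes kept per frame
    :return: filtered boxes, scores, classes
    """
    # Boolean mask of detections at or above the threshold
    idx = scores >= min_score
    boxes = boxes[idx]
    classes = classes[idx]
    scores = scores[idx]
    # Keeps the first max_boxes entries in input order
    return boxes[:max_boxes], scores[:max_boxes], classes[:max_boxes]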
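
Note that slicing with `[:max_boxes]` keeps the first entries in input order, which matches the highest scores only if the detector already returns results sorted by descending score. When that is not guaranteed, the sketch below sorts first. `filter_top_boxes` is a hypothetical name, and the inputs are assumed to be NumPy arrays of equal length.

```python
import numpy as np

def filter_top_boxes(boxes, scores, classes, min_score, max_boxes):
    """Keep at most max_boxes detections with score >= min_score, best first."""
    # Indices of detections that pass the threshold
    keep = np.flatnonzero(scores >= min_score)
    # Sort the surviving indices by descending score (stable for ties)
    order = keep[np.argsort(-scores[keep], kind="stable")][:max_boxes]
    return boxes[order], scores[order], classes[order]

boxes = np.array([[0, 0, 1, 1], [0, 0, 2, 2], [0, 0, 3, 3], [0, 0, 4, 4]])
scores = np.array([0.4, 0.9, 0.2, 0.7])
classes = np.array([1, 2, 3, 4])
b, s, c = filter_top_boxes(boxes, scores, classes, min_score=0.3, max_boxes=2)
print(s, c)  # scores 0.9 and 0.7, classes 2 and 4
```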

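6. Multi-process inference

# On platforms that spawn new processes (Windows, macOS), run this under `if __name__ == "__main__":`.
import multiprocessing
print("start to detect  ......")
processes = []
num_processes = 12  # number of processes; can be adapted to the number of images or videos to process
for i in range(num_processes):
    # Process i takes every num_processes-th image, starting at index i
    images = [image_path for idx, image_path in enumerate(TEST_IMAGE_PATHS) if idx % num_processes == i]
    # p = multiprocessing.Process(target=function_name, args=(arguments of that function, including the list of images to detect))
    p = multiprocessing.Process(target=save_result, args=(images, boxes_path, classes_path, vis_output_path))
    p.start()
    processes.append(p)
# Wait for all processes to finish (outside the launch loop, so they run in parallel)
for p in processes:
    p.join()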
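
The fixed count of 12 can be replaced by a value derived from the workload and the machine, as the comment suggests. The sketch below shows one possible rule, assumed here for illustration: never start more processes than there are items or CPU cores. Both `choose_num_workers` and `split_round_robin` are hypothetical helpers, and the file names are placeholders.

```python
import os

def choose_num_workers(num_items, max_workers=None):
    """Pick a process count bounded by the item count and the CPU count."""
    if max_workers is None:
        max_workers = os.cpu_count() or 1
    return max(1, min(num_items, max_workers))

def split_round_robin(items, num_workers):
    """Give worker i every num_workers-th item, starting at index i."""
    return [items[i::num_workers] for i in range(num_workers)]

paths = ["img%d.jpg" % i for i in range(5)]   # placeholder file names
n = choose_num_workers(len(paths), max_workers=2)
print(split_round_robin(paths, n))
# [['img0.jpg', 'img2.jpg', 'img4.jpg'], ['img1.jpg', 'img3.jpg']]
```

Each chunk can then be passed as the `images` argument of one `multiprocessing.Process`.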

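7. Smoothing detection results over multiple frames

window_detect = []  # must be defined outside the per-frame loop
# Data processing omitted
boxes, scores, classes = inference_and_filter(image)
window_detect.append(classes)  # or boxes, scores
# window_size is the number of frames in the sliding window
if len(window_detect) == window_size:
    # Apply the smoothing logic here, e.g. drop occasional false alarms,
    # or use the sliding-window mean as the score for the current frame
    window_detect.pop(0)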
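0. Miscellaneous

Detection results obtained on a region of interest (ROI) can be visualized on the original image directly after a coordinate transformation, that is, by adding the ROI's offset within the original image.
If the ROI was deformed (for example resized) before detection, first apply the inverse of that deformation to map the results back into the ROI, then place them on the original image.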
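
Both cases can be sketched as a single helper, assuming the ROI is an axis-aligned crop and the only deformation is a resize. The function name `roi_boxes_to_image` and its parameters are hypothetical; boxes are taken to be x1, y1, x2, y2 in the pixel coordinates of the (possibly resized) ROI.

```python
import numpy as np

def roi_boxes_to_image(boxes, roi_xyxy, resized_wh=None):
    """Map boxes detected inside an ROI back to original image coordinates.

    boxes:      (N, 4) array, x1, y1, x2, y2 in the ROI's coordinates
    roi_xyxy:   (x1, y1, x2, y2) of the ROI in the original image
    resized_wh: (width, height) the ROI was resized to before detection,
                or None if it was used at its original size
    """
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    rx1, ry1, rx2, ry2 = roi_xyxy
    if resized_wh is not None:
        # Inverse of the resize: scale back to the ROI's original size
        sx = (rx2 - rx1) / resized_wh[0]
        sy = (ry2 - ry1) / resized_wh[1]
        boxes = boxes * np.array([sx, sy, sx, sy])
    # Translate by the ROI's top-left corner
    return boxes + np.array([rx1, ry1, rx1, ry1])

# ROI of 200x100 px at (50, 20), resized to 400x200 before detection
out = roi_boxes_to_image([[40, 20, 200, 100]], (50, 20, 250, 120), (400, 200))
print(out)  # box (70, 30, 150, 70) in the original image
```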