1. Introduction and Installation
This article introduces the TensorFlow Object Detection API (link). The API can detect objects in images or video streams, either using a pretrained recognition model or a model trained on classes you define yourself. First, we need to install TensorFlow and its dependencies. On a CPU-only machine, pip3 install tensorflow is enough. On a machine with a GPU, computation is much faster; install as follows:
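pip3 install tensorflow-gpu
Note that the GPU build additionally requires a working CUDA and cuDNN installation.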
In addition, you may need Python 3.7 and the following Python dependencies installed beforehand:
pip3 install pillow
pip3 install lxml
pip3 install jupyter
pip3 install matplotlib
Next, clone the TensorFlow Object Detection API project from GitHub, or download it directly as a zip archive:
git clone https://github.com/tensorflow/models.git
Once the repository is downloaded, go into its directory and open a terminal (cmd on Windows):
On Windows, download protoc-3.4.0-win32.zip (link); after extracting it, the bin directory contains the protoc executable.
Next, from the models directory, we can use protoc to compile the API's .proto files:
"C:/Program Files/protoc/bin/protoc" object_detection/protos/*.proto --python_out=.
Pay attention to the directory you compile from: the command must be run from the directory that contains object_detection/protos.
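On Linux or macOS, assuming protoc is installed and on your PATH, the equivalent command from the same directory is:
protoc object_detection/protos/*.proto --python_out=.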
Finally, open a terminal (or cmd.exe) in the models/object_detection directory, start jupyter notebook, and open object_detection_tutorial.ipynb. Choose Run All, and you should see results like the following:
2. Object Detection on a Video Stream
This part shows how to turn the sample code into a program that detects objects from a webcam feed. First, create a new file detect1.py in the object_detection directory. If you would rather work inside the notebook, that is fine too.
The detection works as follows: we download Google's pretrained model ssd_mobilenet_v1_coco_11_06_2017 (link), which has been trained on roughly 90 classes, including person, cup, handbag, and so on. The code below uses OpenCV to capture the video stream and TensorFlow to load the SSD model and run detection.
Next, we use the OpenCV package to capture the video stream. If you have not installed Python OpenCV yet, see link:
import cv2
cap = cv2.VideoCapture(0)
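The argument 0 selects the first camera; if your machine has several cameras (or the index is different), a quick sanity check, as a minimal sketch, is:
import cv2

cap = cv2.VideoCapture(0)  # try 1, 2, ... if index 0 is not your camera
if not cap.isOpened():
    raise RuntimeError('Could not open camera 0')
ret, frame = cap.read()  # grab a single frame
print('Frame captured:', ret, None if frame is None else frame.shape)
cap.release()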
The full code:
import os
import cv2
import numpy as np
import tensorflow as tf

from utils import label_map_util
from utils import visualization_utils as vis_util

cap = cv2.VideoCapture(0)

# http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_11_06_2017.tar.gz
MODEL_NAME = 'ssd_mobilenet_v1_coco_11_06_2017'
MODEL_FILE = MODEL_NAME + '.tar.gz'

# Path to the frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

# List of the strings that is used to add the correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'mscoco_label_map.pbtxt')

# Load the frozen graph into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        while True:
            ret, image_np = cap.read()
            # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            # Each box represents a part of the image where a particular object was detected.
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            # Each score represents the level of confidence for each of the objects.
            # The score is shown on the result image, together with the class label.
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            classes = detection_graph.get_tensor_by_name('detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name('num_detections:0')
            # Actual detection.
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes, scores, classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8)
            cv2.imshow('object detection', cv2.resize(image_np, (800, 600)))
            # Press q to quit.
            if cv2.waitKey(25) & 0xFF == ord('q'):
                cap.release()
                cv2.destroyAllWindows()
                break
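Save the file and run python3 detect1.py from the object_detection directory, so that the utils imports and the data/mscoco_label_map.pbtxt path resolve; a window should open showing the annotated camera feed.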
3. Custom Object Detection
In this example, we will train a model to recognize cranes. First, search Baidu for some sample images of cranes. The more samples you have, the more accurate the model will be; since this example is only a demonstration, we download just a dozen or so images. Next, go into the models/research directory and install the object_detection API locally:
python setup.py install
Create a new object_detection working directory, and for now put everything related to training inside it:
When this part is finished, the directory structure will look like this:
The images directory holds the images of the class you want to detect, cranes in this case:
3.1 Annotating the Dataset
This article uses a supervised learning algorithm, so the samples must be labeled. After downloading the images, we mark where the cranes are in each image, i.e. assign labels. The labeling tool used here is LabelImg (link). Save the annotation XML files together with the images in the images directory.
When installing the LabelImg tool, you may also need to install some dependencies:
pip3 install sip
apt-get install pyqt5-dev-tools
pip3 install -r requirements/requirements-linux-python3.txt
make qt5py3
pip3 install pandas
python3 labelImg.py
pip3 install pycocotools
Open the image folder and label the images:
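For reference, LabelImg saves each annotation as a Pascal VOC-style XML file next to the image; a minimal sketch of one annotation (the filename and coordinates are illustrative) looks like this, and this is exactly the structure that the xml_to_csv.py script in the next section parses:
<annotation>
    <filename>crane001.jpg</filename>
    <size>
        <width>800</width>
        <height>600</height>
        <depth>3</depth>
    </size>
    <object>
        <name>crane</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>120</xmin>
            <ymin>80</ymin>
            <xmax>560</xmax>
            <ymax>480</ymax>
        </bndbox>
    </object>
</annotation>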
3.2 Building the TFRecord Files for Training
3.2.1 Download the xml_to_csv tool
Download the xml_to_csv.py file from https://github.com/datitran/raccoon_dataset; it is the base code for converting the XML annotations into CSV files.
Adapt it slightly to point at our local data directory:
import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET


def xml_to_csv(path):
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df


def main():
    for directory in ['train', 'test']:
        image_path = os.path.join(os.getcwd(), 'images/{}'.format(directory))
        xml_df = xml_to_csv(image_path)
        xml_df.to_csv('data/{}_labels.csv'.format(directory), index=None)
    print('Successfully converted xml to csv.')


main()
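Before running it, create the data directory and split the labeled images (together with their XML files) into images/train and images/test; running python3 xml_to_csv.py then writes data/train_labels.csv and data/test_labels.csv.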
3.2.2 Converting the CSV Files to TFRecord Files
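The generate_tfrecord.py script comes from the same raccoon_dataset repository as xml_to_csv.py. Before running it, edit its class_text_to_int function so that it maps our label instead of 'raccoon'; a minimal sketch for the single crane class:
def class_text_to_int(row_label):
    # Map the class label from the CSV to the numeric id used in the label map.
    if row_label == 'crane':
        return 1
    else:
        return None
With that change in place, generate the TFRecord for the test split: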
python3 generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=data/test.record --image_dir=images/test
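Run it again for the training split, which the config file expects at data/train.record:
python3 generate_tfrecord.py --csv_input=data/train_labels.csv --output_path=data/train.record --image_dir=images/train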
The result looks like this:
3.3 Training the Model
In this part, we start training our custom model. We need the training-set and test-set TFRecords, plus a model configuration file and a base model. With these, we can use TensorFlow's transfer learning to continue training from a pretrained model toward the model we need (link). Transfer learning converges quickly and needs relatively little training data. TensorFlow provides a number of pretrained models with matching configuration files; this example uses the ssd_mobilenet_v1_coco_11_06_2017 model and the corresponding ssd_mobilenet_v1_pets.config configuration file:
3.3.1 The Pretrained Model Files
The model files encapsulate the base detection network; using the object_detection API, we fine-tune this base model into the model we are interested in. Download the model directly:
wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_11_06_2017.tar.gz
After downloading, extract it into the ssd_mobilenet_v1_coco_11_06_2017 directory:
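On Linux, for example, this is just (the archive unpacks into a top-level ssd_mobilenet_v1_coco_11_06_2017 directory):
tar -xzvf ssd_mobilenet_v1_coco_11_06_2017.tar.gz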
3.3.2 The Model Configuration File
wget https://raw.githubusercontent.com/tensorflow/models/master/object_detection/samples/configs/ssd_mobilenet_v1_pets.config
Download the config file and put it in the training directory:
In the original config file, first search for every PATH_TO_BE_CONFIGURED entry and replace it.
You also need to set the batch size; here it is 24, but other models may need a different batch size. If you run into memory errors, reduce the batch size. Finally, set the checkpoint path, change num_classes to 1, change num_examples to 12, and set label_map_path: "training/object-detection.pbtxt".
# SSD with Mobilenet v1, configured for Oxford-IIIT Pets Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    # Set the number of classes to 1.
    num_classes: 1
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "ssd_mobilenet_v1_coco_11_06_2017/model.ckpt"
  from_detection_checkpoint: true
  load_all_detection_checkpoint_vars: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "data/train.record"
  }
  label_map_path: "training/object-detection.pbtxt"
}

eval_config: {
  metrics_set: "coco_detection_metrics"
  num_examples: 12
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/test.record"
  }
  label_map_path: "training/object-detection.pbtxt"
  shuffle: false
  num_readers: 1
}
The contents of object-detection.pbtxt in the training directory:
item {
  id: 1
  name: 'crane'
}
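Note that id must match num_classes: 1 in the config, and name must be exactly the class label used when annotating and in generate_tfrecord.py ('crane' here); otherwise the records and the label map will not line up.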
Back in the models/research directory, build and install the API and add the slim directory to PYTHONPATH:
sudo python3 setup.py build
sudo python3 setup.py install
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
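Note that the export only lasts for the current shell session; add the line to ~/.bashrc if you want it to persist.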
3.3.3 Training
Now for the exciting moment: start training!
python3 model_main.py --logtostderr --model_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_pets.config
If you see output like the following, training is running:
After training for a while, the output looks like this:
Look at the total_loss of the training run in TensorBoard:
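If TensorBoard is not running yet, start it against the same directory the checkpoints and event files are written to (training/ in our setup) and open http://localhost:6006:
tensorboard --logdir=training/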
3.4 Testing the Model
In this part, we test the trained model to see whether it behaves as expected. First, we export the model's inference graph so the test program can load it directly. We can use export_inference_graph.py from object_detection; we only need to pass it the checkpoint and the pipeline config file to export the graph:
python3 export_inference_graph.py \
--input_type image_tensor \
--pipeline_config_path training/ssd_mobilenet_v1_pets.config \
--trained_checkpoint_prefix training/model.ckpt-10856 \
--output_directory crane_inference_graph
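The crane_inference_graph output directory now contains frozen_inference_graph.pb. To try the crane detector live, a minimal sketch is to reuse the webcam script from section 2 and change only the model and label paths (the checkpoint step 10856 will differ on your machine):
# Hypothetical tweak to detect1.py from section 2, assuming the layout used above.
MODEL_NAME = 'crane_inference_graph'
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'
PATH_TO_LABELS = os.path.join('training', 'object-detection.pbtxt')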