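Caffe (Convolutional Architecture for Fast Feature Embedding) is a convolutional neural network framework implemented in C++/CUDA/Python by the Berkeley Vision and Learning Center (BVLC).
Its predecessor is DeCAF; the author is Yangqing Jia.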
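Advantages of Caffe
1) Fast
2) Well suited to feature extraction
3) Open source
4) Ships a set of tools for model training, prediction, fine-tuning, and so on
5) Well-organized code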
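Caffe dependencies
1) Protocol Buffers (protobuf): developed by Google; a serialization format and interface for exchanging structured data between memory and non-volatile storage
2) Boost: a quasi-standard C++ library
3) gflags: parses command-line arguments in Caffe
4) glog: a utility library developed by Google for application logging
5) BLAS: Basic Linear Algebra Subprograms
6) HDF5: a data format for efficiently storing and distributing scientific data
7) OpenCV: an open-source computer vision library
8) LMDB: a memory-mapped database manager
9) Snappy: a C++ library for compression and decompression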
Caffe implementation
1) Convert the data to LMDB (improves I/O utilization)
2) Model file
train.prototxt
name: "ssdJacintoNetV2"
layer {
  name: "data"
  type: "AnnotatedData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    mean_value: 0
    mean_value: 0
    mean_value: 0
    force_color: false
    resize_param {
      prob: 1
      resize_mode: WARP
      height: 160 #160
      width: 320 #320
      interp_mode: LINEAR
      interp_mode: AREA
      interp_mode: NEAREST
      interp_mode: CUBIC
      interp_mode: LANCZOS4
    }
    emit_constraint {
      emit_type: CENTER
    }
    distort_param {
      brightness_prob: 0.5
      brightness_delta: 32
      contrast_prob: 0.5
      contrast_lower: 0.5
      contrast_upper: 1.5
      hue_prob: 0.5
      hue_delta: 18
      saturation_prob: 0.5
      saturation_lower: 0.5
      saturation_upper: 1.5
      random_order_prob: 0.0
    }
    expand_param {
      prob: 0.5
      max_expand_ratio: 4.0
    }
  }
  data_param {
    source: "/home/caffe/examples/bsd_ti_5l_320/bsd_left/lmdb/bsd_left_trainval_lmdb"
    batch_size: 16
    backend: LMDB
  }
  annotated_data_param {
    batch_sampler {
      max_sample: 1
      max_trials: 1
    }
    batch_sampler {
      sampler {
        min_scale: 0.3
        max_scale: 1.0
        min_aspect_ratio: 0.5
        max_aspect_ratio: 2.0
      }
      sample_constraint {
        min_jaccard_overlap: 0.1
      }
      max_sample: 1
      max_trials: 50
    }
    batch_sampler {
      sampler {
        min_scale: 0.3
        max_scale: 1.0
        min_aspect_ratio: 0.5
        max_aspect_ratio: 2.0
      }
      sample_constraint {
        min_jaccard_overlap: 0.3
      }
      max_sample: 1
      max_trials: 50
    }
    batch_sampler {
      sampler {
        min_scale: 0.3
        max_scale: 1.0
        min_aspect_ratio: 0.5
        max_aspect_ratio: 2.0
      }
      sample_constraint {
        min_jaccard_overlap: 0.5
      }
      max_sample: 1
      max_trials: 50
    }
    batch_sampler {
      sampler {
        min_scale: 0.3
        max_scale: 1.0
        min_aspect_ratio: 0.5
        max_aspect_ratio: 2.0
      }
      sample_constraint {
        min_jaccard_overlap: 0.7
      }
      max_sample: 1
      max_trials: 50
    }
    batch_sampler {
      sampler {
        min_scale: 0.3
        max_scale: 1.0
        min_aspect_ratio: 0.5
        max_aspect_ratio: 2.0
      }
      sample_constraint {
        min_jaccard_overlap: 0.9
      }
      max_sample: 1
      max_trials: 50
    }
    batch_sampler {
      sampler {
        min_scale: 0.3
        max_scale: 1.0
        min_aspect_ratio: 0.5
        max_aspect_ratio: 2.0
      }
      sample_constraint {
        max_jaccard_overlap: 1.0
      }
      max_sample: 1
      max_trials: 50
    }
    label_map_file: "./labelmap_bsd3class.prototxt"
  }
}
layer {
  name: "data"
  type: "AnnotatedData"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param
{mean_value: 0mean_value: 0mean_value: 0force_color: falseresize_param {prob: 1resize_mode: WARPheight: 160 #160width: 320 #320interp_mode: LINEAR}}data_param {source: "/home/caffe/examples/bsd_ti_5l_320/bsd_left/lmdb/bsd_left_test_lmdb"batch_size: 12backend: LMDB}annotated_data_param {batch_sampler {}label_map_file: "./labelmap_bsd3class.prototxt"}}layer {name: "data/bias"type: "Bias"bottom: "data"top: "data/bias"param {lr_mult: 0decay_mult: 0}bias_param {filler {type: "constant"value: -128}}}layer {name: "conv1a"type: "Convolution"bottom: "data/bias"top: "conv1a"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 32bias_term: truepad: 2kernel_size: 5group: 1stride: 2weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "conv1a/bn"type: "BatchNorm"bottom: "conv1a"top: "conv1a"batch_norm_param {moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer {name: "conv1a/relu"type: "ReLU"bottom: "conv1a"top: "conv1a"}layer {name: "conv1b"type: "Convolution"bottom: "conv1a"top: "conv1b"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 32bias_term: truepad: 1kernel_size: 3group: 4stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "conv1b/bn"type: "BatchNorm"bottom: "conv1b"top: "conv1b"batch_norm_param {moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer {name: "conv1b/relu"type: "ReLU"bottom: "conv1b"top: "conv1b"}layer {name: "pool1"type: "Pooling"bottom: "conv1b"top: "pool1"pooling_param {pool: MAXkernel_size: 2stride: 2}}layer {name: "res2a_branch2a"type: "Convolution"bottom: "pool1"top: "res2a_branch2a"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 64bias_term: truepad: 1kernel_size: 3group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "res2a_branch2a/bn"type: 
"BatchNorm"bottom: "res2a_branch2a"top: "res2a_branch2a"batch_norm_param {moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer {name: "res2a_branch2a/relu"type: "ReLU"bottom: "res2a_branch2a"top: "res2a_branch2a"}layer {name: "res2a_branch2b"type: "Convolution"bottom: "res2a_branch2a"top: "res2a_branch2b"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 64bias_term: truepad: 1kernel_size: 3group: 4stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "res2a_branch2b/bn"type: "BatchNorm"bottom: "res2a_branch2b"top: "res2a_branch2b"batch_norm_param {moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer {name: "res2a_branch2b/relu"type: "ReLU"bottom: "res2a_branch2b"top: "res2a_branch2b"}layer {name: "pool2"type: "Pooling"bottom: "res2a_branch2b"top: "pool2"pooling_param {pool: MAXkernel_size: 2stride: 2}}layer {name: "res3a_branch2a"type: "Convolution"bottom: "pool2"top: "res3a_branch2a"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 128bias_term: truepad: 1kernel_size: 3group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "res3a_branch2a/bn"type: "BatchNorm"bottom: "res3a_branch2a"top: "res3a_branch2a"batch_norm_param {moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer {name: "res3a_branch2a/relu"type: "ReLU"bottom: "res3a_branch2a"top: "res3a_branch2a"}layer {name: "res3a_branch2b"type: "Convolution"bottom: "res3a_branch2a"top: "res3a_branch2b"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 128bias_term: truepad: 1kernel_size: 3group: 4stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "res3a_branch2b/bn"type: "BatchNorm"bottom: "res3a_branch2b"top: "res3a_branch2b"batch_norm_param {moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer 
{name: "res3a_branch2b/relu"type: "ReLU"bottom: "res3a_branch2b"top: "res3a_branch2b"}layer {name: "pool3"type: "Pooling"bottom: "res3a_branch2b"top: "pool3"pooling_param {pool: MAXkernel_size: 2stride: 2}}layer {name: "res4a_branch2a"type: "Convolution"bottom: "pool3"top: "res4a_branch2a"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 256bias_term: truepad: 1kernel_size: 3group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "res4a_branch2a/bn"type: "BatchNorm"bottom: "res4a_branch2a"top: "res4a_branch2a"batch_norm_param {moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer {name: "res4a_branch2a/relu"type: "ReLU"bottom: "res4a_branch2a"top: "res4a_branch2a"}layer {name: "res4a_branch2b"type: "Convolution"bottom: "res4a_branch2a"top: "res4a_branch2b"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 256bias_term: truepad: 1kernel_size: 3group: 4stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "res4a_branch2b/bn"type: "BatchNorm"bottom: "res4a_branch2b"top: "res4a_branch2b"batch_norm_param {moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer {name: "res4a_branch2b/relu"type: "ReLU"bottom: "res4a_branch2b"top: "res4a_branch2b"}layer {name: "pool4"type: "Pooling"bottom: "res4a_branch2b"top: "pool4"pooling_param {pool: MAXkernel_size: 2stride: 2}}layer {name: "res5a_branch2a"type: "Convolution"bottom: "pool4"top: "res5a_branch2a"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 512bias_term: truepad: 1kernel_size: 3group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "res5a_branch2a/bn"type: "BatchNorm"bottom: "res5a_branch2a"top: "res5a_branch2a"batch_norm_param {moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer {name: 
"res5a_branch2a/relu"type: "ReLU"bottom: "res5a_branch2a"top: "res5a_branch2a"}layer {name: "res5a_branch2b"type: "Convolution"bottom: "res5a_branch2a"top: "res5a_branch2b"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 512bias_term: truepad: 1kernel_size: 3group: 4stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "res5a_branch2b/bn"type: "BatchNorm"bottom: "res5a_branch2b"top: "res5a_branch2b"batch_norm_param {moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer {name: "res5a_branch2b/relu"type: "ReLU"bottom: "res5a_branch2b"top: "res5a_branch2b"}layer {name: "pool6"type: "Pooling"bottom: "res5a_branch2b"top: "pool6"pooling_param {pool: MAXkernel_size: 2stride: 2pad: 0}}layer {name: "pool7"type: "Pooling"bottom: "pool6"top: "pool7"pooling_param {pool: MAXkernel_size: 2stride: 2pad: 0}}layer {name: "pool8"type: "Pooling"bottom: "pool7"top: "pool8"pooling_param {pool: MAXkernel_size: 2stride: 2pad: 0}}layer {name: "ctx_output1"type: "Convolution"bottom: "res4a_branch2b"top: "ctx_output1"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 256bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output1/bn"type: "BatchNorm"bottom: "ctx_output1"top: "ctx_output1"batch_norm_param {moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer {name: "ctx_output1/relu"type: "ReLU"bottom: "ctx_output1"top: "ctx_output1"}layer {name: "ctx_output2"type: "Convolution"bottom: "res5a_branch2b"top: "ctx_output2"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 256bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output2/bn"type: "BatchNorm"bottom: "ctx_output2"top: "ctx_output2"batch_norm_param 
{moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer {name: "ctx_output2/relu"type: "ReLU"bottom: "ctx_output2"top: "ctx_output2"}layer {name: "ctx_output3"type: "Convolution"bottom: "pool6"top: "ctx_output3"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 256bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output3/bn"type: "BatchNorm"bottom: "ctx_output3"top: "ctx_output3"batch_norm_param {moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer {name: "ctx_output3/relu"type: "ReLU"bottom: "ctx_output3"top: "ctx_output3"}layer {name: "ctx_output4"type: "Convolution"bottom: "pool7"top: "ctx_output4"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 256bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output4/bn"type: "BatchNorm"bottom: "ctx_output4"top: "ctx_output4"batch_norm_param {moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer {name: "ctx_output4/relu"type: "ReLU"bottom: "ctx_output4"top: "ctx_output4"}layer {name: "ctx_output5"type: "Convolution"bottom: "pool8"top: "ctx_output5"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 256bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output5/bn"type: "BatchNorm"bottom: "ctx_output5"top: "ctx_output5"batch_norm_param {moving_average_fraction: 0.99eps: 0.0001#scale_bias: true}}layer {name: "ctx_output5/relu"type: "ReLU"bottom: "ctx_output5"top: "ctx_output5"}layer {name: "ctx_output1/relu_mbox_loc"type: "Convolution"bottom: "ctx_output1"top: "ctx_output1/relu_mbox_loc"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 
16bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output1/relu_mbox_loc_perm"type: "Permute"bottom: "ctx_output1/relu_mbox_loc"top: "ctx_output1/relu_mbox_loc_perm"permute_param {order: 0order: 2order: 3order: 1}}layer {name: "ctx_output1/relu_mbox_loc_flat"type: "Flatten"bottom: "ctx_output1/relu_mbox_loc_perm"top: "ctx_output1/relu_mbox_loc_flat"flatten_param {axis: 1}}layer {name: "ctx_output1/relu_mbox_conf"type: "Convolution"bottom: "ctx_output1"top: "ctx_output1/relu_mbox_conf"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 16#84bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output1/relu_mbox_conf_perm"type: "Permute"bottom: "ctx_output1/relu_mbox_conf"top: "ctx_output1/relu_mbox_conf_perm"permute_param {order: 0order: 2order: 3order: 1}}layer {name: "ctx_output1/relu_mbox_conf_flat"type: "Flatten"bottom: "ctx_output1/relu_mbox_conf_perm"top: "ctx_output1/relu_mbox_conf_flat"flatten_param {axis: 1}}layer {name: "ctx_output1/relu_mbox_priorbox"type: "PriorBox"bottom: "ctx_output1"bottom: "data"top: "ctx_output1/relu_mbox_priorbox"prior_box_param {min_size: 24 #14.72max_size: 50.24 #36.8aspect_ratio: 2flip: trueclip: falsevariance: 0.1variance: 0.1variance: 0.2variance: 0.2offset: 0.5}}layer {name: "ctx_output2/relu_mbox_loc"type: "Convolution"bottom: "ctx_output2"top: "ctx_output2/relu_mbox_loc"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 24bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output2/relu_mbox_loc_perm"type: "Permute"bottom: "ctx_output2/relu_mbox_loc"top: "ctx_output2/relu_mbox_loc_perm"permute_param {order: 0order: 2order: 3order: 1}}layer {name: 
"ctx_output2/relu_mbox_loc_flat"type: "Flatten"bottom: "ctx_output2/relu_mbox_loc_perm"top: "ctx_output2/relu_mbox_loc_flat"flatten_param {axis: 1}}layer {name: "ctx_output2/relu_mbox_conf"type: "Convolution"bottom: "ctx_output2"top: "ctx_output2/relu_mbox_conf"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 24#126bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output2/relu_mbox_conf_perm"type: "Permute"bottom: "ctx_output2/relu_mbox_conf"top: "ctx_output2/relu_mbox_conf_perm"permute_param {order: 0order: 2order: 3order: 1}}layer {name: "ctx_output2/relu_mbox_conf_flat"type: "Flatten"bottom: "ctx_output2/relu_mbox_conf_perm"top: "ctx_output2/relu_mbox_conf_flat"flatten_param {axis: 1}}layer {name: "ctx_output2/relu_mbox_priorbox"type: "PriorBox"bottom: "ctx_output2"bottom: "data"top: "ctx_output2/relu_mbox_priorbox"prior_box_param {min_size: 50.24 #36.8max_size: 76.48 #110.4aspect_ratio: 2aspect_ratio: 3flip: trueclip: falsevariance: 0.1variance: 0.1variance: 0.2variance: 0.2offset: 0.5}}layer {name: "ctx_output3/relu_mbox_loc"type: "Convolution"bottom: "ctx_output3"top: "ctx_output3/relu_mbox_loc"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 24bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output3/relu_mbox_loc_perm"type: "Permute"bottom: "ctx_output3/relu_mbox_loc"top: "ctx_output3/relu_mbox_loc_perm"permute_param {order: 0order: 2order: 3order: 1}}layer {name: "ctx_output3/relu_mbox_loc_flat"type: "Flatten"bottom: "ctx_output3/relu_mbox_loc_perm"top: "ctx_output3/relu_mbox_loc_flat"flatten_param {axis: 1}}layer {name: "ctx_output3/relu_mbox_conf"type: "Convolution"bottom: "ctx_output3"top: "ctx_output3/relu_mbox_conf"param {lr_mult: 1decay_mult: 1}param {lr_mult: 
2decay_mult: 0}convolution_param {num_output: 24#126bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output3/relu_mbox_conf_perm"type: "Permute"bottom: "ctx_output3/relu_mbox_conf"top: "ctx_output3/relu_mbox_conf_perm"permute_param {order: 0order: 2order: 3order: 1}}layer {name: "ctx_output3/relu_mbox_conf_flat"type: "Flatten"bottom: "ctx_output3/relu_mbox_conf_perm"top: "ctx_output3/relu_mbox_conf_flat"flatten_param {axis: 1}}layer {name: "ctx_output3/relu_mbox_priorbox"type: "PriorBox"bottom: "ctx_output3"bottom: "data"top: "ctx_output3/relu_mbox_priorbox"prior_box_param {min_size: 76.48 #110.4max_size: 102.72 #184.0aspect_ratio: 2aspect_ratio: 3flip: trueclip: falsevariance: 0.1variance: 0.1variance: 0.2variance: 0.2offset: 0.5}}layer {name: "ctx_output4/relu_mbox_loc"type: "Convolution"bottom: "ctx_output4"top: "ctx_output4/relu_mbox_loc"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 24bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output4/relu_mbox_loc_perm"type: "Permute"bottom: "ctx_output4/relu_mbox_loc"top: "ctx_output4/relu_mbox_loc_perm"permute_param {order: 0order: 2order: 3order: 1}}layer {name: "ctx_output4/relu_mbox_loc_flat"type: "Flatten"bottom: "ctx_output4/relu_mbox_loc_perm"top: "ctx_output4/relu_mbox_loc_flat"flatten_param {axis: 1}}layer {name: "ctx_output4/relu_mbox_conf"type: "Convolution"bottom: "ctx_output4"top: "ctx_output4/relu_mbox_conf"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 24#126bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output4/relu_mbox_conf_perm"type: "Permute"bottom: "ctx_output4/relu_mbox_conf"top: 
"ctx_output4/relu_mbox_conf_perm"permute_param {order: 0order: 2order: 3order: 1}}layer {name: "ctx_output4/relu_mbox_conf_flat"type: "Flatten"bottom: "ctx_output4/relu_mbox_conf_perm"top: "ctx_output4/relu_mbox_conf_flat"flatten_param {axis: 1}}layer {name: "ctx_output4/relu_mbox_priorbox"type: "PriorBox"bottom: "ctx_output4"bottom: "data"top: "ctx_output4/relu_mbox_priorbox"prior_box_param {min_size: 102.72 #184.0max_size: 128.96 #257.6aspect_ratio: 2aspect_ratio: 3flip: trueclip: falsevariance: 0.1variance: 0.1variance: 0.2variance: 0.2offset: 0.5}}layer {name: "ctx_output5/relu_mbox_loc"type: "Convolution"bottom: "ctx_output5"top: "ctx_output5/relu_mbox_loc"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 16bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output5/relu_mbox_loc_perm"type: "Permute"bottom: "ctx_output5/relu_mbox_loc"top: "ctx_output5/relu_mbox_loc_perm"permute_param {order: 0order: 2order: 3order: 1}}layer {name: "ctx_output5/relu_mbox_loc_flat"type: "Flatten"bottom: "ctx_output5/relu_mbox_loc_perm"top: "ctx_output5/relu_mbox_loc_flat"flatten_param {axis: 1}}layer {name: "ctx_output5/relu_mbox_conf"type: "Convolution"bottom: "ctx_output5"top: "ctx_output5/relu_mbox_conf"param {lr_mult: 1decay_mult: 1}param {lr_mult: 2decay_mult: 0}convolution_param {num_output: 16#84bias_term: truepad: 0kernel_size: 1group: 1stride: 1weight_filler {type: "msra"}bias_filler {type: "constant"value: 0}dilation: 1}}layer {name: "ctx_output5/relu_mbox_conf_perm"type: "Permute"bottom: "ctx_output5/relu_mbox_conf"top: "ctx_output5/relu_mbox_conf_perm"permute_param {order: 0order: 2order: 3order: 1}}layer {name: "ctx_output5/relu_mbox_conf_flat"type: "Flatten"bottom: "ctx_output5/relu_mbox_conf_perm"top: "ctx_output5/relu_mbox_conf_flat"flatten_param {axis: 1}}layer {name: "ctx_output5/relu_mbox_priorbox"type: "PriorBox"bottom: 
"ctx_output5"bottom: "data"top: "ctx_output5/relu_mbox_priorbox"prior_box_param {min_size: 128.96 #257.6max_size: 160 #331.2aspect_ratio: 2flip: trueclip: falsevariance: 0.1variance: 0.1variance: 0.2variance: 0.2offset: 0.5}}layer {name: "mbox_loc"type: "Concat"bottom: "ctx_output1/relu_mbox_loc_flat"bottom: "ctx_output2/relu_mbox_loc_flat"bottom: "ctx_output3/relu_mbox_loc_flat"bottom: "ctx_output4/relu_mbox_loc_flat"bottom: "ctx_output5/relu_mbox_loc_flat"top: "mbox_loc"concat_param {axis: 1}}layer {name: "mbox_conf"type: "Concat"bottom: "ctx_output1/relu_mbox_conf_flat"bottom: "ctx_output2/relu_mbox_conf_flat"bottom: "ctx_output3/relu_mbox_conf_flat"bottom: "ctx_output4/relu_mbox_conf_flat"bottom: "ctx_output5/relu_mbox_conf_flat"top: "mbox_conf"concat_param {axis: 1}}layer {name: "mbox_priorbox"type: "Concat"bottom: "ctx_output1/relu_mbox_priorbox"bottom: "ctx_output2/relu_mbox_priorbox"bottom: "ctx_output3/relu_mbox_priorbox"bottom: "ctx_output4/relu_mbox_priorbox"bottom: "ctx_output5/relu_mbox_priorbox"top: "mbox_priorbox"concat_param {axis: 2}}layer {name: "mbox_loss"type: "MultiBoxLoss"bottom: "mbox_loc"bottom: "mbox_conf"bottom: "mbox_priorbox"bottom: "label"top: "mbox_loss"include {phase: TRAIN}propagate_down: truepropagate_down: truepropagate_down: falsepropagate_down: falseloss_param {normalization: VALID}multibox_loss_param {loc_loss_type: SMOOTH_L1conf_loss_type: SOFTMAXloc_weight: 1.0num_classes: 4share_location: truematch_type: PER_PREDICTIONoverlap_threshold: 0.5use_prior_for_matching: truebackground_label_id: 0use_difficult_gt: trueneg_pos_ratio: 3.0neg_overlap: 0.5code_type: CENTER_SIZEignore_cross_boundary_bbox: falsemining_type: MAX_NEGATIVE}}layer {name: "mbox_conf_reshape"type: "Reshape"bottom: "mbox_conf"top: "mbox_conf_reshape"reshape_param {shape {dim: 0dim: -1dim: 4}}}layer {name: "mbox_conf_softmax"type: "Softmax"bottom: "mbox_conf_reshape"top: "mbox_conf_softmax"softmax_param {axis: 2}}layer {name: "mbox_conf_flatten"type: 
"Flatten"bottom: "mbox_conf_softmax"top: "mbox_conf_flatten"flatten_param {axis: 1}}layer {name: "detection_out"type: "DetectionOutput"bottom: "mbox_loc"bottom: "mbox_conf_flatten"bottom: "mbox_priorbox"top: "detection_out"include {phase: TEST}detection_output_param {num_classes: 4share_location: truebackground_label_id: 0nms_param {nms_threshold: 0.45top_k: 400}code_type: CENTER_SIZEkeep_top_k: 200confidence_threshold: 0.01}}layer {name: "detection_eval"type: "DetectionEvaluate"bottom: "detection_out"bottom: "label"top: "detection_eval"include {phase: TEST}detection_evaluate_param {num_classes: 4background_label_id: 0overlap_threshold: 0.5evaluate_difficult_gt: false#name_size_file: "/home/caffe/examples/bsd_ti_5l_320/bsd_r_mini/VOC2007/test_name_size.txt"}}
3) Training hyperparameters
net: "ti_bsd_5little_200_400_4pr.prototxt" ##########
test_iter: 10 ##########
test_interval: 100 ##########
base_lr: 0.01 #0.0005
display: 100
max_iter: 100000 ##########
lr_policy: "multistep"
gamma: 0.1
power: 1.0
momentum: 0.9
weight_decay: 0.0001
snapshot: 10000
snapshot_prefix: "snapshot/4pr" ###########
solver_mode: GPU
debug_info: false
snapshot_after_train: true
test_initialization: false
average_loss: 10
stepvalue: 60000
stepvalue: 90000
stepvalue: 300000
iter_size: 2
type: "SGD"
eval_type: "detection"
ap_version: "11point"
show_per_class_result: true
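As a rough illustration of how these solver settings interact, the sketch below evaluates the "multistep" policy, where the learning rate is base_lr * gamma^(number of stepvalue entries already reached), and the number of images that contribute to one weight update (batch_size * iter_size). The function names are made up for this example; the default values are copied from the solver above and from the TRAIN data layer (batch_size: 16).

```python
def multistep_lr(iteration, base_lr=0.01, gamma=0.1,
                 stepvalues=(60000, 90000, 300000)):
    """Learning rate under lr_policy "multistep" at a given iteration."""
    steps_reached = sum(1 for step in stepvalues if iteration >= step)
    return base_lr * gamma ** steps_reached


def images_per_update(batch_size=16, iter_size=2):
    """Gradients are accumulated over iter_size batches before each update."""
    return batch_size * iter_size


for it in (0, 59999, 60000, 90000, 99999):
    print(it, multistep_lr(it))
print("images per weight update:", images_per_update())
```

Since max_iter is 100000, the third stepvalue (300000) is never reached in this run, so only two learning-rate drops actually take effect.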
4) Model training
Run: sh train.sh
train.sh
#!/bin/sh
../../build/tools/caffe train -solver="solver_train.prototxt" -snapshot="./snapshot/4pr_iter_100000.solverstate" -gpu=0 2>&1 | tee ./bsd.log
5) Model prediction
./build/tools/caffe.bin test -model ./test.prototxt -weights ./lenet_iter_1000.caffemodel -iterations 100
(batch_size * iterations = number of test images)
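The relation batch_size * iterations = number of test images can be turned around to pick the iteration count. A minimal sketch, with hypothetical helper names; the sample numbers are the TEST batch_size (12) from train.prototxt and test_iter (10) from the solver:

```python
import math


def iterations_needed(num_images, batch_size):
    """Smallest iteration count that covers every test image at least once."""
    return math.ceil(num_images / batch_size)


def images_covered(iterations, batch_size):
    """Number of images processed by a test run."""
    return iterations * batch_size


print(images_covered(10, 12))      # images seen with test_iter: 10, batch_size: 12
print(iterations_needed(120, 12))  # iterations needed for 120 images
```

If the image count is not a multiple of the batch size, the last iteration covers more slots than there are remaining images, so some images may be evaluated twice.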
deploy file
name: "ssdJacintoNetV2"
input: "data"
input_shape {
dim: 1
dim: 3
dim: 160 #160
dim: 320 #320
}
layer {
name: "data/bias"
type: "Bias"
bottom: "data"
top: "data/bias"
param {
lr_mult: 0
decay_mult: 0
}
bias_param {
filler {
type: "constant"
value: -128
}
}
}
layer {
name: "conv1a"
type: "Convolution"
bottom: "data/bias"
top: "conv1a"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
bias_term: true
pad: 2
kernel_size: 5
group: 1
stride: 2
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "conv1a/bn"
type: "BatchNorm"
bottom: "conv1a"
top: "conv1a"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "conv1a/relu"
type: "ReLU"
bottom: "conv1a"
top: "conv1a"
}
layer {
name: "conv1b"
type: "Convolution"
bottom: "conv1a"
top: "conv1b"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
bias_term: true
pad: 1
kernel_size: 3
group: 4
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "conv1b/bn"
type: "BatchNorm"
bottom: "conv1b"
top: "conv1b"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "conv1b/relu"
type: "ReLU"
bottom: "conv1b"
top: "conv1b"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1b"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "res2a_branch2a"
type: "Convolution"
bottom: "pool1"
top: "res2a_branch2a"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "res2a_branch2a/bn"
type: "BatchNorm"
bottom: "res2a_branch2a"
top: "res2a_branch2a"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "res2a_branch2a/relu"
type: "ReLU"
bottom: "res2a_branch2a"
top: "res2a_branch2a"
}
layer {
name: "res2a_branch2b"
type: "Convolution"
bottom: "res2a_branch2a"
top: "res2a_branch2b"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
bias_term: true
pad: 1
kernel_size: 3
group: 4
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "res2a_branch2b/bn"
type: "BatchNorm"
bottom: "res2a_branch2b"
top: "res2a_branch2b"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "res2a_branch2b/relu"
type: "ReLU"
bottom: "res2a_branch2b"
top: "res2a_branch2b"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "res2a_branch2b"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "res3a_branch2a"
type: "Convolution"
bottom: "pool2"
top: "res3a_branch2a"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "res3a_branch2a/bn"
type: "BatchNorm"
bottom: "res3a_branch2a"
top: "res3a_branch2a"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "res3a_branch2a/relu"
type: "ReLU"
bottom: "res3a_branch2a"
top: "res3a_branch2a"
}
layer {
name: "res3a_branch2b"
type: "Convolution"
bottom: "res3a_branch2a"
top: "res3a_branch2b"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
bias_term: true
pad: 1
kernel_size: 3
group: 4
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "res3a_branch2b/bn"
type: "BatchNorm"
bottom: "res3a_branch2b"
top: "res3a_branch2b"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "res3a_branch2b/relu"
type: "ReLU"
bottom: "res3a_branch2b"
top: "res3a_branch2b"
}
layer {
name: "pool3"
type: "Pooling"
bottom: "res3a_branch2b"
top: "pool3"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "res4a_branch2a"
type: "Convolution"
bottom: "pool3"
top: "res4a_branch2a"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "res4a_branch2a/bn"
type: "BatchNorm"
bottom: "res4a_branch2a"
top: "res4a_branch2a"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "res4a_branch2a/relu"
type: "ReLU"
bottom: "res4a_branch2a"
top: "res4a_branch2a"
}
layer {
name: "res4a_branch2b"
type: "Convolution"
bottom: "res4a_branch2a"
top: "res4a_branch2b"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
bias_term: true
pad: 1
kernel_size: 3
group: 4
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "res4a_branch2b/bn"
type: "BatchNorm"
bottom: "res4a_branch2b"
top: "res4a_branch2b"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "res4a_branch2b/relu"
type: "ReLU"
bottom: "res4a_branch2b"
top: "res4a_branch2b"
}
layer {
name: "pool4"
type: "Pooling"
bottom: "res4a_branch2b"
top: "pool4"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "res5a_branch2a"
type: "Convolution"
bottom: "pool4"
top: "res5a_branch2a"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
bias_term: true
pad: 1
kernel_size: 3
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "res5a_branch2a/bn"
type: "BatchNorm"
bottom: "res5a_branch2a"
top: "res5a_branch2a"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "res5a_branch2a/relu"
type: "ReLU"
bottom: "res5a_branch2a"
top: "res5a_branch2a"
}
layer {
name: "res5a_branch2b"
type: "Convolution"
bottom: "res5a_branch2a"
top: "res5a_branch2b"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
bias_term: true
pad: 1
kernel_size: 3
group: 4
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "res5a_branch2b/bn"
type: "BatchNorm"
bottom: "res5a_branch2b"
top: "res5a_branch2b"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "res5a_branch2b/relu"
type: "ReLU"
bottom: "res5a_branch2b"
top: "res5a_branch2b"
}
layer {
name: "pool6"
type: "Pooling"
bottom: "res5a_branch2b"
top: "pool6"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 0
}
}
layer {
name: "pool7"
type: "Pooling"
bottom: "pool6"
top: "pool7"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 0
}
}
layer {
name: "pool8"
type: "Pooling"
bottom: "pool7"
top: "pool8"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 0
}
}
layer {
name: "ctx_output1"
type: "Convolution"
bottom: "res4a_branch2b"
top: "ctx_output1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output1/bn"
type: "BatchNorm"
bottom: "ctx_output1"
top: "ctx_output1"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "ctx_output1/relu"
type: "ReLU"
bottom: "ctx_output1"
top: "ctx_output1"
}
layer {
name: "ctx_output2"
type: "Convolution"
bottom: "res5a_branch2b"
top: "ctx_output2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output2/bn"
type: "BatchNorm"
bottom: "ctx_output2"
top: "ctx_output2"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "ctx_output2/relu"
type: "ReLU"
bottom: "ctx_output2"
top: "ctx_output2"
}
layer {
name: "ctx_output3"
type: "Convolution"
bottom: "pool6"
top: "ctx_output3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output3/bn"
type: "BatchNorm"
bottom: "ctx_output3"
top: "ctx_output3"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "ctx_output3/relu"
type: "ReLU"
bottom: "ctx_output3"
top: "ctx_output3"
}
layer {
name: "ctx_output4"
type: "Convolution"
bottom: "pool7"
top: "ctx_output4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output4/bn"
type: "BatchNorm"
bottom: "ctx_output4"
top: "ctx_output4"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "ctx_output4/relu"
type: "ReLU"
bottom: "ctx_output4"
top: "ctx_output4"
}
layer {
name: "ctx_output5"
type: "Convolution"
bottom: "pool8"
top: "ctx_output5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output5/bn"
type: "BatchNorm"
bottom: "ctx_output5"
top: "ctx_output5"
batch_norm_param {
moving_average_fraction: 0.99
eps: 0.0001
#scale_bias: true
}
}
layer {
name: "ctx_output5/relu"
type: "ReLU"
bottom: "ctx_output5"
top: "ctx_output5"
}
layer {
name: "ctx_output1/relu_mbox_loc"
type: "Convolution"
bottom: "ctx_output1"
top: "ctx_output1/relu_mbox_loc"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output1/relu_mbox_loc_perm"
type: "Permute"
bottom: "ctx_output1/relu_mbox_loc"
top: "ctx_output1/relu_mbox_loc_perm"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
layer {
name: "ctx_output1/relu_mbox_loc_flat"
type: "Flatten"
bottom: "ctx_output1/relu_mbox_loc_perm"
top: "ctx_output1/relu_mbox_loc_flat"
flatten_param {
axis: 1
}
}
layer {
name: "ctx_output1/relu_mbox_conf"
type: "Convolution"
bottom: "ctx_output1"
top: "ctx_output1/relu_mbox_conf"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16 #84
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output1/relu_mbox_conf_perm"
type: "Permute"
bottom: "ctx_output1/relu_mbox_conf"
top: "ctx_output1/relu_mbox_conf_perm"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
layer {
name: "ctx_output1/relu_mbox_conf_flat"
type: "Flatten"
bottom: "ctx_output1/relu_mbox_conf_perm"
top: "ctx_output1/relu_mbox_conf_flat"
flatten_param {
axis: 1
}
}
layer {
name: "ctx_output1/relu_mbox_priorbox"
type: "PriorBox"
bottom: "ctx_output1"
bottom: "data"
top: "ctx_output1/relu_mbox_priorbox"
prior_box_param {
min_size: 24 #14.72
max_size: 50.24 #36.8
aspect_ratio: 2
flip: true
clip: false
variance: 0.1
variance: 0.1
variance: 0.2
variance: 0.2
offset: 0.5
}
}
layer {
name: "ctx_output2/relu_mbox_loc"
type: "Convolution"
bottom: "ctx_output2"
top: "ctx_output2/relu_mbox_loc"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 24
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output2/relu_mbox_loc_perm"
type: "Permute"
bottom: "ctx_output2/relu_mbox_loc"
top: "ctx_output2/relu_mbox_loc_perm"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
layer {
name: "ctx_output2/relu_mbox_loc_flat"
type: "Flatten"
bottom: "ctx_output2/relu_mbox_loc_perm"
top: "ctx_output2/relu_mbox_loc_flat"
flatten_param {
axis: 1
}
}
layer {
name: "ctx_output2/relu_mbox_conf"
type: "Convolution"
bottom: "ctx_output2"
top: "ctx_output2/relu_mbox_conf"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 24 #126
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output2/relu_mbox_conf_perm"
type: "Permute"
bottom: "ctx_output2/relu_mbox_conf"
top: "ctx_output2/relu_mbox_conf_perm"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
layer {
name: "ctx_output2/relu_mbox_conf_flat"
type: "Flatten"
bottom: "ctx_output2/relu_mbox_conf_perm"
top: "ctx_output2/relu_mbox_conf_flat"
flatten_param {
axis: 1
}
}
layer {
name: "ctx_output2/relu_mbox_priorbox"
type: "PriorBox"
bottom: "ctx_output2"
bottom: "data"
top: "ctx_output2/relu_mbox_priorbox"
prior_box_param {
min_size: 50.24 #36.8
max_size: 76.48 #110.4
aspect_ratio: 2
aspect_ratio: 3
flip: true
clip: false
variance: 0.1
variance: 0.1
variance: 0.2
variance: 0.2
offset: 0.5
}
}
layer {
name: "ctx_output3/relu_mbox_loc"
type: "Convolution"
bottom: "ctx_output3"
top: "ctx_output3/relu_mbox_loc"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 24
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output3/relu_mbox_loc_perm"
type: "Permute"
bottom: "ctx_output3/relu_mbox_loc"
top: "ctx_output3/relu_mbox_loc_perm"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
layer {
name: "ctx_output3/relu_mbox_loc_flat"
type: "Flatten"
bottom: "ctx_output3/relu_mbox_loc_perm"
top: "ctx_output3/relu_mbox_loc_flat"
flatten_param {
axis: 1
}
}
layer {
name: "ctx_output3/relu_mbox_conf"
type: "Convolution"
bottom: "ctx_output3"
top: "ctx_output3/relu_mbox_conf"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 24 #126
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output3/relu_mbox_conf_perm"
type: "Permute"
bottom: "ctx_output3/relu_mbox_conf"
top: "ctx_output3/relu_mbox_conf_perm"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
layer {
name: "ctx_output3/relu_mbox_conf_flat"
type: "Flatten"
bottom: "ctx_output3/relu_mbox_conf_perm"
top: "ctx_output3/relu_mbox_conf_flat"
flatten_param {
axis: 1
}
}
layer {
name: "ctx_output3/relu_mbox_priorbox"
type: "PriorBox"
bottom: "ctx_output3"
bottom: "data"
top: "ctx_output3/relu_mbox_priorbox"
prior_box_param {
min_size: 76.48 #110.4
max_size: 102.72 #184.0
aspect_ratio: 2
aspect_ratio: 3
flip: true
clip: false
variance: 0.1
variance: 0.1
variance: 0.2
variance: 0.2
offset: 0.5
}
}
layer {
name: "ctx_output4/relu_mbox_loc"
type: "Convolution"
bottom: "ctx_output4"
top: "ctx_output4/relu_mbox_loc"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 24
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output4/relu_mbox_loc_perm"
type: "Permute"
bottom: "ctx_output4/relu_mbox_loc"
top: "ctx_output4/relu_mbox_loc_perm"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
layer {
name: "ctx_output4/relu_mbox_loc_flat"
type: "Flatten"
bottom: "ctx_output4/relu_mbox_loc_perm"
top: "ctx_output4/relu_mbox_loc_flat"
flatten_param {
axis: 1
}
}
layer {
name: "ctx_output4/relu_mbox_conf"
type: "Convolution"
bottom: "ctx_output4"
top: "ctx_output4/relu_mbox_conf"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 24 #126
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output4/relu_mbox_conf_perm"
type: "Permute"
bottom: "ctx_output4/relu_mbox_conf"
top: "ctx_output4/relu_mbox_conf_perm"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
layer {
name: "ctx_output4/relu_mbox_conf_flat"
type: "Flatten"
bottom: "ctx_output4/relu_mbox_conf_perm"
top: "ctx_output4/relu_mbox_conf_flat"
flatten_param {
axis: 1
}
}
layer {
name: "ctx_output4/relu_mbox_priorbox"
type: "PriorBox"
bottom: "ctx_output4"
bottom: "data"
top: "ctx_output4/relu_mbox_priorbox"
prior_box_param {
min_size: 102.72 #184.0
max_size: 128.96 #257.6
aspect_ratio: 2
aspect_ratio: 3
flip: true
clip: false
variance: 0.1
variance: 0.1
variance: 0.2
variance: 0.2
offset: 0.5
}
}
layer {
name: "ctx_output5/relu_mbox_loc"
type: "Convolution"
bottom: "ctx_output5"
top: "ctx_output5/relu_mbox_loc"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output5/relu_mbox_loc_perm"
type: "Permute"
bottom: "ctx_output5/relu_mbox_loc"
top: "ctx_output5/relu_mbox_loc_perm"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
layer {
name: "ctx_output5/relu_mbox_loc_flat"
type: "Flatten"
bottom: "ctx_output5/relu_mbox_loc_perm"
top: "ctx_output5/relu_mbox_loc_flat"
flatten_param {
axis: 1
}
}
layer {
name: "ctx_output5/relu_mbox_conf"
type: "Convolution"
bottom: "ctx_output5"
top: "ctx_output5/relu_mbox_conf"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16 #84
bias_term: true
pad: 0
kernel_size: 1
group: 1
stride: 1
weight_filler {
type: "msra"
}
bias_filler {
type: "constant"
value: 0
}
dilation: 1
}
}
layer {
name: "ctx_output5/relu_mbox_conf_perm"
type: "Permute"
bottom: "ctx_output5/relu_mbox_conf"
top: "ctx_output5/relu_mbox_conf_perm"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
layer {
name: "ctx_output5/relu_mbox_conf_flat"
type: "Flatten"
bottom: "ctx_output5/relu_mbox_conf_perm"
top: "ctx_output5/relu_mbox_conf_flat"
flatten_param {
axis: 1
}
}
layer {
name: "ctx_output5/relu_mbox_priorbox"
type: "PriorBox"
bottom: "ctx_output5"
bottom: "data"
top: "ctx_output5/relu_mbox_priorbox"
prior_box_param {
min_size: 128.96 #257.6
max_size: 160 #331.2
aspect_ratio: 2
flip: true
clip: false
variance: 0.1
variance: 0.1
variance: 0.2
variance: 0.2
offset: 0.5
}
}
layer {
name: "mbox_loc"
type: "Concat"
bottom: "ctx_output1/relu_mbox_loc_flat"
bottom: "ctx_output2/relu_mbox_loc_flat"
bottom: "ctx_output3/relu_mbox_loc_flat"
bottom: "ctx_output4/relu_mbox_loc_flat"
bottom: "ctx_output5/relu_mbox_loc_flat"
top: "mbox_loc"
concat_param {
axis: 1
}
}
layer {
name: "mbox_conf"
type: "Concat"
bottom: "ctx_output1/relu_mbox_conf_flat"
bottom: "ctx_output2/relu_mbox_conf_flat"
bottom: "ctx_output3/relu_mbox_conf_flat"
bottom: "ctx_output4/relu_mbox_conf_flat"
bottom: "ctx_output5/relu_mbox_conf_flat"
top: "mbox_conf"
concat_param {
axis: 1
}
}
layer {
name: "mbox_priorbox"
type: "Concat"
bottom: "ctx_output1/relu_mbox_priorbox"
bottom: "ctx_output2/relu_mbox_priorbox"
bottom: "ctx_output3/relu_mbox_priorbox"
bottom: "ctx_output4/relu_mbox_priorbox"
bottom: "ctx_output5/relu_mbox_priorbox"
top: "mbox_priorbox"
concat_param {
axis: 2
}
}
layer {
name: "mbox_conf_reshape"
type: "Reshape"
bottom: "mbox_conf"
top: "mbox_conf_reshape"
reshape_param {
shape {
dim: 0
dim: -1
dim: 4
}
}
}
layer {
name: "mbox_conf_softmax"
type: "Softmax"
bottom: "mbox_conf_reshape"
top: "mbox_conf_softmax"
softmax_param {
axis: 2
}
}
layer {
name: "mbox_conf_flatten"
type: "Flatten"
bottom: "mbox_conf_softmax"
top: "mbox_conf_flatten"
flatten_param {
axis: 1
}
}
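The num_output values of the mbox_loc and mbox_conf convolutions above can be cross-checked against the PriorBox settings. The sketch below assumes the usual SSD PriorBox rule: aspect ratio 1 is always present, flip: true adds the reciprocal of every listed ratio, and each max_size contributes one extra box. The helper name priors_per_cell is hypothetical, and the class count of 4 is taken from the Reshape layer (dim: 4).

```python
def priors_per_cell(aspect_ratios, flip=True, num_min_sizes=1, num_max_sizes=1):
    """Prior boxes per feature-map cell under the assumed SSD PriorBox rule."""
    ratios = [1.0]  # aspect ratio 1 is always included
    for ar in aspect_ratios:
        # Skip ratios that are already present (for example an explicit 1).
        if all(abs(ar - r) > 1e-6 for r in ratios):
            ratios.append(float(ar))
            if flip:
                ratios.append(1.0 / ar)
    return len(ratios) * num_min_sizes + num_max_sizes


NUM_CLASSES = 4  # background, car, person, bicycle

# aspect_ratio entries of each PriorBox layer in the prototxt above
heads = {
    "ctx_output1": [2],
    "ctx_output2": [2, 3],
    "ctx_output3": [2, 3],
    "ctx_output4": [2, 3],
    "ctx_output5": [2],
}

for name in sorted(heads):
    n = priors_per_cell(heads[name])
    # loc predicts 4 offsets per prior, conf predicts one score per class
    print(name, "priors:", n, "loc:", n * 4, "conf:", n * NUM_CLASSES)
```

With 4 classes the loc and conf widths coincide (16 or 24), which matches the values in the prototxt. The commented-out 84 and 126 correspond to the same prior counts with 21 classes.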
import os
import sys
import time

import cv2
import numpy as np

caffe_root = '/home/caffe/'
sys.path.insert(0, caffe_root + 'python')
import caffe

net_file = 'ti_bsd_5little_200_400_4pr_deploy.prototxt'
caffe_model = './snapshot/400*200_r/4pr_iter_130000.caffemodel'
test_dir = "./testjpg_r_1524/"
image_outdir = "./testjpg_r_4pr_result/"

if not os.path.exists(caffe_model):
    print(caffe_model + " does not exist.")
    sys.exit(1)
# cv2.imwrite fails silently when the output directory is missing.
if not os.path.isdir(image_outdir):
    os.makedirs(image_outdir)

# Select the GPU before the network is created.
caffe.set_device(0)
caffe.set_mode_gpu()
net = caffe.Net(net_file, caffe_model, caffe.TEST)

CLASSES = ('background', 'car', 'person', 'bicycle')
# Box color (BGR) per class; the label text is always drawn in green.
BOX_COLORS = {'car': (0, 255, 0),
              'person': (0, 255, 255),
              'bicycle': (0, 255, 255)}
CONF_THRESHOLD = 0.3


def preprocess(src):
    # cv2.resize takes (width, height).
    img = cv2.resize(src, (400, 200))
    #img = img - 127.5
    #img = img * 0.007843
    return img


def postprocess(img, out):
    # Each detection row is [image_id, label, confidence, xmin, ymin, xmax, ymax],
    # with box coordinates normalized to [0, 1]; scale them to the original image.
    h = img.shape[0]
    w = img.shape[1]
    box = out['detection_out'][0, 0, :, 3:7] * np.array([w, h, w, h])
    cls = out['detection_out'][0, 0, :, 1]
    conf = out['detection_out'][0, 0, :, 2]
    return (box.astype(np.int32), conf, cls)


def detect(imgfile):
    file_path, file_name = os.path.split(imgfile)
    start = time.time()
    origimg = cv2.imread(imgfile)
    if origimg is None:
        # Not a readable image; skip it and continue with the next file.
        print('Could not read {}'.format(imgfile))
        return True
    img = preprocess(origimg)
    img = img.astype(np.float32)
    img = img.transpose((2, 0, 1))  # HWC -> CHW
    end = time.time()
    print('Read image took {:.3f}s'.format(end - start))

    net.blobs['data'].data[...] = img
    start = time.time()
    out = net.forward()
    box, conf, cls = postprocess(origimg, out)
    end = time.time()
    print('Detection took {:.3f}s'.format(end - start))

    for i in range(len(box)):
        if conf[i] <= CONF_THRESHOLD:
            continue
        name = CLASSES[int(cls[i])]
        if name not in BOX_COLORS:
            continue
        # Convert to plain ints; OpenCV drawing functions may reject NumPy scalars.
        p1 = (int(box[i][0]), int(box[i][1]))
        p2 = (int(box[i][2]), int(box[i][3]))
        cv2.rectangle(origimg, p1, p2, BOX_COLORS[name])
        p3 = (max(p1[0], 15), max(p1[1], 15))
        title = "%s:%.2f" % (name, conf[i])
        cv2.putText(origimg, title, p3, cv2.FONT_ITALIC, 0.3, (0, 255, 0), 1)

    #cv2.imshow("SSD", origimg)
    #k = cv2.waitKey(1) & 0xff
    cv2.imwrite(os.path.join(image_outdir, file_name), origimg)
    #Exit if ESC pressed
    #if k == 27 : return False
    return True


for f in sorted(os.listdir(test_dir)):
    if not detect(os.path.join(test_dir, f)):
        break
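The post-processing step in the script can be tried without Caffe or OpenCV. The sketch below assumes the SSD detection_out row layout [image_id, label, score, xmin, ymin, xmax, ymax] with normalized coordinates; the function name scale_detections and the sample rows are made up for illustration.

```python
def scale_detections(rows, width, height, threshold=0.3):
    """Keep rows above the score threshold and scale boxes to pixel coordinates."""
    results = []
    for image_id, label, score, xmin, ymin, xmax, ymax in rows:
        if score <= threshold:
            continue
        box = (int(xmin * width), int(ymin * height),
               int(xmax * width), int(ymax * height))
        results.append((int(label), score, box))
    return results


# Two hypothetical detections for a 400x200 image.
rows = [
    [0, 1, 0.9, 0.25, 0.5, 0.75, 1.0],  # label 1 (car), kept
    [0, 2, 0.1, 0.0, 0.0, 0.5, 0.5],    # label 2 (person), below threshold
]
detections = scale_detections(rows, width=400, height=200)
print(detections)
```

Only the first row survives the 0.3 threshold, and its box is expressed in pixels of the original image rather than of the resized network input.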
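Caffe code
Caffe is a deep learning framework written in C++. Three directories matter most: include/, src/, and tools/.
A Blob is the wrapper Caffe uses to store and pass the actual data, and it keeps that data synchronized between the CPU and the GPU.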
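For image data, a Blob is conventionally a 4-D array laid out as N x C x H x W in row-major order. The sketch below shows how an index maps to a flat memory offset under that assumption; blob_offset is a hypothetical helper, not a Caffe API, and the shape is the 400x200 input used by the test script.

```python
def blob_offset(shape, n, c, h, w):
    """Flat offset of element (n, c, h, w) in a row-major N x C x H x W array."""
    num, channels, height, width = shape
    return ((n * channels + c) * height + h) * width + w


shape = (1, 3, 200, 400)  # one 3-channel image, 200 rows, 400 columns
print(blob_offset(shape, 0, 0, 0, 0))      # first element
print(blob_offset(shape, 0, 1, 0, 0))      # start of the second channel
print(blob_offset(shape, 0, 2, 199, 399))  # last element
```

The width index varies fastest, which is why the script transposes the OpenCV image from H x W x C to C x H x W before copying it into the data blob.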
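A Layer is the essence of a Caffe model and its basic unit of computation. Layers perform many kinds of operations, for example convolution, pooling, normalization, data loading, and softmax loss.
A Net is a directed acyclic graph (DAG) of layers. Caffe keeps every intermediate value in the graph so that the forward and backward passes are computed correctly. A typical Net starts with a data layer, which loads data from disk, and ends with a loss layer, which computes the objective for a task such as classification or reconstruction.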
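The DAG structure can be illustrated with the bottom and top names from the prototxt: a layer can only run once every bottom blob has been produced by an earlier layer. This is a simplified sketch, not Caffe's implementation; check_layer_order is a hypothetical helper, and the layer list is a small excerpt of the network above.

```python
def check_layer_order(layers, inputs=()):
    """Walk layers in order and verify that each bottom blob already exists."""
    available = set(inputs)
    for name, bottoms, tops in layers:
        missing = [b for b in bottoms if b not in available]
        if missing:
            raise ValueError("layer %s needs missing blobs: %s" % (name, missing))
        available.update(tops)
    return available


# (layer name, bottoms, tops), taken from the pooling and context layers above
layers = [
    ("pool6", ["res5a_branch2b"], ["pool6"]),
    ("pool7", ["pool6"], ["pool7"]),
    ("pool8", ["pool7"], ["pool8"]),
    ("ctx_output5", ["pool8"], ["ctx_output5"]),
    ("ctx_output5/relu", ["ctx_output5"], ["ctx_output5"]),  # in-place layer
]

blobs = check_layer_order(layers, inputs=["res5a_branch2b"])
print(sorted(blobs))
```

In-place layers such as ReLU reuse the same blob name for bottom and top, so they add no new node to the set of blobs.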
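Net::Forward() and Net::Backward() implement the forward and backward passes of the network. The Solver optimizes a model: it first calls the forward pass to obtain the output and the loss, then calls the backward pass to produce the gradients, and finally folds the gradients into a weight update that reduces the loss. This division of labor between Solver, Net, and Layer keeps Caffe modular and open to extension.
Learning in Caffe is therefore split into two parts: the Solver optimizes and updates the parameters, and the Net computes the loss and the gradients. The loss itself is produced by the forward pass of the network.
The Solver coordinates the Net's forward inference and backward gradient computation to update the parameters and so reduce the loss.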
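The split between Net and Solver can be sketched with a one-parameter model. This is a toy illustration in plain Python, not the Caffe API: ToyNet and solve are hypothetical names, the loss is a squared error, and the update is plain SGD without momentum or weight decay.

```python
class ToyNet(object):
    """Computes the loss (forward) and the gradient (backward) for y = w * x."""

    def __init__(self, w=0.0):
        self.w = w

    def forward(self, x, y):
        self.x, self.y = x, y
        self.pred = self.w * x
        return (self.pred - y) ** 2          # squared-error loss

    def backward(self):
        return 2.0 * (self.pred - self.y) * self.x  # d(loss)/dw


def solve(net, data, lr=0.1, iters=50):
    """Solver role: call forward, call backward, then update the parameter."""
    losses = []
    for i in range(iters):
        x, y = data[i % len(data)]
        loss = net.forward(x, y)
        grad = net.backward()
        net.w -= lr * grad
        losses.append(loss)
    return losses


net = ToyNet()
losses = solve(net, data=[(1.0, 2.0), (2.0, 4.0)])  # samples of y = 2 * x
print("learned w:", net.w)
print("first loss:", losses[0], "last loss:", losses[-1])
```

The net never changes its own weight; only the solver loop does, which mirrors the division of responsibilities described above.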
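To create a Caffe model, we define the model structure in a protocol buffer text file (prototxt).
In Caffe, the layers and their parameters are defined in caffe.proto. Caffe offers three interfaces (command line, Python, and MATLAB) for day-to-day use, for interfacing with research code, and for rapid prototyping.
Data flows through Caffe as Blobs. Data layers load input by converting it into blobs, and save output by converting blobs into other formats.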
Model visualization tool
http://ethereon.github.io/netscope/#/editor
Paste the prototxt into the editor and press Shift+Enter to render the network.
