Mainbody detection is a widely used detection technique: it detects one or more mainbody objects in an image, crops the corresponding regions, and carries out recognition on the crops, thereby completing the whole recognition process. Mainbody detection is the first step of the recognition task and can effectively improve recognition accuracy.
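The sketch below illustrates this detect-crop-recognize pipeline. It is only a minimal illustration: `detect_mainbody` and `recognize` are hypothetical placeholders standing in for the detection and recognition models, not real PaddleClas APIs.

```python
import cv2  # OpenCV, used here only to read and crop the image

def recognize_image(image_path, detect_mainbody, recognize):
    """Detect mainbody boxes, crop each region, and recognize the crops."""
    image = cv2.imread(image_path)
    results = []
    # 1. Detect mainbody objects; assume integer (x1, y1, x2, y2) boxes.
    for x1, y1, x2, y2 in detect_mainbody(image):
        # 2. Crop the corresponding region from the original image.
        crop = image[y1:y2, x1:x2]
        # 3. Run recognition on the cropped region.
        results.append(recognize(crop))
    return results
```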
This tutorial will introduce the dataset and model training for mainbody detection in PaddleClas.
The datasets we used for mainbody detection task are shown in the following table.
| Dataset | Image number | Image number used in mainbody detection | Scenarios | Dataset link |
| --- | --- | --- | --- | --- |
| Objects365 | 1.7M | 6k | General Scenarios | link |
| COCO2017 | 120k | 5k | General Scenarios | link |
| iCartoonFace | 2k | 2k | Cartoon Face | link |
| LogoDet-3k | 3k | 2k | Logo | link |
| RPC | 3k | 3k | Product | link |
In the actual training process, all the datasets are mixed together. The categories of all the labeled boxes are modified to the category `foreground`, and the detection model we trained contains just one category (`foreground`).
There are many types of object detection methods, such as the commonly used two-stage detectors (the Faster R-CNN series, etc.), single-stage detectors (YOLO, SSD, etc.), and anchor-free detectors (FCOS, etc.).
PP-YOLO is proposed by PaddleDetection. It deeply optimizes the YOLOv3 model from multiple perspectives such as the backbone, data augmentation, regularization strategy, loss function, and post-processing, and it reaches a state-of-the-art speed-accuracy trade-off. The specific optimization strategies are as follows.
- Better backbone: ResNet50vd-DCN
- Larger training batch size: 8 GPUs and mini-batch size as 24 on each GPU
- Drop Block
- Exponential Moving Average
- IoU Loss
- Grid Sensitive
- Matrix NMS
- CoordConv
- Spatial Pyramid Pooling
- Better ImageNet pretrain weights
For more information about PP-YOLO, you can refer to the PP-YOLO tutorial.
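Most of these tricks are implemented inside PaddleDetection itself and need no extra work from the user. As a rough, framework-agnostic illustration of one of them, the sketch below shows the idea behind Exponential Moving Average of model weights; it is not PaddleDetection's actual implementation, and the `decay` value is only an assumption.

```python
# A rough illustration of Exponential Moving Average (EMA) over model weights.
# NOT PaddleDetection's implementation; weights are represented as a plain dict
# of floats to keep the idea visible.
class WeightEMA:
    def __init__(self, weights, decay=0.9998):  # decay value is an assumption
        self.decay = decay
        # shadow copy that gets smoothed over training steps
        self.shadow = {name: float(value) for name, value in weights.items()}

    def update(self, weights):
        # shadow = decay * shadow + (1 - decay) * current
        for name, value in weights.items():
            self.shadow[name] = self.decay * self.shadow[name] + (1.0 - self.decay) * float(value)

# Typical usage: call update() after every optimizer step and evaluate/export
# with the shadow weights instead of the raw ones.
```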
In the mainbody detection task, we use `ResNet50vd-DCN` as our backbone for better performance. The model is trained with the config file `ppyolov2_r50vd_dcn_365e_coco.yml`, in which the dataset path is modified to the mainbody detection dataset.
The final inference model can be downloaded here.
This section mainly talks about how to train your own mainbody detection model using PaddleDetection on your own dataset.
Download PaddleDetection and install the requirements.
```bash
cd <path/to/clone/PaddleDetection>
git clone https://github.com/PaddlePaddle/PaddleDetection.git

cd PaddleDetection
# install requirements
pip install -r requirements.txt
```
For more installation details, please refer to the Installation tutorial.
For a customized dataset, you should convert it to COCO format. Please refer to the Customized Dataset tutorial to build your own dataset in COCO format.
In the mainbody detection task, all the objects belong to the foreground. Therefore, the `category_id` of all the objects in the annotation file should be modified to 1, and the `categories` map should be modified as follows, in which just the class `foreground` is included.
```
[{u'id': 1, u'name': u'foreground', u'supercategory': u'foreground'}]
```
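If your annotations are already in COCO format, a small script such as the sketch below can rewrite them to the single `foreground` class; the annotation file paths are hypothetical, so adjust them to your own dataset.

```python
import json

# hypothetical input/output paths; replace them with your own annotation files
src_path = "annotations/instances_train.json"
dst_path = "annotations/instances_train_foreground.json"

with open(src_path) as f:
    coco = json.load(f)

# keep only the single "foreground" category
coco["categories"] = [{"id": 1, "name": "foreground", "supercategory": "foreground"}]

# every labeled box becomes category 1 ("foreground")
for ann in coco["annotations"]:
    ann["category_id"] = 1

with open(dst_path, "w") as f:
    json.dump(coco, f)
```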
You can use `configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml` to train the model; more details are as follows.

`ppyolov2_r50vd_dcn_365e_coco.yml` depends on other configuration files, whose meanings are as follows.
- `coco_detection.yml`: `num_classes` of the model and the train/eval/test datasets.
- `runtime.yml`: public runtime parameters, such as `use_gpu`, `save_interval`, etc.
- `optimizer_365e.yml`: learning rate and optimizer settings.
- `ppyolov2_r50vd_dcn.yml`: model architecture.
- `ppyolov2_reader.yml`: train/eval/test data readers.
In the mainbody detection task, you need to modify `num_classes` in `datasets/coco_detection.yml` to 1 (just `foreground` is included). The dataset paths should also be updated.
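If you prefer to script this change rather than edit the file by hand, the sketch below patches `num_classes` with a simple text substitution; the config path is an assumption based on a standard PaddleDetection checkout, so adjust it if your layout differs. The dataset paths in the same file still need to be pointed at your own data.

```python
import re
from pathlib import Path

# assumed location inside the PaddleDetection repository; adjust if needed
cfg_path = Path("configs/datasets/coco_detection.yml")

text = cfg_path.read_text()
# set num_classes to 1: only the single "foreground" class is used
text = re.sub(r"num_classes:\s*\d+", "num_classes: 1", text, count=1)
cfg_path.write_text(text)
```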
PaddleDetection supports several training modes.
- Training with a single GPU

```bash
# not needed on Windows and Mac
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml
```
- Training with multiple GPUs

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --eval
```

`--eval`: run evaluation during training.
- (Recommended) Model fine-tuning: if you want to fine-tune the model on your own dataset, you can run the following command to train the model.

```bash
export CUDA_VISIBLE_DEVICES=0
# assign pretrain_weights, load the general mainbody-detection pretrained model
python tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml -o pretrain_weights=https://paddle-imagenet-models-name.bj.bcebos.com/dygraph/rec/models/pretrain/ppyolov2_r50vd_dcn_mainbody_v1.0_pretrained.pdparams
```
- Resume training: you can use `-r` to load checkpoints and resume training.

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --eval -r output/ppyolov2_r50vd_dcn_365e_coco/10000
```
Note: if an `out of memory` error occurs, you can try to decrease `batch_size` in `ppyolov2_reader.yml`.
Use the following command to run the prediction process.
```bash
export CUDA_VISIBLE_DEVICES=0
python tools/infer.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --infer_img=your_image_path.jpg --output_dir=infer_output/ --draw_threshold=0.5 -o weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final
```
`--draw_threshold` is an optional parameter.
Use the following command to export the inference model.
```bash
python tools/export_model.py -c configs/ppyolo/ppyolov2_r50vd_dcn_365e_coco.yml --output_dir=./inference -o weights=output/ppyolov2_r50vd_dcn_365e_coco/model_final.pdparams
```
The inference model will be saved in the folder `inference/ppyolov2_r50vd_dcn_365e_coco`, which contains `model.pdiparams`, `model.pdiparams.info`, `model.pdmodel` and `infer_cfg.yml` (optional for mainbody detection).
- Note: The inference model files exported by PaddleDetection are named `model.xxx`. If you want to keep the naming consistent with PaddleClas, you can rename `model.xxx` to `inference.xxx` for subsequent inference, as shown in the sketch below.
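A minimal sketch of this renaming step is shown below; the folder name follows the export command above, so adjust it if you used a different `--output_dir`.

```python
from pathlib import Path

# folder produced by tools/export_model.py above; adjust if you changed --output_dir
export_dir = Path("inference/ppyolov2_r50vd_dcn_365e_coco")

# rename model.* to inference.* so the files match the naming PaddleClas expects
for old in export_dir.glob("model.*"):
    new_name = old.name.replace("model.", "inference.", 1)
    old.rename(old.with_name(new_name))
```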
For more details about model export, please refer to the EXPORT_MODEL tutorial.
Now you have the latest model trained on your own dataset. In the recognition process, you can replace the detection model path with that of your own model. For a quick start on the recognition process, please refer to the tutorial.