
# Configuration Instruction


## Introduction

The parameters in the PaddleClas configuration files (`ppcls/configs/*.yaml`) are described here so that you can customize or modify the hyperparameter configuration more quickly.

## Details

### 1. Classification model

Here the configuration of ResNet50_vd on ImageNet-1k is used as an example to explain each parameter in detail. Configuration path.

#### 1.1 Global Configuration

| Parameter name | Specific meaning | Default value | Optional value |
| --- | --- | --- | --- |
| checkpoints | Checkpoint model path for resuming training | null | str |
| pretrained_model | Pre-trained model path | null | str |
| output_dir | Save model path | "./output/" | str |
| save_interval | How many epochs to save the model at each interval | 1 | int |
| eval_during_train | Whether to evaluate during training | True | bool |
| eval_interval | How many epochs to evaluate at each interval | 1 | int |
| epochs | Total number of training epochs | | int |
| print_batch_step | How many mini-batches to print a log at each interval | 10 | int |
| use_visualdl | Whether to visualize the training process with VisualDL | False | bool |
| image_shape | Image size | [3, 224, 224] | list, shape: (3,) |
| save_inference_dir | Inference model save path | "./inference" | str |
| eval_mode | Evaluation mode | "classification" | "retrieval" |

Note: The HTTP address of a pre-trained model can also be filled in `pretrained_model`.
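
Putting the parameters above together, a minimal `Global` section might look like the sketch below. The values are simply the defaults from the table; the `epochs` value is illustrative, and the actual config files may contain additional fields.

```yaml
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: "./output/"
  save_interval: 1
  eval_during_train: True
  eval_interval: 1
  epochs: 120            # illustrative value, set according to your training plan
  print_batch_step: 10
  use_visualdl: False
  image_shape: [3, 224, 224]
  save_inference_dir: "./inference"
  eval_mode: "classification"
```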

#### 1.2 Architecture

| Parameter name | Specific meaning | Default value | Optional value |
| --- | --- | --- | --- |
| name | Model architecture name | ResNet50 | PaddleClas model architectures |
| class_num | Number of categories | 1000 | int |
| pretrained | Pre-trained model | False | bool, str |

Note: Here `pretrained` can be set to True or False, or to the path of the pre-trained weights. In addition, `pretrained` is disabled when `Global.pretrained_model` is also set to a corresponding path.
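
A minimal `Arch` section assembled from the table above; the values shown are the defaults, with `pretrained` switched on as an example.

```yaml
Arch:
  name: ResNet50
  class_num: 1000
  pretrained: True   # can also be False, or a local/HTTP path to the weights
```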

#### 1.3 Loss function

| Parameter name | Specific meaning | Default value | Optional value |
| --- | --- | --- | --- |
| CELoss | Cross-entropy loss function | —— | —— |
| CELoss.weight | The weight of CELoss in the whole loss | 1.0 | float |
| CELoss.epsilon | The epsilon value of label smoothing in CELoss | 0.1 | float, between 0 and 1 |
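
For reference, a `Loss` section using CELoss with label smoothing might be written as the sketch below; training and evaluation losses are configured separately, and the nesting may differ slightly between config files.

```yaml
Loss:
  Train:
    - CELoss:
        weight: 1.0
        epsilon: 0.1   # label-smoothing epsilon, between 0 and 1
  Eval:
    - CELoss:
        weight: 1.0
```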

#### 1.4 Optimizer

| Parameter name | Specific meaning | Default value | Optional value |
| --- | --- | --- | --- |
| name | Optimizer name | "Momentum" | Other optimizers such as "RmsProp" |
| momentum | Momentum value | 0.9 | float |
| lr.name | Learning rate decay method | "Cosine" | Other decay methods such as "Linear" and "Piecewise" |
| lr.learning_rate | Initial learning rate | 0.1 | float |
| lr.warmup_epoch | Number of warmup epochs | 0 | int, such as 5 |
| regularizer.name | Regularization method name | "L2" | ["L1", "L2"] |
| regularizer.coeff | Regularization coefficient | 0.00007 | float |

Note: The additional parameters may differ depending on `lr.name`. For example, when `lr.name=Piecewise`, the following parameters need to be added:

```yaml
  lr:
    name: Piecewise
    learning_rate: 0.1
    decay_epochs: [30, 60, 90]
    values: [0.1, 0.01, 0.001, 0.0001]
```

Refer to learning_rate.py for how to add new methods and parameters.
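
Similarly, the default Momentum/Cosine combination from the table can be written as the following sketch; the warmup value of 5 is illustrative, and 0 disables warmup.

```yaml
Optimizer:
  name: Momentum
  momentum: 0.9
  lr:
    name: Cosine
    learning_rate: 0.1
    warmup_epoch: 5
  regularizer:
    name: "L2"
    coeff: 0.00007
```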

#### 1.5 Data reading module (DataLoader)

##### 1.5.1 dataset

| Parameter name | Specific meaning | Default value | Optional value |
| --- | --- | --- | --- |
| name | The name of the dataset class used to read data | ImageNetDataset | VeriWild and other Dataset classes |
| image_root | The root path where the dataset images are stored | ./dataset/ILSVRC2012/ | str |
| cls_label_path | Data label list | ./dataset/ILSVRC2012/train_list.txt | str |
| transform_ops | Data preprocessing for single images | —— | —— |
| batch_transform_ops | Data preprocessing for batches of images | —— | —— |

The meaning of the parameters in transform_ops:

| Function name | Parameter name | Specific meaning |
| --- | --- | --- |
| DecodeImage | to_rgb | Convert data to RGB |
| | channel_first | Store image data in CHW order |
| RandCropImage | size | Random crop size |
| RandFlipImage | | Random flip |
| NormalizeImage | scale | Normalization scale value |
| | mean | Normalization mean value |
| | std | Normalization standard deviation |
| | order | Normalization order |
| CropImage | size | Crop size |
| ResizeImage | resize_short | Resize by the short edge |

The meaning of the parameters in batch_transform_ops:

| Function name | Parameter name | Specific meaning |
| --- | --- | --- |
| MixupOperator | alpha | Mixup parameter; the larger the value, the stronger the augmentation |

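Combining the dataset and transform tables above, a training `dataset` entry might look like the following sketch. The concrete operator list and the values (crop size, normalization constants, Mixup `alpha`) are illustrative and depend on the specific config file.

```yaml
DataLoader:
  Train:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/ILSVRC2012/
      cls_label_path: ./dataset/ILSVRC2012/train_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
      batch_transform_ops:
        - MixupOperator:
            alpha: 0.2   # illustrative; a larger alpha means stronger augmentation
```
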
##### 1.5.2 sampler

| Parameter name | Specific meaning | Default value | Optional value |
| --- | --- | --- | --- |
| name | Sampler type | DistributedBatchSampler | DistributedRandomIdentitySampler and other Samplers |
| batch_size | Batch size | 64 | int |
| drop_last | Whether to drop the last batch that does not reach batch_size | False | bool |
| shuffle | Whether to shuffle the data | True | bool |

##### 1.5.3 loader

| Parameter name | Specific meaning | Default value | Optional value |
| --- | --- | --- | --- |
| num_workers | Number of data-loading worker threads | 4 | int |
| use_shared_memory | Whether to use shared memory | True | bool |
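
A sketch of the corresponding sampler and loader sub-sections, using the default values from the tables above (they sit at the same nesting level as `dataset` inside `DataLoader.Train`):

```yaml
    sampler:
      name: DistributedBatchSampler
      batch_size: 64
      drop_last: False
      shuffle: True
    loader:
      num_workers: 4
      use_shared_memory: True
```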

#### 1.6 Evaluation metric

| Parameter name | Specific meaning | Default value | Optional value |
| --- | --- | --- | --- |
| TopkAcc | Top-k accuracy | [1, 5] | list, int |
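
A minimal sketch of the corresponding `Metric` section, assuming top-1 and top-5 accuracy are evaluated for both training and evaluation:

```yaml
Metric:
  Train:
    - TopkAcc:
        topk: [1, 5]
  Eval:
    - TopkAcc:
        topk: [1, 5]
```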

#### 1.7 Inference

| Parameter name | Specific meaning | Default value | Optional value |
| --- | --- | --- | --- |
| infer_imgs | Path of the image(s) to be inferred | docs/images/whl/demo.jpg | str |
| batch_size | Batch size | 10 | int |
| PostProcess.name | Post-processing name | Topk | str |
| PostProcess.topk | Top-k value | 5 | int |
| PostProcess.class_id_map_file | Mapping file between class id and name | ppcls/utils/imagenet1k_label_list.txt | str |

Note: The transforms in the Infer module are interpreted in the same way as transform_ops in the dataset of the data reading module.
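
A sketch of the `Infer` section built from the table above; the transforms list is omitted here and follows the same format as the transform_ops described earlier.

```yaml
Infer:
  infer_imgs: docs/images/whl/demo.jpg
  batch_size: 10
  # transforms: ...   # same format as transform_ops in the dataset section
  PostProcess:
    name: Topk
    topk: 5
    class_id_map_file: ppcls/utils/imagenet1k_label_list.txt
```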

### 2. Distillation model

Note: Here the training configuration for distilling MobileNetV3_small_x1_0 from MobileNetV3_large_x1_0 on ImageNet-1k is used as an example to explain the meaning of each parameter in detail. Configuration path. Only parameters that differ from the classification model are introduced here.

#### 2.1 Architecture

| Parameter name | Specific meaning | Default value | Optional value |
| --- | --- | --- | --- |
| name | Model architecture name | DistillationModel | —— |
| class_num | Number of categories | 1000 | int |
| freeze_params_list | List of flags specifying whether each model's parameters are frozen | [True, False] | list |
| models | Model list | [Teacher, Student] | list |
| Teacher.name | Teacher model name | MobileNetV3_large_x1_0 | PaddleClas model |
| Teacher.pretrained | Teacher model pre-trained weights | True | Boolean or pre-trained weight path |
| Teacher.use_ssld | Whether the teacher model's pre-trained weights are SSLD weights | True | Boolean |
| infer_model_name | Which model is used during inference | Student | Teacher |

Note:

1. A list is represented in yaml as follows:

   ```yaml
   freeze_params_list:
   - True
   - False
   ```

2. The Student's parameters are similar and will not be repeated here.
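
Putting the distillation parameters together, the `Arch` section might be sketched as follows. The Student sub-section mirrors the Teacher one, and the Student `pretrained` value shown here is illustrative.

```yaml
Arch:
  name: DistillationModel
  class_num: 1000
  freeze_params_list:
  - True     # Teacher parameters are frozen
  - False    # Student parameters are trainable
  models:
    - Teacher:
        name: MobileNetV3_large_x1_0
        pretrained: True
        use_ssld: True
    - Student:
        name: MobileNetV3_small_x1_0
        pretrained: False   # illustrative; the student is usually trained from scratch
  infer_model_name: "Student"
```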

#### 2.2 Loss function

| Parameter name | Specific meaning | Default value | Optional value |
| --- | --- | --- | --- |
| DistillationCELoss | Distillation cross-entropy loss between models | —— | —— |
| DistillationCELoss.weight | Loss weight | 1.0 | float |
| DistillationCELoss.model_name_pairs | Model name pairs between which the loss is computed | ["Student", "Teacher"] | —— |
| DistillationGTCELoss | Distillation cross-entropy loss between a model and the ground-truth label | —— | —— |
| DistillationGTCELoss.weight | Loss weight | 1.0 | float |
| DistillationGTCELoss.model_names | Names of the models whose outputs are compared with the ground-truth label | ["Student"] | —— |
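
A sketch of the corresponding training `Loss` section, combining the distillation loss between Student and Teacher with the cross-entropy loss between the Student and the ground-truth label:

```yaml
Loss:
  Train:
    - DistillationCELoss:
        weight: 1.0
        model_name_pairs:
        - ["Student", "Teacher"]
    - DistillationGTCELoss:
        weight: 1.0
        model_names: ["Student"]
```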

#### 2.3 Evaluation metric

| Parameter name | Specific meaning | Default value | Optional value |
| --- | --- | --- | --- |
| DistillationTopkAcc | DistillationTopkAcc, including model_key and topk | —— | —— |
| DistillationTopkAcc.model_key | The model to be evaluated | "Student" | "Teacher" |
| DistillationTopkAcc.topk | Top-k value | [1, 5] | list, int |

Note: DistillationTopkAcc has the same meaning as TopkAcc, except that it is only used in distillation tasks.

### 3. Recognition model

Note: The training configuration of ResNet50 on LogoDet-3k is used here as an example to explain the meaning of each parameter in detail. Configuration path. Only parameters that differ from the classification model are presented here.

#### 3.1 Architecture

| Parameter name | Specific meaning | Default value | Optional value |
| --- | --- | --- | --- |
| name | Model architecture | "RecModel" | ["RecModel"] |
| infer_output_key | Output used during inference | "feature" | ["feature", "logits"] |
| infer_add_softmax | Whether to add softmax to the inference output | False | [True, False] |
| Backbone.name | Backbone name | ResNet50_last_stage_stride1 | Other backbones provided by PaddleClas |
| Backbone.pretrained | Backbone pre-trained model | True | Boolean value or pre-trained model path |
| BackboneStopLayer.name | The name of the output layer in Backbone | True | The full_name of the feature output layer in Backbone |
| Neck.name | The name of the Neck part | VehicleNeck | A dictionary structure to be passed in, i.e. the specific input parameters of the Neck network layer |
| Neck.in_channels | Input dimension of the Neck part | 2048 | Consistent with the output of BackboneStopLayer.name |
| Neck.out_channels | Output dimension of the Neck part, i.e. the feature dimension | 512 | int |
| Head.name | Name of the network Head part | CircleMargin | ArcMargin, etc. |
| Head.embedding_size | Feature dimension | 512 | Consistent with Neck.out_channels |
| Head.class_num | Number of classes | 3000 | int |
| Head.margin | Margin value in CircleMargin | 0.35 | float |
| Head.scale | Scale value in CircleMargin | 64 | int |

Note:

1. In PaddleClas, the Neck part is the connection between the Backbone and the embedding layer, and the Head part is the connection between the embedding layer and the classification layer.

2. BackboneStopLayer.name can be obtained by visualizing the model; visualization can be done with Netron or VisualDL.

3. Calling tools/export_model.py converts the model weights to an inference model, where the infer_add_softmax parameter controls whether a Softmax activation function is appended. The default in the code is True (in classification tasks the last output layer is followed by a Softmax activation function). In recognition tasks the feature layer does not need an activation function, so it should be set to False here.
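
Assembled from the table above, a RecModel `Arch` section might be sketched as follows. The `BackboneStopLayer.name` value is a hypothetical placeholder and should be replaced with the full_name of the chosen feature output layer.

```yaml
Arch:
  name: RecModel
  infer_output_key: feature
  infer_add_softmax: False
  Backbone:
    name: ResNet50_last_stage_stride1
    pretrained: True
  BackboneStopLayer:
    name: "avg_pool"      # hypothetical; use the full_name of the feature output layer
  Neck:
    name: VehicleNeck
    in_channels: 2048
    out_channels: 512
  Head:
    name: CircleMargin
    embedding_size: 512
    class_num: 3000
    margin: 0.35
    scale: 64
```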

#### 3.2 Evaluation metric

| Parameter name | Specific meaning | Default value | Optional value |
| --- | --- | --- | --- |
| Recallk | Recall rate | [1, 5] | list, int |
| mAP | Mean average precision of retrieval | None | None |