The different setting methods of YOLOV5 training labels have a great impact on the test results. Why???? #7370

Peter-Pan-GitHub · 2022-04-11T03:47:28Z

Peter-Pan-GitHub
Apr 11, 2022

There are 400 pictures, each picture has 5 targets, that is, a total of 2000 labels. I used the following two methods to train, and after many training and test comparisons, I found that the test results of these two methods are very different:
The first way: 400 jpg image files + 400 txt tag files, that is, each txt tag file corresponds to an image, and each txt file contains 5 tags
The second way: 2000 jpg image files + 2000 txt label files, that is, I copy each picture into 5 copies, and rename them, and then generate 2000 txt label files correspondingly, each txt label file only label one target object for each picture.

Please do not consider it's superfluous, my actual request is like this. What I understand is that these two training methods should have similar effects, and there is not much difference between the conclusions. Why is there a big difference in the actual test? ? ? ?

The test effect of the first method is much better than the second. The first way: P, R, and MAP are all high; the second way: only R is high, and both P and MAP are low. When using the second method for testing, there is almost no target frame. Even if there is a target frame, there is only one target frame, and the accuracy rate is very low; But the first method will frame all the targets, and the accurate rate is also very high.

glenn-jocher · 2022-04-11T08:33:34Z

glenn-jocher
Apr 11, 2022
Maintainer

@Peter-Pan-GitHub 👋 Hello! Thanks for asking about YOLOv5 🚀 dataset formatting. To train correctly your data must be in YOLOv5 format. Please see our Train Custom Data tutorial for full documentation on dataset setup and all steps required to start training your first model. A few excerpts from the tutorial:

1.1 Create dataset.yaml

COCO128 is an example small tutorial dataset composed of the first 128 images in COCO train2017. These same 128 images are used for both training and validation to verify our training pipeline is capable of overfitting. data/coco128.yaml, shown below, is the dataset config file that defines 1) the dataset root directory path and relative paths to train / val / test image directories (or *.txt files with image paths), 2) the number of classes nc and 3) a list of class names:

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco128  # dataset root dir
train: images/train2017  # train images (relative to 'path') 128 images
val: images/train2017  # val images (relative to 'path') 128 images
test:  # test images (optional)

# Classes
nc: 80  # number of classes
names: [ 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
         'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
         'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
         'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
         'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
         'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
         'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
         'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
         'hair drier', 'toothbrush' ]  # class names

1.2 Create Labels

After using a tool like Roboflow Annotate to label your images, export your labels to YOLO format, with one *.txt file per image (if no objects in image, no *.txt file is required). The *.txt file specifications are:

One row per object
Each row is class x_center y_center width height format.
Box coordinates must be in normalized xywh format (from 0 - 1). If your boxes are in pixels, divide x_center and width by image width, and y_center and height by image height.
Class numbers are zero-indexed (start from 0).

The label file corresponding to the above image contains 2 persons (class 0) and a tie (class 27):

1.3 Organize Directories

Organize your train and val images and labels according to the example below. YOLOv5 assumes /coco128 is inside a /datasets directory next to the /yolov5 directory. YOLOv5 locates labels automatically for each image by replacing the last instance of /images/ in each image path with /labels/. For example:

../datasets/coco128/images/im0.jpg  # image
../datasets/coco128/labels/im0.txt  # label

Good luck 🍀 and let us know if you have any other questions!

0 replies

oddmanru · 2023-02-23T01:30:28Z

oddmanru
Feb 23, 2023

In my view, I prefer to use the first method that you have mentioned above as the object detector generalized better under such a circumstance than the latter. The training of classification with multiple labels typically converges better than do it with a single label. Moreover, a poor skill of annotation also is deemed as a factor that might lead to lower accuracy because the content of background might be taken into consideration during the training process.

0 replies

braindevices · 2023-02-26T00:20:40Z

braindevices
Feb 26, 2023

of course, you get this result. Basically Your 2nd way to prepare the training set introducing lots of contradictions.

The yolo learns the image like a sliding window. You can think of an example you have exactly 2 objects in each image, with equally sized. Basically you have 1 positive example and 1 negative example per each image per class. Your 1st way is correct.

However, your 2nd way actually give 2 conflicted result. Say in copy1 you have 1 positive example for object A, but you then label the B as negative. in copy2 you reversed it. Thus you feed into the mode 2 exact same examples per class, but you label them as 1 negative and 1 positive. Of course you model is going to learn nothing about it.

3 replies

oddmanru Feb 26, 2023

Theoretically, The second way that made 5 copies for each target is equivalent to the way that crops each ROI and get them labeled in the same image one after another. The issue that the host was having is that he/she found the training result is different than the result done in the first method, which has trianed all rois together. Apparently, it is not a binary classification but multi-class problem. The issue is why the result is different when the rois are training altogehter than when they are training individually. This is what I understood.

healthmatrice Feb 26, 2023

no you understand it wrong. He did not crop. He simple draw bbox in copy1 for A, bbox in copy2 for B ... this is conflicted training sample equal to binary noise as @braindevices mentioned.

oddmanru Feb 27, 2023

Unless Yolov5 has any special approch that is doing against this way, in my view, drawing or cropping one target's bbox in each duplicated images won't generate binary noise which typically was caused by random annotations. The second method that the host has done should give the net some useful information to learn since the targets were annoated and labeled. But I agree that annotating and labeling one target each on the duplicated images is not a good pratice for obtaining an accurate model easily. Maybe the host need more epochs to train or redo the annoation on each image, this method might also cause overfitting problem which generates a biased model. On the other hand, this method can be deemed as a kind of transformation that pushes the net to weigh in different weights on the same targets, which is totally not necessary. It is an interesting question but might not be worth it to try.

ManosMpampis · 2023-08-24T10:25:33Z

ManosMpampis
Aug 24, 2023

As long as I know, all fully convolutional network architectures work with an assigner to assign the labels in the correct feature at the last feature map of the network. To be balance the assigner must have both positive and negative examples in each input. So, in Yolo all anchors that have a big enough IoU with ROI is labeled are positives or class-i examples and all others as negative or background examples.
In 2nd way, he/she correctly give the positive examples each time but the assigner saw all other objects (that were not annotated at the given input) as negative - background examples. So it learned both that it wasn't any class (in 4 of the 5 inputs) and that it is a specific class (in 1 of the 5). Models can not understand that they see the same input and add all corresponding labels into the equation. The only way to train a model with these labels was that to add a weight to the positive examples *5. This weight must be at the loss calculation and help the model to "walk" 5 steps into the correct way and then 4 to the wrong way, when the model see a positive example. This is also bad in a lot situations because you do not know how many anchors saw each annotation but it will be better.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The different setting methods of YOLOV5 training labels have a great impact on the test results. Why???? #7370

{{title}}

Replies: 4 comments 3 replies

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

Select a reply

The different setting methods of YOLOV5 training labels have a great impact on the test results. Why???? #7370

Peter-Pan-GitHub Apr 11, 2022

Replies: 4 comments · 3 replies

glenn-jocher Apr 11, 2022 Maintainer

1.1 Create dataset.yaml

1.2 Create Labels

1.3 Organize Directories

oddmanru Feb 23, 2023

braindevices Feb 26, 2023

oddmanru Feb 26, 2023

healthmatrice Feb 26, 2023

oddmanru Feb 27, 2023

ManosMpampis Aug 24, 2023

Peter-Pan-GitHub
Apr 11, 2022

Replies: 4 comments 3 replies

glenn-jocher
Apr 11, 2022
Maintainer

oddmanru
Feb 23, 2023

braindevices
Feb 26, 2023

ManosMpampis
Aug 24, 2023