Analyzing model performance on nuScenes validation Set #11
@sunbin1357, @rb93dett Hi, I have a similar evaluation result. Have you figured out the reason?
=========== F1 Score ===========
KEEP: 0.7856961704286076
ACCELERATE: 0
DECELERATE: 0.021638330757341576
STOP: 0.31512868801004396
RIGHT_TURN: 0
LEFT_TURN: 0.07772795216741404
STRAIGHT: 0.6809549647314163
================================
Total Number: 5119
Correct Number: 952
------------------------------
Planning Accuracy: 18.60%
------------------------------
KEEP_RIGHT_TURN: num: 187, correct num: 0, 0.00%
KEEP_LEFT_TURN: num: 139, correct num: 101, 72.66%
KEEP_STRAIGHT: num: 3704, correct num: 593, 16.01%
ACCELERATE_RIGHT_TURN: num: 3, correct num: 0, 0.00%
ACCELERATE_LEFT_TURN: num: 4, correct num: 0, 0.00%
ACCELERATE_STRAIGHT: num: 62, correct num: 0, 0.00%
DECELERATE_STRAIGHT: num: 45, correct num: 7, 15.56%
STOP_STRAIGHT: num: 975, correct num: 251, 25.74%
Eval results saved to result/eval_result.json
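For reference, the per-command counts in the pasted log are internally consistent with the headline numbers. A minimal sketch (counts hand-copied from the output above, not read from eval_result.json) to cross-check:

```python
# Cross-check the reported planning accuracy from the per-command
# counts listed in the eval log above (values copied by hand).
per_class = {
    "KEEP_RIGHT_TURN":       (187, 0),
    "KEEP_LEFT_TURN":        (139, 101),
    "KEEP_STRAIGHT":         (3704, 593),
    "ACCELERATE_RIGHT_TURN": (3, 0),
    "ACCELERATE_LEFT_TURN":  (4, 0),
    "ACCELERATE_STRAIGHT":   (62, 0),
    "DECELERATE_STRAIGHT":   (45, 7),
    "STOP_STRAIGHT":         (975, 251),
}

total = sum(n for n, _ in per_class.values())
correct = sum(c for _, c in per_class.values())
accuracy = 100.0 * correct / total

print(f"Total Number: {total}")               # 5119, matching the log
print(f"Correct Number: {correct}")           # 952, matching the log
print(f"Planning Accuracy: {accuracy:.2f}%")  # 18.60%, matching the log
```

So the low accuracy is not an aggregation bug in the printout; it reflects the per-command hit rates themselves (e.g. 0% on all RIGHT_TURN and ACCELERATE cases).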
The model weights we provide are pretrained on the DriveX dataset for scene understanding, to facilitate fine-tuning and transfer to other specific datasets. We recommend fine-tuning the model on the generated nuScenes data to obtain the best results. I will add a more detailed explanation in future updates. Thank you for pointing this out!
@rb93dett, thanks for your reply. Would it be possible to provide the model weights fine-tuned on the nuScenes dataset? This would be helpful for benchmarking.
Thanks for your great work. Based on your Hugging Face model, I evaluated it on the nuScenes validation set, and the metrics are as follows.
The command I ran is:
python eval_tools/senna_plan_cmd_eval_multi_img.py
The overall evaluation result is quite low. Is it that the provided model is not the optimal one, or did I make a mistake in my setup?