Analyzing model performance on nuScenes validation Set #11

Open
sunbin1357 opened this issue Dec 16, 2024 · 3 comments

@sunbin1357

sunbin1357 commented Dec 16, 2024

Thanks for your great work. Using your Hugging Face model, I evaluated it on the nuScenes validation set and got the following metrics.

=========== F1 Score ===========


KEEP: 0.7797042325344211
ACCELERATE: 0
DECELERATE: 0.02052785923753666
STOP: 0.31730168644597123
RIGHT_TURN: 0
LEFT_TURN: 0.0787878787878788
STRAIGHT: 0.6873650107991361


================================



Total Number: 5119


Correct Number: 954


------------------------------


Planning Accuracy: 18.64%


------------------------------

KEEP_RIGHT_TURN: num: 187, correct num: 0, 0.00%
KEEP_LEFT_TURN: num: 139, correct num: 101, 72.66%
KEEP_STRAIGHT: num: 3704, correct num: 592, 15.98%
ACCELERATE_RIGHT_TURN: num: 3, correct num: 0, 0.00%
ACCELERATE_LEFT_TURN: num: 4, correct num: 0, 0.00%
ACCELERATE_STRAIGHT: num: 62, correct num: 0, 0.00%
DECELERATE_STRAIGHT: num: 45, correct num: 7, 15.56%
STOP_STRAIGHT: num: 975, correct num: 254, 26.05%

The command I ran was python eval_tools/senna_plan_cmd_eval_multi_img.py
The overall planning accuracy is quite low. Is the provided model not the best checkpoint, or did I make a mistake when running the evaluation?
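
As a sanity check on the log above, the overall planning accuracy can be recomputed from the per-class breakdown. The snippet below is a minimal sketch, not part of the evaluation script; the dictionary name `breakdown` is made up for illustration, and the numbers are copied from the output above.

```python
# Per-class (num, correct num) pairs copied from the evaluation log above.
# The variable name `breakdown` is illustrative, not from the eval script.
breakdown = {
    "KEEP_RIGHT_TURN": (187, 0),
    "KEEP_LEFT_TURN": (139, 101),
    "KEEP_STRAIGHT": (3704, 592),
    "ACCELERATE_RIGHT_TURN": (3, 0),
    "ACCELERATE_LEFT_TURN": (4, 0),
    "ACCELERATE_STRAIGHT": (62, 0),
    "DECELERATE_STRAIGHT": (45, 7),
    "STOP_STRAIGHT": (975, 254),
}

# Sum sample counts and correct counts over all classes.
total = sum(num for num, _ in breakdown.values())
correct = sum(ok for _, ok in breakdown.values())

# Overall accuracy as a percentage.
accuracy = 100.0 * correct / total
print(f"Total: {total}, correct: {correct}, accuracy: {accuracy:.2f}%")
```

The sums match the reported totals (5119 samples, 954 correct, 18.64%), so the headline number is consistent with the breakdown. KEEP_STRAIGHT alone accounts for 3704 of the 5119 samples, so its 15.98% accuracy dominates the overall figure.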

@xiaodongww

@sunbin1357, @rb93dett Hi, I got a similar evaluation result. Have you figured out the reason?

=========== F1 Score ===========


KEEP: 0.7856961704286076
ACCELERATE: 0
DECELERATE: 0.021638330757341576
STOP: 0.31512868801004396
RIGHT_TURN: 0
LEFT_TURN: 0.07772795216741404
STRAIGHT: 0.6809549647314163


================================



Total Number: 5119


Correct Number: 952


------------------------------


Planning Accuracy: 18.60%


------------------------------

KEEP_RIGHT_TURN: num: 187, correct num: 0, 0.00%
KEEP_LEFT_TURN: num: 139, correct num: 101, 72.66%
KEEP_STRAIGHT: num: 3704, correct num: 593, 16.01%
ACCELERATE_RIGHT_TURN: num: 3, correct num: 0, 0.00%
ACCELERATE_LEFT_TURN: num: 4, correct num: 0, 0.00%
ACCELERATE_STRAIGHT: num: 62, correct num: 0, 0.00%
DECELERATE_STRAIGHT: num: 45, correct num: 7, 15.56%
STOP_STRAIGHT: num: 975, correct num: 251, 25.74%

Eval results saved to result/eval_result.json
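
For readers wondering how a class such as RIGHT_TURN can end up with an F1 of exactly 0: the thread does not show how the evaluation script computes F1, but assuming it uses the standard per-class definition, F1 = 2·TP / (2·TP + FP + FN), the score is 0 whenever the model never predicts that class correctly. The sketch below uses a hypothetical helper `f1_for_class` and toy labels that are not taken from the dataset.

```python
def f1_for_class(labels, preds, cls):
    """Standard one-vs-rest F1 for a single class (hypothetical helper)."""
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, p in pairs if y == cls and p == cls)  # true positives
    fp = sum(1 for y, p in pairs if y != cls and p == cls)  # false positives
    fn = sum(1 for y, p in pairs if y == cls and p != cls)  # false negatives
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0 when the class never appears at all.
    return 2 * tp / denom if denom else 0.0


# Toy data for illustration only; not taken from nuScenes.
labels = ["STRAIGHT", "STRAIGHT", "RIGHT_TURN", "LEFT_TURN"]
preds = ["STRAIGHT", "LEFT_TURN", "STRAIGHT", "LEFT_TURN"]

for cls in ("STRAIGHT", "RIGHT_TURN", "LEFT_TURN"):
    print(cls, round(f1_for_class(labels, preds, cls), 3))
```

In the toy data the single RIGHT_TURN sample is predicted as STRAIGHT, so RIGHT_TURN has no true positives and its F1 is 0, which mirrors the 0 correct out of 187 KEEP_RIGHT_TURN samples in the log.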

@rb93dett
Collaborator

The model weights we provide are pretrained on the DriveX dataset for scene understanding, to facilitate fine-tuning and transfer to other specific datasets. To obtain the best results, we recommend fine-tuning the model on the generated nuScenes data. I will add a more detailed explanation in future updates. Thank you for pointing this out!

@xiaodongww

@rb93dett, thanks for your reply. Would it be possible to provide the model weights fine-tuned on the nuScenes dataset? That would be helpful for benchmarking.
