Analyzing model performance on nuScenes validation Set #11
@sunbin1357, @rb93dett Hi, I have a similar evaluation result. Have you figured out the reason?
=========== F1 Score ===========
KEEP: 0.7856961704286076
ACCELERATE: 0
DECELERATE: 0.021638330757341576
STOP: 0.31512868801004396
RIGHT_TURN: 0
LEFT_TURN: 0.07772795216741404
STRAIGHT: 0.6809549647314163
================================
Total Number: 5119
Correct Number: 952
------------------------------
Planning Accuracy: 18.60%
------------------------------
KEEP_RIGHT_TURN: num: 187, correct num: 0, 0.00%
KEEP_LEFT_TURN: num: 139, correct num: 101, 72.66%
KEEP_STRAIGHT: num: 3704, correct num: 593, 16.01%
ACCELERATE_RIGHT_TURN: num: 3, correct num: 0, 0.00%
ACCELERATE_LEFT_TURN: num: 4, correct num: 0, 0.00%
ACCELERATE_STRAIGHT: num: 62, correct num: 0, 0.00%
DECELERATE_STRAIGHT: num: 45, correct num: 7, 15.56%
STOP_STRAIGHT: num: 975, correct num: 251, 25.74%
Eval results saved to result/eval_result.json
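For reference, the per-command counts in the pasted log are internally consistent with the headline numbers. A minimal sketch (counts hand-copied from the output above, not read from eval_result.json) to cross-check:

```python
# Cross-check the reported planning accuracy from the per-command
# counts listed in the eval log above (values copied by hand).
per_class = {
    "KEEP_RIGHT_TURN":       (187, 0),
    "KEEP_LEFT_TURN":        (139, 101),
    "KEEP_STRAIGHT":         (3704, 593),
    "ACCELERATE_RIGHT_TURN": (3, 0),
    "ACCELERATE_LEFT_TURN":  (4, 0),
    "ACCELERATE_STRAIGHT":   (62, 0),
    "DECELERATE_STRAIGHT":   (45, 7),
    "STOP_STRAIGHT":         (975, 251),
}

total = sum(n for n, _ in per_class.values())
correct = sum(c for _, c in per_class.values())
accuracy = 100.0 * correct / total

print(f"Total Number: {total}")               # 5119, matching the log
print(f"Correct Number: {correct}")           # 952, matching the log
print(f"Planning Accuracy: {accuracy:.2f}%")  # 18.60%, matching the log
```

So the low accuracy is not an aggregation bug in the printout; it reflects the per-command hit rates themselves (e.g. 0% on all RIGHT_TURN and ACCELERATE cases).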
The model weights we provide are pretrained on the DriveX dataset for scene understanding, to facilitate fine-tuning and transfer to other specific datasets. We recommend fine-tuning the model on the generated nuScenes data to obtain the best results. I will add a more detailed explanation in future updates. Thank you for pointing this out!
@rb93dett, thanks for your reply. Would it be possible to provide the model weights fine-tuned on the nuScenes dataset? This would be helpful for benchmarking.
Thanks for your great work. Based on your Hugging Face model, I evaluated it on the nuScenes validation set, and the metrics are as follows.
The command I ran is:
python eval_tools/senna_plan_cmd_eval_multi_img.py
The overall evaluation result is quite low. Is it that the provided model is not the optimal one, or did I make a mistake in my setup?