Privately Owned Vehicle Work Group Meeting - 2025/03/03 - Slot 1 #5832
m-zain-khawaja started this conversation in Working group meetings
Agenda
Scene3D training
@m-zain-khawaja:
The fourth-phase Scene3D network suffered from over-fitting on the training data: although performance was good on the KITTI and DDAD datasets, the network produced artefacts in the depth estimation of 'in the wild' road-scene images captured by dashcams. Based on further investigation, the likely cause was the simulator-to-real gap induced by relying on a mixture of simulator images. To remedy this, the network should be trained only on real-world data. However, the limitation of real-world data is that it is imprecise and noisy. Therefore, the Scene3D network can be trained with a best-of-both-worlds approach:
Create a diverse dataset of only real-world images, with relative depth estimation ground truth generated by a SOTA model such as DepthAnythingV2, and perform model distillation to obtain a fine-tuned, robust and generalizable relative depth estimator which works reliably across diverse real-world scenes and corner case scenarios (a rough sketch of such a distillation loss follows this list).
Use real-world datasets with ground-truth LIDAR to fine-tune the relative depth estimator model - at this stage it would be worthwhile to consider whether sparse supervision can be included, for example from RADAR, to help the network recover the true scene scale in real time.
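As a minimal sketch of what step 1 could look like, assuming a PyTorch setup and a MiDaS-style scale-and-shift-invariant L1 loss against DepthAnythingV2 pseudo-labels (the loss choice and function names here are illustrative, not the agreed implementation):

```python
import torch

def scale_shift_align(pred, target, mask):
    # Per-image least-squares alignment of the student prediction to the teacher
    # pseudo-label (scale + shift), since both are only relative depth maps.
    # pred, target: (B, H, W); mask: (B, H, W) bool of valid pixels.
    aligned = torch.empty_like(pred)
    for i in range(pred.shape[0]):
        p = pred[i][mask[i]]                                   # (N,)
        g = target[i][mask[i]]                                 # (N,)
        A = torch.stack([p, torch.ones_like(p)], dim=1)        # (N, 2)
        sol = torch.linalg.lstsq(A, g.unsqueeze(1)).solution   # (2, 1) -> scale, shift
        aligned[i] = sol[0, 0] * pred[i] + sol[1, 0]
    return aligned

def distillation_loss(student_depth, teacher_depth, mask):
    # Scale/shift-invariant L1 between the student output and the
    # DepthAnythingV2 pseudo-labels, averaged over valid pixels.
    aligned = scale_shift_align(student_depth, teacher_depth, mask)
    return (aligned - teacher_depth).abs()[mask].mean()
```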
Last week, I created a first version of a diverse real-world dataset; the target is to have over 250K diverse images. I have currently processed 125K images to run a proof-of-concept test and refactored the code to be able to train relative depth.
The Scene3D network is showing very robust generalization capabilities after only 2 epochs of training on the half-size real-world dataset, as shown in the test examples below, validating this approach:
Preparation of the full diverse dataset should be completed this week; training the network on the full dataset is also expected to begin this week, with completion expected by the end of next week.
It is planned for the diverse dataset to include data from:
Ego Path Network Design
@m-zain-khawaja has completed the Path Context block and EgoPath Head block to predict a Bezier curve for the EgoPath. The proposed loss function is as follows:
Loss = alpha * mAE_endpoints + beta * mAE_derivatives
mAE_endpoints = abs(P1 - P1*) + abs(P4 - P4*), where P1, P4 are predictions of the x,y coordinates of the start and end control points of the cubic Bezier curve defining the Ego Path, and P1* and P4* are ground truth values
mAE_derivatives = (1/25) * SUM [t=0 -> t=25] abs( Pd|t - Pd*|t ), where Pd|t is the derivative at value t along the curve as defined by the predicted control points, and Pd*|t is the derivative at value t for the ground truth curve as defined by the ground truth control points
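As a quick sanity check on the formulation, a hedged PyTorch sketch of this loss, assuming the network outputs the four control points as (x, y) pairs and that the derivative term is sampled at evenly spaced t values (the helper names and the choice of 25 samples are illustrative):

```python
import torch

def bezier_derivative(ctrl, t):
    # Derivative of a cubic Bezier at parameter values t.
    # ctrl: (B, 4, 2) control points P1..P4 as (x, y); t: (T,) values in [0, 1].
    P1, P2, P3, P4 = ctrl[:, 0], ctrl[:, 1], ctrl[:, 2], ctrl[:, 3]   # each (B, 2)
    t = t.view(1, -1, 1)                                              # (1, T, 1)
    return (3 * (1 - t) ** 2 * (P2 - P1).unsqueeze(1)
            + 6 * (1 - t) * t * (P3 - P2).unsqueeze(1)
            + 3 * t ** 2 * (P4 - P3).unsqueeze(1))                    # (B, T, 2)

def ego_path_loss(pred_ctrl, gt_ctrl, alpha=1.0, beta=1.0, n_samples=25):
    # Loss = alpha * mAE_endpoints + beta * mAE_derivatives
    # Endpoint term: absolute error on the start (P1) and end (P4) control points.
    mae_endpoints = ((pred_ctrl[:, 0] - gt_ctrl[:, 0]).abs()
                     + (pred_ctrl[:, 3] - gt_ctrl[:, 3]).abs()).mean()
    # Derivative term: mean absolute error of the curve tangents at sampled t.
    t = torch.linspace(0.0, 1.0, n_samples, device=pred_ctrl.device)
    mae_derivatives = (bezier_derivative(pred_ctrl, t)
                       - bezier_derivative(gt_ctrl, t)).abs().mean()
    return alpha * mae_endpoints + beta * mae_derivatives
```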
EgoPath Training Tasks
@TranHuuNhatHuy to update augmentations.py to support the keypoints required for EgoPath, and to write the load_ego_path.py data loader class to load data for training the network
Mahmoud Dahmani to implement ego_path_trainer.py class responsible for running the model, calculating the loss, updating gradients and logging results/visualizations
@m-zain-khawaja to write the train_ego_path.py script responsible for the main training loop (a rough sketch of how these pieces could fit together is shown after this list)
@TranHuuNhatHuy, @sarun-hub, @devang-marvania, Mahmoud Dahmani, Augustine Osarogiagbon and @m-zain-khawaja will run various versions of the EgoPath network, experimenting with the loss, learning rate, batch size and data mixing
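For orientation only, a minimal sketch of how these responsibilities could be split between ego_path_trainer.py and train_ego_path.py, assuming a PyTorch setup; the class and function signatures below are assumptions, not the repository's actual interfaces:

```python
import torch

class EgoPathTrainer:
    # Illustrative stand-in for ego_path_trainer.py: runs the model, computes the
    # loss, updates gradients, and would also log results/visualizations.
    def __init__(self, model, loss_fn, lr=1e-4):
        self.model = model
        self.loss_fn = loss_fn          # e.g. the ego_path_loss sketched earlier
        self.optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    def train_step(self, images, gt_ctrl_points):
        self.optimizer.zero_grad()
        pred_ctrl_points = self.model(images)
        loss = self.loss_fn(pred_ctrl_points, gt_ctrl_points)
        loss.backward()
        self.optimizer.step()
        return loss.item()

def train_ego_path(model, loss_fn, loader, epochs=10):
    # Role of train_ego_path.py: the main training loop over the data loader.
    trainer = EgoPathTrainer(model, loss_fn)
    for epoch in range(epochs):
        for step, (images, gt_ctrl_points) in enumerate(loader):
            loss = trainer.train_step(images, gt_ctrl_points)
            if step % 100 == 0:
                print(f"epoch {epoch} step {step}: loss {loss:.4f}")
```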
EgoPath Dataset Curation Update
@TranHuuNhatHuy - had a meeting with @m-zain-khawaja to review; the parsed dataset is good, auditing is in progress, and the upload will be completed soon.
@docjag - small request: merge the latest upstream commits and re-submit the PR for the ROADWorks dataset, since the folder structure has changed
Dataset curation tracking
EgoLanes Dataset Curation Update
@aadarshkt to share update on BDD100K dataset parsing
@devang-marvania to take on CurveLane dataset parsing for EgoLanes network
Attendees
Zoom Meeting Video Recording
Video Meeting Link
Please contact the work group lead (@m-zain-khawaja) to request access to a recording of this meeting.