Dataset pre-processing: Processes MCAP bag files to generate data compatible with the SegmentsAI annotation tool.
To upload and add sample data to SegmentsAI, you will need access key tokens.
Create a file named `dataset_keys.env` inside a `keys` directory in the parent directory of this repository:

```sh
mkdir -p keys && touch keys/dataset_keys.env
```
Add the following environment variables to `dataset_keys.env`:
```sh
# EIDF AWS S3
AWS_ACCESS_KEY_ID=my_access_key_id
AWS_SECRET_ACCESS_KEY=my_secret_access_key
AWS_ENDPOINT_URL=my_s3_organisation_url
BUCKET_NAME=my_bucket_name
EIDF_PROJECT_NAME=my_projectxyz
# SegmentsAI key
SEGMENTS_API_KEY=my_segment_ai_api_key
```
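As a quick sanity check before running anything else, the variables can be loaded and inspected from Python. This is a minimal sketch, not part of the repository's own tooling, and it assumes the file lives at `keys/dataset_keys.env` as created above:

```python
from pathlib import Path


def load_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


# Report any keys that are still missing (guarded so this is a no-op
# when the file has not been created yet).
env_path = Path("keys/dataset_keys.env")
if env_path.exists():
    keys = load_env_file(env_path)
    missing = {"AWS_ACCESS_KEY_ID", "SEGMENTS_API_KEY"} - keys.keys()
    print("missing:", missing or "none")
```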
- File and path names are case-sensitive.
- The `dev.sh` script will attempt to locate the `dataset_keys.env` file. If the file is missing or incorrectly named, the script will throw an error.
For access credentials, please contact Hector Cruz or Alejandro Bordallo.
To build and run the Docker container interactively, use:

```sh
./dev.sh -l -p /PATH/TO/ROSBAGS
```
- Edit `config/av.yaml` to define `bag_path` and `output_dir`.
- For further reference, check `exporter_config.yaml`.
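The authoritative schema is the one in `exporter_config.yaml`; purely as an illustration, the two keys mentioned above might look like the fragment below (paths are placeholders, and any other keys required by `exporter_config.yaml` would sit alongside them):

```yaml
# Hypothetical fragment of config/av.yaml -- consult exporter_config.yaml
# for the authoritative schema.
bag_path: /PATH/TO/ROSBAGS/my_rosbag.mcap   # input MCAP bag
output_dir: /PATH/TO/OUTPUT                 # where extracted data is written
```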
- Run the following command to extract data:

```sh
ros2 run ros2_bag_exporter bag_exporter --ros-args -p config_file:="config/av.yaml"
```
The extractor will create a directory named after the provided rosbag inside the `output_dir` directory. This directory will contain:
- Extracted point clouds (`.pcd`)
- Images (`.jpg`)
- `export_metadata.yaml`
We will refer to this directory as `<rosbag_output_dir>`.
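Assuming the layout described above, a quick completeness check can be scripted. This helper is illustrative only and not part of the repository:

```python
from pathlib import Path


def check_extraction(rosbag_output_dir: str) -> list[str]:
    """Return a list of problems found in an extracted rosbag directory."""
    root = Path(rosbag_output_dir)
    problems = []
    if not any(root.rglob("*.pcd")):
        problems.append("no .pcd point clouds found")
    if not any(root.rglob("*.jpg")):
        problems.append("no .jpg images found")
    if not (root / "export_metadata.yaml").is_file():
        problems.append("export_metadata.yaml is missing")
    return problems
```

An empty result means the directory matches the expected layout.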
To extract the ego trajectory:

```sh
python3 -m scripts.generate_ego_trajectory <my_path_to_rosbag.mcap> <rosbag_output_dir>
```

(Note that `python3 -m` takes a module path without the `.py` extension.)
A `.tum` file with the same name as your rosbag should appear in `<rosbag_output_dir>`.
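The TUM trajectory format stores one pose per line as `timestamp tx ty tz qx qy qz qw` (translation plus orientation quaternion). A minimal reader, illustrative and not part of the repository's scripts, could be:

```python
def read_tum(path):
    """Parse a TUM trajectory file into (timestamp, translation, quaternion) tuples."""
    poses = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comment headers.
            if not line or line.startswith("#"):
                continue
            t, tx, ty, tz, qx, qy, qz, qw = (float(v) for v in line.split())
            poses.append((t, (tx, ty, tz), (qx, qy, qz, qw)))
    return poses
```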
To upload the extracted data to either EIDF or SegmentsAI AWS S3, run:

```sh
python3 -m scripts.upload <rosbag_output_dir> eidf
# Or
python3 -m scripts.upload <rosbag_output_dir> segments
```
If no S3 organisation is specified, `eidf` is used by default.
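An optional positional argument with a default like this is a standard `argparse` pattern. The sketch below shows how such a CLI might be wired; the actual `scripts.upload` implementation may differ:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """CLI sketch: required output dir, optional organisation defaulting to eidf."""
    parser = argparse.ArgumentParser(description="Upload extracted rosbag data to S3.")
    parser.add_argument("rosbag_output_dir", help="directory produced by the extractor")
    parser.add_argument(
        "org",
        nargs="?",
        default="eidf",
        choices=["eidf", "segments"],
        help="target S3 organisation (default: eidf)",
    )
    return parser
```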
After the upload, you should see an `upload_metadata.json` file inside `<rosbag_output_dir>`.
Create a dataset if you haven't already and note its name.
Run the script:

```sh
python3 -m scripts.add_3d_samples <my_dataset_name> <sequence_name> <rosbag_output_dir>
```
Where:

- `<my_dataset_name>`: the SegmentsAI dataset name
- `<sequence_name>`: desired sequence name for the 3D sample
  - Ensure the sequence name is unique within your dataset; otherwise, the sample will not be uploaded
- `<rosbag_output_dir>`: directory with the extracted rosbags and metadata files
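For orientation, uploading a 3D sample through the SegmentsAI Python SDK follows the general shape below. The attribute payload is an assumption based on the SDK's point-cloud sample format, and `build_pointcloud_attributes` is a hypothetical helper, not part of `scripts.add_3d_samples`:

```python
def build_pointcloud_attributes(pcd_url: str) -> dict:
    """Assumed attribute payload for a single 3D point-cloud sample (hypothetical)."""
    return {"pcd": {"url": pcd_url, "type": "pcd"}}


# Hedged sketch of the SDK call (requires the `segments-ai` package and a
# valid SEGMENTS_API_KEY; the repository's script may structure this differently):
#
#   from segments import SegmentsClient
#   client = SegmentsClient(api_key=SEGMENTS_API_KEY)
#   client.add_sample("<my_dataset_name>", "<sequence_name>",
#                     attributes=build_pointcloud_attributes(pcd_url))
```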
If successful, you will see your new segment inside your dataset.