
What is the conversion relationship between the annotations in BasicAI LiDAR Fusion format and Kitti 3D Object Detection format? #308

Open
gravity-123 opened this issue Dec 25, 2024 · 3 comments

@gravity-123

I want to make a KITTI-like dataset without images, so I studied the conversion of annotations from the BasicAI LiDAR Fusion format to the KITTI 3D Object Detection format.
I used the "LiDAR without any configs" dataset provided at https://docs.basic.ai/docs/upload-data and annotated it. An annotation example is shown in the screenshot below.
[Screenshot 2024-12-25 21-21-23: annotation example]
When I exported, I saved both the BasicAI LiDAR Fusion format and the KITTI 3D Object Detection format. I compared the 3D bounding box annotations in the JSON file and in the txt file. As shown in the screenshot, the box dimensions are the same, only in a different order. But how is the center position obtained?
[Screenshot 2024-12-25 21-22-56: JSON vs. txt box annotations]
Is it obtained as txt_position = camera_external * json_position? The result is wrong when I calculate it this way. I read the description of the camera extrinsic parameters at https://docs.basic.ai/docs/camera-intrinsic-extrinsic-and-distortion-in-camera-calibration#practice-in-basicai-, shown in the screenshot below; the ordering of the extrinsic parameters given there also seems to be different.
[Screenshot 2024-12-25 21-25-32: extrinsic parameter description from the docs]
The screenshot below shows the parameters corresponding to the data above. This is the second extrinsic parameter in LiDAR_Fusion_without_any_configs/camera_config/08.json.
[Screenshot 2024-12-25 21-31-38: extrinsic from camera_config/08.json]
So how should the position conversion from BasicAI LiDAR Fusion format to KITTI 3D Object Detection format be calculated? Is my formula wrong, or is there a problem with the camera extrinsic parameters?
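In code, the calculation I tried looks like the following minimal sketch (the values here are placeholders; ext_matrix would come from camera_config and json_position from center3D in the result JSON):

import numpy as np

# Placeholder extrinsic and center; substitute the real values from the export.
ext_matrix = np.eye(4)
json_position = np.array([0.0, 0.0, 0.0])
# Apply the 4x4 extrinsic to the box center in homogeneous coordinates.
txt_position = (ext_matrix @ np.append(json_position, 1.0))[:3]
print(txt_position)  # does not match the txt file for my data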

@ShirleySprite

ShirleySprite commented Jan 17, 2025

Hello,

The KITTI format requires camera parameters because KITTI coordinates consist of 2D boxes on the image and 3D boxes in the camera coordinate system. Please note that it is the camera coordinate system, not the point cloud coordinate system. Moreover, the representation of these 3D coordinates is specific to the KITTI format. The conversion code is as follows:

import math

import numpy as np

def alpha_in_pi(a):
    # Wrap an angle into the interval [-pi, pi).
    pi = math.pi
    return a - math.floor((a + pi) / (2 * pi)) * 2 * pi

def gen_alpha(rz, ext_matrix, lidar_center):
    # rz: box yaw in the LiDAR frame; ext_matrix: 4x4 LiDAR-to-camera
    # extrinsic; lidar_center: box center (x, y, z) in the LiDAR frame.
    lidar_center = np.hstack((lidar_center, np.array([1])))
    # Transform the heading direction into the camera frame to get rotation_y.
    cam_point = ext_matrix @ np.array([np.cos(rz), np.sin(rz), 0, 1])
    cam_point_0 = ext_matrix @ np.array([0, 0, 0, 1])
    ry = -1 * alpha_in_pi(np.arctan2(cam_point[2] - cam_point_0[2], cam_point[0] - cam_point_0[0]))
    # Observation angle alpha: rotation_y minus the viewing angle of the center.
    cam_center = ext_matrix @ lidar_center
    theta = alpha_in_pi(np.arctan2(cam_center[0], cam_center[2]))
    alpha = ry - theta
    return ry, alpha

# Assuming obj_3d, rect, ext_matrix, label, truncated, occluded, x_list and
# y_list (the projected 2D box coordinates) are defined elsewhere.
contour_3d = obj_3d[rect['trackId']]['contour']
length, width, height = contour_3d['size3D'].values()
cur_rz = contour_3d['rotation3D']['z']

ry, alpha = gen_alpha(cur_rz, ext_matrix, np.array(list(contour_3d['center3D'].values())))

# KITTI stores the location of the bottom-face center in camera coordinates,
# so shift the center down by height / 2 in the LiDAR frame before applying
# the extrinsic.
point = list(contour_3d['center3D'].values())
temp = np.hstack((np.array([point[0], point[1], point[2] - height / 2]), np.array([1])))
x, y, z = list(ext_matrix @ temp)[:3]
score = 1
string = f"{label} {truncated:.2f} {occluded} {alpha:.2f} " \
         f"{min(x_list):.2f} {min(y_list):.2f} {max(x_list):.2f} {max(y_list):.2f} " \
         f"{height:.2f} {width:.2f} {length:.2f} " \
         f"{x:.2f} {y:.2f} {z:.2f} {ry:.2f} {score}\n"
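For context, here is a minimal sketch of how ext_matrix might be loaded from an export like yours; the file path and the 'camera_external' key are assumptions based on the dataset layout you linked, so adjust them to your data:

import json

import numpy as np

# Hypothetical path; the second entry is indexed because you referenced the
# second extrinsic in camera_config/08.json.
with open('LiDAR_Fusion_without_any_configs/camera_config/08.json') as f:
    cam_cfgs = json.load(f)

# Assumption: each entry stores its extrinsic as a flat 16-element
# 'camera_external' list, reshaped row-major into a 4x4 matrix.
ext_matrix = np.array(cam_cfgs[1]['camera_external']).reshape(4, 4)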

@gravity-123 (Author)

I don't know enough about how to pass the parameters from the JSON file into your program, so I assigned them directly to the relevant variables, but the final result differs from the actual KITTI annotation. I don't know whether the problem is in my assignments or in your program. Could you verify that this conversion idea is correct in a more concrete way, by calculating the specific values? Below is my modified code.

import numpy as np
import math

def alpha_in_pi(a):
    pi = math.pi
    return a - math.floor((a + pi) / (2 * pi)) * 2 * pi

def gen_alpha(rz, ext_matrix, lidar_center):
    lidar_center = np.hstack((lidar_center, np.array([1])))
    cam_point = ext_matrix @ np.array([np.cos(rz), np.sin(rz), 0, 1])
    cam_point_0 = ext_matrix @ np.array([0, 0, 0, 1])
    ry = -1 * (alpha_in_pi(np.arctan2(cam_point[2] - cam_point_0[2], cam_point[0] - cam_point_0[0])))
    cam_center = ext_matrix @ lidar_center.T
    theta = alpha_in_pi(np.arctan2(cam_center[0], cam_center[2]))
    alpha = ry - theta
    return ry, alpha

ext_matrix = np.array([[ 0.70944543, -0.70469849, -0.00933913, -0.06090484],
                       [ 0.00722819,  0.0205264,  -0.99976318,  1.74850379],
                       [ 0.7047233,   0.70920992,  0.01965606, -7.00787134],
                       [ 0.,          0.,          0.,          1.        ]])
# Assuming obj_3d, rect, contour_3d are defined elsewhere with appropriate data.
# contour_3d = obj_3d[rect['trackId']]['contour']
length, width, height = 4.2972, 1.8512, 1.4052
center3D = [13.7449, 18.8812, 0.5131]
cur_rz = -2.3562

ry, alpha = gen_alpha(cur_rz, ext_matrix, center3D)

point = center3D
temp = np.hstack((np.array([point[0], point[1], point[2] - height / 2]), np.array([1])))
x, y, z = list((ext_matrix @ temp))[:3]
score = 1
# string = f"{label} {truncated:.2f} {occluded} {alpha:.2f} " \
#          f"{min(x_list):.2f} {min(y_list):.2f} {max(x_list):.2f} {max(y_list):.2f} " \
#          f"{height:.2f} {width:.2f} {length:.2f} " \
#          f"{x:.2f} {y:.2f} {z:.2f} {ry:.2f} {score}\n"
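To compare against the exported KITTI annotation, I print the computed 3D fields in the same order they appear in the txt line:

# Print the computed 3D fields in KITTI txt order (h w l x y z ry, plus alpha)
# for a direct comparison with the exported annotation.
print(f"{height:.2f} {width:.2f} {length:.2f} "
      f"{x:.2f} {y:.2f} {z:.2f} {ry:.2f} (alpha={alpha:.2f})")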

The xtreme1 annotation, camera parameters, and KITTI annotation generated by the software for this object are shown in the images below. I am sure that these two annotations correspond to each other, because their length, width, and height are identical.

[Image: xtreme1 annotation]

[Image: camera parameters]

[Image: KITTI annotation]
In addition, I wrote a script, xtreme1_to_kitti.py, to generate annotations and have run it in some point cloud detection tasks. It seems to work, but its conversion method differs from the code you gave, so I can't completely confirm that my approach is correct.
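One way to sanity-check either conversion independently is a round-trip: map the KITTI location back into the LiDAR frame and compare it with center3D from the JSON. A sketch reusing x, y, z, height, and ext_matrix from the code above, and assuming the KITTI location is the bottom-face center in camera coordinates as in your snippet:

import numpy as np

# Round-trip check: invert the extrinsic to map the camera-frame location
# back to the LiDAR frame, then undo the bottom-face offset.
cam_bottom = np.array([x, y, z, 1.0])
lidar_bottom = np.linalg.inv(ext_matrix) @ cam_bottom
lidar_center = lidar_bottom[:3] + np.array([0.0, 0.0, height / 2])
print(lidar_center)  # should match center3D from the BasicAI result JSON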

@ShirleySprite

I am really sorry, but there is a small caveat you should understand. Neither the person who wrote this conversion script nor I, who took it over later, actually use the converted KITTI format ourselves; we do not train models with it. The script is intended for others to use, so after it was written we made changes whenever users reported issues, and assumed it was correct when they did not. If you ask me whether this conversion script has issues, I really cannot give you a definite answer. I can only say that if it does not meet your expectations, then it has issues for you. Please feel free to use whichever conversion method meets your expectations. This script is just a reference, and its details can be fully adjusted to your needs.
