What is the conversion relationship between the annotations in BasicAI LiDAR Fusion format and Kitti 3D Object Detection format? #308
Comments
Hello. The KITTI format requires camera parameters because KITTI coordinates consist of 2D boxes on the image and 3D boxes in the camera coordinate system. Please note that it is the camera coordinate system, not the point cloud coordinate system. Moreover, the representation of these 3D coordinates is specific to the KITTI format. The specific conversion code is as follows:

import math
import numpy as np
def alpha_in_pi(a):
    # Wrap an angle to the interval [-pi, pi).
    pi = math.pi
    return a - math.floor((a + pi) / (2 * pi)) * 2 * pi

def gen_alpha(rz, ext_matrix, lidar_center):
    # rz: box yaw around the lidar z-axis.
    # ext_matrix: 4x4 lidar-to-camera extrinsic matrix.
    # lidar_center: (x, y, z) box center in lidar coordinates.
    lidar_center = np.hstack((lidar_center, np.array([1])))
    # Rotate the box's heading direction into the camera frame to get rotation_y.
    cam_point = ext_matrix @ np.array([np.cos(rz), np.sin(rz), 0, 1])
    cam_point_0 = ext_matrix @ np.array([0, 0, 0, 1])
    ry = -1 * alpha_in_pi(np.arctan2(cam_point[2] - cam_point_0[2], cam_point[0] - cam_point_0[0]))
    # alpha is rotation_y minus the viewing angle of the box center in the camera frame.
    cam_center = ext_matrix @ lidar_center.T
    theta = alpha_in_pi(np.arctan2(cam_center[0], cam_center[2]))
    alpha = ry - theta
    return ry, alpha

# Assuming obj_3d, rect, ext_matrix, label, truncated, occluded, x_list and
# y_list (the projected 2D box bounds) are defined elsewhere with appropriate data.
contour_3d = obj_3d[rect['trackId']]['contour']
length, width, height = contour_3d['size3D'].values()
cur_rz = contour_3d['rotation3D']['z']
ry, alpha = gen_alpha(cur_rz, ext_matrix, np.array(list(contour_3d['center3D'].values())))
point = list(contour_3d['center3D'].values())
# KITTI locates a box by the center of its bottom face, so shift the center down
# by height / 2 before transforming into the camera frame.
temp = np.hstack((np.array([point[0], point[1], point[2] - height / 2]), np.array([1])))
x, y, z = list(ext_matrix @ temp)[:3]
score = 1
string = f"{label} {truncated:.2f} {occluded} {alpha:.2f} " \
         f"{min(x_list):.2f} {min(y_list):.2f} {max(x_list):.2f} {max(y_list):.2f} " \
         f"{height:.2f} {width:.2f} {length:.2f} " \
         f"{x:.2f} {y:.2f} {z:.2f} {ry:.2f} {score}\n"
I am really sorry, but there is a small issue here that you must understand. Neither the person who wrote this conversion script nor I, who took over later, actually use the converted KITTI format ourselves; we do not train models with it. That means this script is intended for others to use. So, after we finished writing the conversion script, we made changes whenever users reported issues; if they did not raise any problems, we assumed it was correct. If you ask me whether there are any issues with this conversion script, I really cannot give you a definite answer. I can only say that if it does not meet your expectations, then it has issues for you. Please feel free to use the conversion method that meets your expectations. This script is just a reference, and its details can be fully adjusted to your needs.
I want to make a KITTI-like dataset without images, so I studied the conversion of annotations from BasicAI LiDAR Fusion format to Kitti 3D Object Detection format.
I used the "LiDAR without any configs" dataset provided at https://docs.basic.ai/docs/upload-data and annotated it. An annotation example is shown in the figure below.
When I exported, I saved both the BasicAI LiDAR Fusion format and the Kitti 3D Object Detection format, then compared the 3D bounding box annotations in the JSON file and the TXT file. As the figure shows, the box sizes are the same apart from their order. How is the center position obtained?
Is it obtained as txt_position = camera_external * json_position? The result is wrong when calculated this way. I read the description of camera extrinsic parameters at https://docs.basic.ai/docs/camera-intrinsic-extrinsic-and-distortion-in-camera-calibration#practice-in-basicai-, shown in the figure, and the ordering of the extrinsic parameters given there seems to be different.
The figure below shows the parameters corresponding to the above set of data; this is the second extrinsic parameter in LiDAR_Fusion_without_any_configs/camera_config/08.json.
So how should the position conversion from BasicAI LiDAR Fusion format to Kitti 3D Object Detection format be calculated? Is my formula wrong, or is there a problem with the camera extrinsic parameters? A minimal sketch of my calculation is below.
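For concreteness, this sketch uses placeholder numbers rather than my real export values; since the ordering on the docs page looks different, it prints both row-major and column-major readings of the 16 extrinsic values:

import numpy as np

# Placeholder values for illustration only; substitute the 16 numbers from
# LiDAR_Fusion_without_any_configs/camera_config/08.json and a center3D
# from the exported BasicAI json.
ext_values = np.arange(16, dtype=float)
json_position = np.array([10.0, 2.0, -1.0, 1.0])  # center3D in homogeneous coords

# My formula: txt_position = camera_external @ json_position, with both
# interpretations of the value order.
ext_row_major = ext_values.reshape(4, 4)
ext_col_major = ext_row_major.T
print((ext_row_major @ json_position)[:3])
print((ext_col_major @ json_position)[:3])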