I filter the ground truth for vision-based 3D object detection using the criterion below:
if (annotation['num_lidar_pts'] < 5 and annotation['num_radar_pts'] < 5 and annotation['visibility'] <= 1) or not box_in_image(annotation): exclude the annotation
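The criterion above can be sketched as a small predicate. This is a minimal sketch, assuming `annotation` is a dict with the listed keys, `visibility` has already been converted to an integer level (nuScenes stores it as a token), and `box_in_image` is the user's own helper:

```python
def keep_annotation(annotation, box_in_image):
    """Return True if the annotation should stay in the filtered ground truth.

    Drops boxes that are sparsely observed (few lidar/radar points AND low
    visibility) or that do not project into any camera image.
    """
    too_few_points = (annotation['num_lidar_pts'] < 5
                      and annotation['num_radar_pts'] < 5
                      and annotation['visibility'] <= 1)
    return (not too_few_points) and box_in_image(annotation)
```

For example, a box with 10 lidar points and visibility level 3 is kept as long as `box_in_image` returns True.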
Then I evaluate the filtered ground truth with NuScenesEval from nuscenes-devkit, and the AP for cars is only 72%. Moreover, when I inspect the examples rendered by NuScenesEval, I find annotations that appear in my filtered ground truth but not in the evaluation ground truth:
Yes, we filter by distance, points, and bike racks:
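A rough sketch of that eval-side filtering, to illustrate why boxes can disappear from the evaluation ground truth even when they pass the user's own filter. The function name, dict layout, and range threshold here are illustrative assumptions, not the devkit's actual code; the bike-rack geometry test is omitted:

```python
import math

def filter_eval_boxes_sketch(boxes, max_range_by_class, bike_rack_polygons=None):
    """Keep only boxes that pass the three eval-time filters described above."""
    kept = []
    for box in boxes:
        # 1. Point filter: drop boxes with no lidar and no radar points at all.
        if box['num_lidar_pts'] + box['num_radar_pts'] == 0:
            continue
        # 2. Distance filter: class-specific maximum range from the ego vehicle.
        dist = math.hypot(box['ego_translation'][0], box['ego_translation'][1])
        if dist > max_range_by_class[box['detection_name']]:
            continue
        # 3. Bike-rack filter: drop bikes inside bike-rack areas
        #    (polygon containment test omitted in this sketch).
        kept.append(box)
    return kept
```

Note that these thresholds differ from the user's criterion (e.g. any box with zero points, or beyond the class range, is dropped), which explains the mismatch between the two ground-truth sets.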
That's the same regardless of whether you use a lidar-based or a camera-based method.
Also note that the visualization function takes GT and predictions as-is: