0 lidar+radar points

I was looking at the dataset and found that there are several annotations for objects with 0 lidar + radar points. How do you get the depth estimate for such objects?

Also, in the evaluation, these objects are filtered in the devkit. What’s the reason for that?


We remove ground-truth (GT) boxes without any lidar or radar points in them, as we cannot guarantee that they are actually visible in the frame (please see https://github.com/nutonomy/nuscenes-devkit/tree/master/python-sdk/nuscenes/eval/detection#preprocessing for more details).
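In case it helps, a minimal sketch of that preprocessing rule. It assumes the `num_lidar_pts` and `num_radar_pts` fields from the nuScenes `sample_annotation` records; `filter_eval_boxes` here is a simplified illustrative helper, not the devkit's actual function.

```python
def filter_eval_boxes(annotations):
    """Drop GT boxes with no lidar and no radar points inside them.

    `annotations` is a list of dicts with the nuScenes fields
    `num_lidar_pts` and `num_radar_pts`. A box is kept only if at
    least one sensor registered a point inside it.
    """
    return [
        ann for ann in annotations
        if ann["num_lidar_pts"] + ann["num_radar_pts"] > 0
    ]
```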

I see. Is it possible to share how exactly you obtain the GT annotations for such boxes? I’m guessing it using the previous frames…

Hi. The tooling we use can show all cameras, lidars and radars for the 20 s scene. Objects are annotated with interpolation: if we see an object at 1 s and at 3 s, the frames in between are linearly interpolated, even if the object is occluded. For evaluation purposes we later remove the boxes without lidar/radar points, as there was no chance for our detector to detect these. Of course there can be all kinds of exceptions; for example, you can often see an object through the window of a car but get no lidar returns from it.
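To make the interpolation concrete, here is a hypothetical helper for the box center only. The actual annotation tooling is not public; real boxes would also interpolate size and orientation (orientation typically via quaternion slerp rather than linear interpolation).

```python
def interpolate_center(t, t0, c0, t1, c1):
    """Linearly interpolate a box center between two annotated keyframes.

    t0, t1: timestamps (seconds) of the two keyframe annotations.
    c0, c1: [x, y, z] box centers at those keyframes.
    t:      query timestamp with t0 <= t <= t1.
    """
    alpha = (t - t0) / (t1 - t0)  # 0 at t0, 1 at t1
    return [(1 - alpha) * a + alpha * b for a, b in zip(c0, c1)]
```

So an object seen at 1 s and 3 s gets a box at 2 s exactly halfway between the two annotated positions, whether or not it was visible there.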


Thanks a lot for the info. Although this filtering seems fair, I think it might bias the detection metric in favour of lidar over camera.

Yes, that is a fair point. But I would say it was never a “fair” comparison to begin with, since vision-based methods primarily have uncertainty along the depth dimension, which makes it very hard to achieve a good mAP at large distances.

I think at a large distance lidar doesn’t even get enough points, so it’s even harder for it to detect objects. RGB should do better there, although the depth estimate would not be accurate. Defining a fair metric across modalities is quite complex, especially when evaluating at large distances.
