Filter gt for vision detection

xucj98 · October 21, 2019, 8:22am

I am filtering the ground truth for vision-based 3D object detection, using the criterion below:
if (annotation['num_lidar_pts'] < 5 and annotation['num_radar_pts'] < 5 and annotation['visibility'] <= 1) or (not box_in_image(annotation)) then exclude_the_annotation
When I then evaluate the filtered ground truth with NuScenesEval from nuscenes-devkit, the AP for cars is only 72%. In addition, when I look at the examples rendered by NuScenesEval, I find that some annotations appear in my filtered GT but not in the eval GT.

Do you filter the ground truth during evaluation? Do you have a recommended criterion for filtering the ground truth for vision-based detection?
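
For readers who want to reproduce the exclusion criterion described above, here is a minimal Python sketch. It assumes each annotation is a plain dict with integer fields `num_lidar_pts`, `num_radar_pts` and `visibility` (the field names used in the post), and that `box_in_image` is a user-supplied callable; neither the helper nor the function names below come from the devkit.

```python
from typing import Any, Callable, List, Mapping

Annotation = Mapping[str, Any]

MIN_POINTS = 5          # point-count threshold taken from the post
MAX_LOW_VISIBILITY = 1  # visibility level treated as "barely visible" in the post


def should_exclude(annotation: Annotation,
                   box_in_image: Callable[[Annotation], bool]) -> bool:
    """Return True if the annotation should be dropped under the post's rule."""
    few_points = (annotation["num_lidar_pts"] < MIN_POINTS
                  and annotation["num_radar_pts"] < MIN_POINTS)
    barely_visible = annotation["visibility"] <= MAX_LOW_VISIBILITY
    # Drop boxes that are both sparse and barely visible, or that fall outside the image.
    return (few_points and barely_visible) or not box_in_image(annotation)


def filter_annotations(annotations: List[Annotation],
                       box_in_image: Callable[[Annotation], bool]) -> List[Annotation]:
    """Keep only the annotations that the rule does not exclude."""
    return [a for a in annotations if not should_exclude(a, box_in_image)]
```

Passing `box_in_image` as an argument keeps the sketch independent of any particular camera projection code; how that check is implemented is not specified in the post.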

holger-motional · October 21, 2019, 8:13pm

Yes, we filter by distance, points and bike racks:

github.com

nutonomy/nuscenes-devkit/blob/master/python-sdk/nuscenes/eval/detection/loaders.py#L164


Applies filtering to boxes. Distance, bike-racks and points per box.
:param nusc: An instance of the NuScenes class.
:param eval_boxes: An instance of the EvalBoxes class.
:param max_dist: Maps the detection name to the eval distance threshold for that class.
:param verbose: Whether to print to stdout.
"""
# Accumulators for number of filtered boxes.
total, dist_filter, point_filter, bike_rack_filter = 0, 0, 0, 0
for ind, sample_token in enumerate(eval_boxes.sample_tokens):


    # Filter on distance first
    total += len(eval_boxes[sample_token])
    eval_boxes.boxes[sample_token] = [box for box in eval_boxes[sample_token] if
                                      box.ego_dist < max_dist[box.detection_name]]
    dist_filter += len(eval_boxes[sample_token])


    # Then remove boxes with zero points in them. Eval boxes have -1 points by default.
    eval_boxes.boxes[sample_token] = [box for box in eval_boxes[sample_token] if not box.num_pts == 0]
    point_filter += len(eval_boxes[sample_token])


    # Perform bike-rack filtering

This filtering is the same regardless of whether you use a lidar-based or camera-based method.
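
To make the difference from the criterion in the question concrete, here is a small standalone sketch of the first two filtering steps shown in the excerpt above (distance, then zero points). It is not the devkit code: the `Box` dataclass and `filter_boxes` function are hypothetical stand-ins that only borrow the attribute names `ego_dist`, `num_pts` and `detection_name` from the excerpt, the range values are illustrative, and bike-rack filtering is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Box:
    """Hypothetical stand-in for an eval box; not the devkit class."""
    detection_name: str
    ego_dist: float
    num_pts: int = -1  # the excerpt notes that eval boxes default to -1 points


def filter_boxes(boxes_by_sample: Dict[str, List[Box]],
                 max_dist: Dict[str, float]) -> Dict[str, List[Box]]:
    """Apply the distance filter, then drop boxes with exactly zero points."""
    filtered = {}
    for sample_token, boxes in boxes_by_sample.items():
        # Step 1: keep boxes closer than the per-class distance threshold.
        in_range = [b for b in boxes if b.ego_dist < max_dist[b.detection_name]]
        # Step 2: drop boxes with zero points; -1 (unknown) is kept.
        filtered[sample_token] = [b for b in in_range if b.num_pts != 0]
    return filtered


# Illustrative thresholds only; the real values come from the eval config.
example_max_dist = {"car": 50.0, "pedestrian": 40.0}
example_boxes = {
    "sample_a": [
        Box("car", ego_dist=10.0, num_pts=1),    # kept: in range, one point
        Box("car", ego_dist=60.0, num_pts=100),  # dropped: beyond class range
        Box("pedestrian", ego_dist=20.0, num_pts=0),  # dropped: zero points
    ]
}
example_result = filter_boxes(example_boxes, example_max_dist)
```

Under these two steps a box with a single point is kept, while a box beyond its class range is removed no matter how many points it contains. A rule based on a 5-point threshold, visibility and image bounds can therefore keep or drop different annotations than the evaluation does.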
Also note that the visualization function takes GT and predictions as-is:

github.com

nutonomy/nuscenes-devkit/blob/master/python-sdk/nuscenes/eval/detection/render.py#L19


from nuscenes import NuScenes
from nuscenes.eval.detection.constants import TP_METRICS, DETECTION_NAMES, DETECTION_COLORS, TP_METRICS_UNITS, \
    PRETTY_DETECTION_NAMES, PRETTY_TP_METRICS
from nuscenes.eval.detection.data_classes import EvalBoxes
from nuscenes.eval.detection.data_classes import MetricDataList, DetectionMetrics
from nuscenes.eval.detection.utils import boxes_to_sensor
from nuscenes.utils.data_classes import LidarPointCloud
from nuscenes.utils.geometry_utils import view_points




def visualize_sample(nusc: NuScenes,
                     sample_token: str,
                     gt_boxes: EvalBoxes,
                     pred_boxes: EvalBoxes,
                     nsweeps: int = 1,
                     conf_th: float = 0.15,
                     eval_range: float = 50,
                     verbose: bool = True,
                     savepath: str = None) -> None:
    """
    Visualizes a sample from BEV with annotations and detection results.