DUFOMap: Efficient Dynamic Awareness Mapping

*Co-first authors. 1: KTH Royal Institute of Technology, 2: HKUST

Abstract

The dynamic nature of the real world is one of the main challenges in robotics. The first step in dealing with it is to detect which parts of the world are dynamic. A typical benchmark task is to create a map that contains only the static part of the world to support, for example, localization and planning. Current solutions are often applied in post-processing, where parameter tuning allows the user to adjust the settings for a specific dataset. In this paper, we propose DUFOMap, a novel dynamic awareness mapping framework designed for efficient online processing. Despite using the same parameter settings for all scenarios, it performs better than or on par with state-of-the-art methods. Ray casting is utilized to identify and classify fully observed empty regions. Since these regions have been observed to be empty, it follows that anything inside them at another time must be dynamic. Evaluation is carried out in various scenarios, including outdoor environments in KITTI and Argoverse 2, open areas on the KTH campus, and with different sensor types. DUFOMap outperforms the state of the art in terms of accuracy and computational efficiency.
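The core idea in the abstract (voxels that a ray has passed through are known to be empty, so any point found in them at another time is dynamic) can be illustrated with a small sketch. This is not the DUFOMap implementation: the function names, the uniform voxel grid, and the sampled (rather than exact) ray traversal are all assumptions made for illustration, and the sketch ignores sensor noise and localization error.

```python
import math

def voxel_key(point, voxel_size):
    """Map a 3D point to the integer index of the voxel that contains it."""
    return tuple(math.floor(c / voxel_size) for c in point)

def voxels_along_ray(origin, hit, voxel_size):
    """Voxels crossed on the way from the sensor origin towards the hit.

    Simplification (assumption): the ray is sampled at half-voxel steps
    instead of using an exact voxel traversal. The end point is not sampled.
    """
    length = math.dist(origin, hit)
    steps = max(1, int(length / (0.5 * voxel_size)))
    crossed = set()
    for i in range(steps):
        t = i / steps
        sample = tuple(o + t * (h - o) for o, h in zip(origin, hit))
        crossed.add(voxel_key(sample, voxel_size))
    return crossed

def classify_dynamic(scans, voxel_size):
    """Label every point of every scan True (dynamic) or False (static).

    scans: list of (origin, points) pairs, all expressed in one map frame.
    """
    observed_empty = set()
    for origin, points in scans:
        # Voxels that contain a hit in this scan are never marked empty by it.
        hits = {voxel_key(p, voxel_size) for p in points}
        for p in points:
            observed_empty |= voxels_along_ray(origin, p, voxel_size) - hits
    return [[voxel_key(p, voxel_size) in observed_empty for p in points]
            for _, points in scans]

# A person stands 2 m in front of the sensor in the first scan and has left
# by the second scan, so the second ray reaches the wall 5 m away.
scans = [
    ((0.0, 0.0, 0.0), [(2.05, 0.05, 0.05)]),
    ((0.0, 0.0, 0.0), [(5.05, 0.05, 0.05)]),
]
labels = classify_dynamic(scans, voxel_size=0.1)
```

In this toy example the point from the first scan lies in a voxel that the second ray passed through, so it is labelled dynamic, while the wall point is labelled static.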

Section I: Qualitative Results

Section I-A: DUFOMap processing in Leica-RTC360 dataset

This map was created with a Leica 3D laser scanner, a device typically used for construction measurements but also sometimes used to generate ground truth maps for SLAM. The data was collected between classes, with many students walking around, so the resulting map contains many dynamic objects. Using DUFOMap, we can find and remove these.

Interactive demo on the KITTI 07 sequence (full map).

Left: A map built using ground truth labels (dynamic points marked in yellow).
Right: Static map after removing the points classified as dynamic by DUFOMap.

Section I-C: DUFOMap Ablation Study in RGB-D dataset



The influence of the sensor noise model is illustrated in another experiment. We run it in a smaller-scale indoor scenario using RGB-D data, to highlight that our method also works in this setting. In this experiment, we use a voxel size of 0.01 m. RGB-D sensors based on structured light and/or stereo are notoriously noisy at longer distances. Fig. 4(a) shows raw data from the TUM RGB-D SLAM dataset, featuring people moving around in an environment captured with a noisy RGB-D sensor (Kinect). The noise is especially noticeable in the heavy distortion of the walls, with errors above 0.5 m. Fig. 4(b) to Fig. 4(d) show the result of detecting dynamic points (yellow) with different values of the sensor noise parameter \(d_s\), keeping \(d_p=1\). As can be seen, accounting for large enough sensor noise (Fig. 4(c) and Fig. 4(d)) substantially reduces the number of false positive points. Too large a \(d_s\) makes the method more conservative, but as long as there is enough varied data, it can still work well, as demonstrated in Fig. 4(d).
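To make the role of the two parameters more concrete, here is a minimal sketch of one plausible way they could act on a voxel map. The text above only states that \(d_s\) models sensor noise; the readings used here are assumptions for illustration: `d_s` is taken as a distance by which each ray is cut short before the measured hit, and `d_p` as an integer voxel radius within which all neighbours must also have been observed empty. The function names are hypothetical and not taken from the DUFOMap code.

```python
import math
from itertools import product

def shorten_hit(origin, hit, d_s):
    """Return the point d_s before the hit along the ray.

    Assumption: sensor noise is handled by treating only the ray up to this
    point as free space. Rays shorter than d_s contribute no free space.
    """
    length = math.dist(origin, hit)
    if length <= d_s:
        return tuple(origin)
    scale = (length - d_s) / length
    return tuple(o + scale * (h - o) for o, h in zip(origin, hit))

def keep_confident_voxels(observed_empty, d_p):
    """Keep voxels whose full (2*d_p + 1)^3 neighbourhood is observed empty.

    Assumption: d_p is an integer number of voxels; d_p = 0 keeps everything.
    """
    offsets = list(product(range(-d_p, d_p + 1), repeat=3))
    return {
        v for v in observed_empty
        if all((v[0] + dx, v[1] + dy, v[2] + dz) in observed_empty
               for dx, dy, dz in offsets)
    }

# A 3x3x3 block of empty voxels: with d_p = 1 only the centre voxel survives.
block = set(product(range(3), repeat=3))
confident = keep_confident_voxels(block, d_p=1)
free_end = shorten_hit((0.0, 0.0, 0.0), (10.0, 0.0, 0.0), d_s=0.5)
```

Under these assumptions, a larger `d_s` shrinks the free portion of every ray, so fewer voxels are marked empty and fewer points are flagged as dynamic. That matches the more conservative behaviour described above.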

Section II: Quantitative Results

Section II-A: Quantitative results on all KITTI sequences

Table I presents the dynamic removal performance in all KITTI sequences. Our method achieves the highest performance in all but one case.

Table II shows the dynamic removal results on the dataset from the paper with different sensor setups. Our proposed method, DUFOMap, achieves high scores on both SA and DA by accurately detecting dynamic points. This enables the generation of maps that are both complete and clean for downstream tasks.
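SA and DA are not defined on this page. The sketch below assumes the common reading in dynamic-point removal benchmarks: SA (static accuracy) is the percentage of ground-truth static points that are kept, and DA (dynamic accuracy) is the percentage of ground-truth dynamic points that are removed. The function name and the boolean label format are assumptions for illustration.

```python
def static_dynamic_accuracy(gt_dynamic, pred_dynamic):
    """Return (SA, DA) in percent from per-point boolean labels.

    gt_dynamic[i]   -- True if point i is dynamic in the ground truth
    pred_dynamic[i] -- True if the method classified point i as dynamic
    """
    if len(gt_dynamic) != len(pred_dynamic):
        raise ValueError("label lists must have the same length")

    static_total = sum(1 for g in gt_dynamic if not g)
    dynamic_total = len(gt_dynamic) - static_total
    static_kept = sum(1 for g, p in zip(gt_dynamic, pred_dynamic)
                      if not g and not p)
    dynamic_removed = sum(1 for g, p in zip(gt_dynamic, pred_dynamic)
                          if g and p)

    # Undefined (NaN) when a class is absent from the ground truth.
    sa = 100.0 * static_kept / static_total if static_total else float("nan")
    da = 100.0 * dynamic_removed / dynamic_total if dynamic_total else float("nan")
    return sa, da

# Four static and two dynamic points; one static point is wrongly removed
# and one dynamic point is missed.
gt = [False, False, False, False, True, True]
pred = [False, False, False, True, True, False]
sa, da = static_dynamic_accuracy(gt, pred)
```

With these definitions, a high SA corresponds to a complete map (few static points lost) and a high DA to a clean map (few dynamic points left).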

Section II-B: Runtime comparison and detailed breakdown

Table III and Fig. 5 present the run time of the different methods on two of the datasets, one with a 64-channel LiDAR (KITTI highway) and one with a 16-channel LiDAR (semi-indoor). In general, our method is faster than the other methods in both the dense and the sparse sensor setting. A detailed breakdown of the execution time of our method is provided in Fig. 5. We observe that the ray-casting step is, as expected, the most computationally intensive.