GeoSim was nominated as a best paper candidate at CVPR 2021.
We present GeoSim, a geometry-guided simulation procedure to insert dynamic objects into videos with greater realism. It helps enable scalable sensor simulation for training and testing autonomy systems, as well as applications like AR, VR, and video editing. We first create 3D assets from real-world data and then use the proposed simulation pipeline to realistically composite 3D assets into existing camera imagery.
We propose a self-supervised model that automatically builds a vehicle asset bank from in-the-wild data without 3D ground truth.
Using the reconstructed 3D mesh, we can perform novel view warping of the source image to new target poses.
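The geometric core of such a warp can be illustrated with a minimal sketch: each mesh vertex is projected into both the source and the target camera, giving pixel correspondences along which source colors could be moved. This is not GeoSim's implementation. The function names (`project`, `vertex_correspondences`), the pinhole camera model, and the `(R, t)` pose convention are assumptions made for illustration, and the sketch ignores self-occlusion and rasterization.

```python
import numpy as np

def project(points_world, K, R, t):
    """Pinhole projection of N x 3 world points.

    Assumed convention: x_cam = R @ x_world + t. Returns N x 2 pixel
    coordinates and the N camera-space depths.
    """
    cam = points_world @ R.T + t
    uvw = cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3], cam[:, 2]

def vertex_correspondences(vertices, K, src_pose, tgt_pose):
    """Pixel locations of each mesh vertex in the source and target views.

    Only vertices in front of both cameras are kept. Visibility
    (self-occlusion) is deliberately not handled in this sketch.
    """
    src_uv, src_z = project(vertices, K, *src_pose)
    tgt_uv, tgt_z = project(vertices, K, *tgt_pose)
    valid = (src_z > 0) & (tgt_z > 0)
    return src_uv[valid], tgt_uv[valid]

# Hypothetical intrinsics and poses, for illustration only.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
identity = (np.eye(3), np.zeros(3))
shifted = (np.eye(3), np.array([1.0, 0.0, 0.0]))
vertices = np.array([[0.0, 0.0, 2.0]])
src_uv, tgt_uv = vertex_correspondences(vertices, K, identity, shifted)
```

A full warp would additionally rasterize the mesh in the target view and sample source colors only at vertices or faces that are visible in the source image.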
We build a large and diverse asset bank for simulation with over 8,000 unique vehicles.
GeoSim leverages the asset library to insert actors and simulate new images.
Step 1: Scenario generation. We first automatically generate a plausible placement and trajectory that complies with the existing traffic. An asset with a similar viewpoint is selected for rendering.
Step 2: Occlusion-aware rendering. We render the selected asset at a new target pose. We account for occlusion by using dense depth from a pre-trained depth completion network. Additionally, we render a soft shadow for the selected 3D asset, using a cloudy HDRI as the environment map.
Step 3: Post composition. Finally, we perform post-image composition. We use a synthesis network to handle inconsistent illumination and inpaint the discrepancies at the boundary, so that the vehicle is composited into the image seamlessly.
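As a rough illustration of Steps 1 and 2, the sketch below picks the asset whose capture direction is closest to the target viewing direction, then pastes the rendered asset only where it lies in front of the scene depth. It is a simplified stand-in and not the authors' pipeline. The function names, the cosine-similarity criterion, and the hard per-pixel depth test are all assumptions. Shadow rendering and the synthesis network of Step 3 are omitted.

```python
import numpy as np

def select_asset(asset_view_dirs, target_view_dir):
    """Index of the asset whose viewing direction best matches the target.

    Assumption: similarity is measured by the cosine between unit
    viewing directions. The source only says "a similar viewpoint".
    """
    a = np.asarray(asset_view_dirs, dtype=float)
    t = np.asarray(target_view_dir, dtype=float)
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    t = t / np.linalg.norm(t)
    return int(np.argmax(a @ t))

def composite_with_occlusion(background, scene_depth,
                             asset_rgb, asset_depth, asset_mask):
    """Paste asset pixels only where the asset is closer than the scene.

    scene_depth stands in for the dense depth produced by a depth
    completion network. asset_depth and asset_mask come from rendering
    the asset at the target pose.
    """
    mask = np.asarray(asset_mask, dtype=bool)
    visible = mask & (asset_depth < scene_depth)
    out = background.copy()
    out[visible] = asset_rgb[visible]
    return out, visible

# Toy 2 x 2 example with made-up values.
background = np.zeros((2, 2, 3), dtype=np.uint8)
scene_depth = np.array([[10.0, 10.0], [5.0, 5.0]])
asset_rgb = np.full((2, 2, 3), 255, dtype=np.uint8)
asset_depth = np.full((2, 2), 7.0)
asset_mask = np.array([[True, False], [True, True]])

image, visible = composite_with_occlusion(
    background, scene_depth, asset_rgb, asset_depth, asset_mask)
best = select_asset([[1, 0, 0], [0, 0, 1], [0.7, 0, 0.7]], [0.1, 0, 1])
```

In the toy example only the top-left pixel receives the asset: the bottom row is occluded by nearer scene geometry, and the top-right pixel lies outside the asset mask.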
Video results in 4K resolution (4096x2160)
@inproceedings{chen2021geosim,
title={GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving},
author={Yun Chen and Frieda Rong and Shivam Duggal and Shenlong Wang and Xinchen Yan and Sivabalan Manivasagam and Shangjie Xue and Ersin Yumer and Raquel Urtasun},
year={2021},
booktitle={CVPR},
}