<!--[metadata] title = "Single image 3D reconstruction using MCC, SAM, and ZoeDepth" source = "https://github.com/rerun-io/MCC" tags = ["2D", "3D", "Segmentation", "Point cloud", "SAM", "Paper walkthrough"] thumbnail = "https://static.rerun.io/single-image-3D-reconstruction/c54498053d53148cfa43901f39a084c549df2b72/480w.png" thumbnail_dimensions = [480, 480] -->

This example project combines several popular computer vision methods and uses Rerun to visualize the results and how the pieces fit together.

The videos below give a visual walkthrough of the project.

By combining Meta AI's Segment Anything Model (SAM) with Multiview Compressive Coding (MCC), we can reconstruct a 3D object from a single image.

https://vimeo.com/865973817?autoplay=1&loop=1&autopause=0&background=1&muted=1&ratio=10000:8133

The basic idea is to use SAM to create a generic object mask so we can exclude the background.

https://vimeo.com/865973836?autoplay=1&loop=1&autopause=0&background=1&muted=1&ratio=10000:7941
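A minimal sketch of this masking step. The SAM call is shown in comments only (it needs a downloaded checkpoint; the path below is a placeholder), and a synthetic circular mask stands in for SAM's output so the background-exclusion logic can run on its own:

```python
import numpy as np

# In the actual pipeline the mask comes from SAM, roughly:
#   from segment_anything import sam_model_registry, SamAutomaticMaskGenerator
#   sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h.pth")  # placeholder path
#   masks = SamAutomaticMaskGenerator(sam).generate(image)  # image: HxWx3 uint8 RGB
#   object_mask = max(masks, key=lambda m: m["area"])["segmentation"]
# Here a synthetic circular mask plays the role of SAM's output.

h, w = 64, 64
image = np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)

yy, xx = np.mgrid[0:h, 0:w]
object_mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 < 20**2  # HxW bool

# Exclude the background: keep object pixels, zero out everything else.
masked = np.where(object_mask[..., None], image, 0)
```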

The next step is to generate a depth image. Here we use the awesome ZoeDepth to get realistic depth from the color image.

https://vimeo.com/865973850?autoplay=1&loop=1&autopause=0&background=1&muted=1&ratio=10000:7941

With depth, color, and an object mask, we have everything needed to create a colored point cloud of the object from a single view.

https://vimeo.com/865973862?autoplay=1&loop=1&autopause=0&background=1&muted=1&ratio=10000:11688
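The lifting step is standard pinhole-camera back-projection. A sketch with made-up intrinsics (in the real pipeline the focal length and principal point come from the input image):

```python
import numpy as np

def backproject(depth, rgb, mask, fx, fy, cx, cy):
    """Lift masked pixels into a colored 3D point cloud (pinhole model).

    depth: HxW meters, rgb: HxWx3, mask: HxW bool.
    Returns (N, 3) points and (N, 3) colors.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]          # pixel row/column coordinates
    z = depth[mask]
    x = (u[mask] - cx) * z / fx        # back-project along camera rays
    y = (v[mask] - cy) * z / fy
    points = np.stack([x, y, z], axis=-1)
    colors = rgb[mask]
    return points, colors

# Toy inputs: a flat scene at 2 m depth, mask selecting a single pixel.
depth = np.full((4, 4), 2.0)
rgb = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1, 2] = True
pts, cols = backproject(depth, rgb, mask, fx=100.0, fy=100.0, cx=2.0, cy=2.0)
```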

MCC encodes the colored points and then creates a reconstruction by sweeping through the volume, querying the network for occupancy and color at each point.

https://vimeo.com/865973880?autoplay=1&loop=1&autopause=0&background=1&muted=1&ratio=1:1
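The sweep amounts to evaluating a query function over a regular 3D grid and keeping the points the network considers occupied. In MCC the query is the trained decoder; here a hypothetical `sphere_query` (occupied inside a sphere, constant color) stands in for it so the sketch is runnable:

```python
import numpy as np

def sweep_volume(query_fn, resolution=16, bound=1.0, threshold=0.5):
    """Evaluate query_fn over a regular grid; keep occupied points.

    query_fn: (N, 3) points -> (occupancy (N,), color (N, 3)).
    In MCC this would be the decoder network; here it is a stand-in.
    """
    axis = np.linspace(-bound, bound, resolution)
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    points = grid.reshape(-1, 3)
    occupancy, color = query_fn(points)
    keep = occupancy > threshold
    return points[keep], color[keep]

def sphere_query(points, radius=0.5):
    """Hypothetical stand-in for MCC's decoder."""
    occ = (np.linalg.norm(points, axis=-1) < radius).astype(np.float32)
    color = np.tile(np.array([[1.0, 0.5, 0.0]]), (len(points), 1))
    return occ, color

pts, cols = sweep_volume(sphere_query)
```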

This is a great example of how many solutions are built these days: by stringing together several more targeted pre-trained models. The details of the three building blocks can be found in the respective papers: