Zero-shot Detection Results on Videos

This page showcases zero-shot layout estimation and 3D object detection results on virtual room-tour videos. Each row shows the input video frames, the ground truth, and the corresponding SpatialLM prediction.

Video frames | Ground truth | SpatialLM