We propose a novel collaborative contrastive loss for NeRF to segment objects in complex real-world scenes, without any annotation.
We augment NeRF models with a parallel segmentation branch that predicts a point-wise implicit segmentation feature. The resulting segmentation feature field is updated with a collaborative loss at both the appearance and geometry levels. During inference, a clustering operation (e.g., K-means) on the rendered feature field generates the object masks.
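The inference-time clustering step can be illustrated with a minimal sketch: given a rendered per-pixel feature map, K-means assigns each pixel to a cluster, and the label map serves as the object masks. The function name `kmeans_masks`, the feature shapes, and the farthest-point initialisation are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def kmeans_masks(features, k=2, iters=10, seed=0):
    """Cluster per-pixel feature vectors into k object masks.

    features: (H, W, D) array standing in for a rendered
    segmentation feature field. Returns an (H, W) label map.
    """
    H, W, D = features.shape
    x = features.reshape(-1, D).astype(float)
    rng = np.random.default_rng(seed)

    # Farthest-point initialisation keeps the initial centroids
    # well separated (an assumption; any K-means init would do).
    centers = [x[rng.integers(len(x))]]
    for _ in range(1, k):
        d = np.min(
            np.linalg.norm(x[:, None] - np.array(centers)[None], axis=-1),
            axis=1,
        )
        centers.append(x[d.argmax()])
    centers = np.array(centers)

    for _ in range(iters):
        # Assign every pixel to its nearest centroid.
        labels = np.linalg.norm(x[:, None] - centers[None], axis=-1).argmin(axis=1)
        # Recompute each centroid as the mean of its cluster.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = x[labels == c].mean(axis=0)

    return labels.reshape(H, W)
```

For example, a feature map whose left half and right half carry distinct features yields a two-region mask separating the halves.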
@misc{fan2022nerfsos,
  title     = {NeRF-SOS: Any-View Self-supervised Object Segmentation on Complex Scenes},
  author    = {Fan, Zhiwen and Wang, Peihao and Jiang, Yifan and Gong, Xinyu and Xu, Dejia and Wang, Zhangyang},
  year      = {2022},
  publisher = {arXiv},
  doi       = {10.48550/arXiv.2209.08776},
  url       = {https://arxiv.org/abs/2209.08776},
  keywords  = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences}
}