Segmenting Known Objects and Unseen Unknowns without Prior Knowledge

ICCV 2023

¹ Technical University of Munich ² BMW Group ³ Google

Also check out our new ICCV 2025 work here. With Prior2Former, we advance our assumption-free setting by introducing evidential uncertainty modeling to mask transformers, delivering outstanding benchmark performance and reliable uncertainty estimates to segment unseen unknown objects without prior knowledge.

💥 GitHub CODE: In the context of Prior2Former, we will also release the code for U3HS. Stay tuned.

Motivation

Segmentation methods assign a known class to each input pixel. Even for state-of-the-art approaches, this forces a decision for every pixel, which systematically leads to wrong predictions for objects outside the training categories. However, robustness against out-of-distribution samples and corner cases is crucial to avoid dangerous consequences in safety-critical settings such as autonomous driving.

Since real-world datasets cannot contain enough data points to adequately sample the long tail of the underlying distribution, models must be able to deal with completely unseen and unknown scenarios. This motivates the need for a method that can handle both known objects from the training categories and, especially, unseen, unknown objects outside the training data distribution.

Holistic Segmentation Setting

Previous methods either ignored this problem (panoptic segmentation) or targeted it by re-identifying already-seen unlabeled objects (open-set panoptic segmentation) or by introducing external knowledge, e.g., via vision-language models (zero-shot learning and open-vocabulary approaches).

In this work, we propose to extend segmentation with a new setting which we term holistic segmentation. Holistic segmentation aims to identify and separate objects of unseen, unknown categories into instances without prior knowledge while performing panoptic segmentation of known classes.

Method

We tackle this new holistic segmentation problem with U3HS. U3HS finds unknowns as highly uncertain regions and clusters their corresponding instance-aware embeddings into individual objects. For the first time in panoptic segmentation with unknown objects, our U3HS is trained without unknown categories, reducing assumptions, simplifying the training data collection, and leaving the settings as unconstrained as in real-life scenarios.
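The two steps described above (selecting highly uncertain regions, then grouping their embeddings into objects) can be illustrated with a minimal sketch. This is not the U3HS implementation: the function name `segment_unknowns`, the threshold `tau`, the distance `radius`, and the simple leader-style clustering are all assumptions chosen for illustration, and the inputs stand in for whatever uncertainty map and instance-aware embeddings a network would produce.

```python
import numpy as np

def segment_unknowns(uncertainty, embeddings, tau=0.5, radius=1.0):
    """Illustrative sketch only (hypothetical names and parameters).

    uncertainty: (H, W) array, higher means less confident.
    embeddings:  (H, W, D) array of per-pixel instance-aware embeddings.
    Returns an (H, W) int map: 0 = not unknown, 1..K = unknown instance ids.
    """
    h, w = uncertainty.shape
    instance_ids = np.zeros((h, w), dtype=np.int64)

    # Step 1: treat highly uncertain pixels as candidate unknowns.
    ys, xs = np.nonzero(uncertainty > tau)

    # Step 2: group their embeddings. Leader-style clustering is an
    # assumed stand-in here, not the clustering used in the paper.
    centers = []  # running-mean embedding per unknown instance
    counts = []   # number of pixels assigned to each instance
    for y, x in zip(ys, xs):
        e = embeddings[y, x].astype(np.float64)
        k = -1
        if centers:
            dists = np.linalg.norm(np.stack(centers) - e, axis=1)
            nearest = int(np.argmin(dists))
            if dists[nearest] <= radius:
                k = nearest
        if k >= 0:
            # Assign to the nearest instance and update its mean.
            counts[k] += 1
            centers[k] = centers[k] + (e - centers[k]) / counts[k]
        else:
            # No instance is close enough: start a new one.
            centers.append(e)
            counts.append(1)
            k = len(centers) - 1
        instance_ids[y, x] = k + 1
    return instance_ids

# Toy input: two uncertain blobs with clearly different embeddings.
unc = np.zeros((4, 6))
unc[0:2, 0:2] = 0.9
unc[2:4, 4:6] = 0.8
emb = np.zeros((4, 6, 2))
emb[0:2, 0:2] = [5.0, 0.0]
emb[2:4, 4:6] = [0.0, 5.0]

ids = segment_unknowns(unc, emb)
print(ids)
```

In this toy input, the two uncertain blobs receive distinct instance ids while confident pixels stay at 0. In a full holistic pipeline, those confident pixels would be handled by the panoptic prediction for known classes.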

Results

The figure shows example predictions of U3HS on out-of-distribution data from MS COCO. Remarkably, the model had never seen images containing bears or frisbees (part of the held-out classes), nor had it received any information about them. Still, the proposed U3HS could robustly segment the unseen, unknown objects despite their high inter-class similarity. At the same time, U3HS performed a reasonable panoptic segmentation of known classes.

In our ICCV paper, we demonstrate the effectiveness of U3HS for this new, challenging, and assumption-free setting called holistic segmentation with extensive experiments on public data from MS COCO, Cityscapes, and Lost&Found.

BibTeX

@inproceedings{gasperini2023holistic_seg,
  title={Segmenting Known Objects and Unseen Unknowns without Prior Knowledge},
  author={Gasperini, Stefano and Marcos-Ramiro, Alvaro and Schmidt, Michael and Navab, Nassir and Busam, Benjamin and Tombari, Federico},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2023},
  pages={19321-19332}
}