This model was released on 2018-07-26 and added to Hugging Face Transformers on 2023-01-16.

UPerNet

Overview

The UPerNet model was proposed in Unified Perceptual Parsing for Scene Understanding by Tete Xiao, Yingcheng Liu, Bolei Zhou, Yuning Jiang, Jian Sun. UPerNet is a general framework to effectively segment a wide range of concepts from images, leveraging any vision backbone like ConvNeXt or Swin.

The abstract from the paper is the following:

Humans recognize the visual world at multiple levels: we effortlessly categorize scenes and detect objects inside, while also identifying the textures and surfaces of the objects along with their different compositional parts. In this paper, we study a new task called Unified Perceptual Parsing, which requires the machine vision systems to recognize as many visual concepts as possible from a given image. A multi-task framework called UPerNet and a training strategy are developed to learn from heterogeneous image annotations. We benchmark our framework on Unified Perceptual Parsing and show that it is able to effectively segment a wide range of concepts from images. The trained networks are further applied to discover visual knowledge in natural scenes.

<small> UPerNet framework. Taken from the <a href="https://huggingface.co/papers/1807.10221">original paper</a>. </small>

This model was contributed by nielsr. The original code is based on OpenMMLab's mmsegmentation here.

Usage examples

UPerNet is a general framework for semantic segmentation. It can be used with any vision backbone, like so:

python

from transformers import SwinConfig, UperNetConfig, UperNetForSemanticSegmentation


backbone_config = SwinConfig(out_features=["stage1", "stage2", "stage3", "stage4"])

config = UperNetConfig(backbone_config=backbone_config)
model = UperNetForSemanticSegmentation(config)

To use another vision backbone, like ConvNeXt, simply instantiate the model with the appropriate backbone:

python

from transformers import ConvNextConfig, UperNetConfig, UperNetForSemanticSegmentation


backbone_config = ConvNextConfig(out_features=["stage1", "stage2", "stage3", "stage4"])

config = UperNetConfig(backbone_config=backbone_config)
model = UperNetForSemanticSegmentation(config)

Note that this will randomly initialize all the weights of the model.

Resources

A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with UPerNet.

Demo notebooks for UPerNet can be found here.
[UperNetForSemanticSegmentation] is supported by this example script and notebook.
See also: Semantic segmentation task guide

If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.

UperNetConfig

[[autodoc]] UperNetConfig

UperNetForSemanticSegmentation

[[autodoc]] UperNetForSemanticSegmentation - forward