docs/source/en/model_doc/upernet.md
This model was released on 2018-07-26 and added to Hugging Face Transformers on 2023-01-16.
The UPerNet model was proposed in Unified Perceptual Parsing for Scene Understanding by Tete Xiao, Yingcheng Liu, Bolei Zhou, Yuning Jiang, Jian Sun. UPerNet is a general framework to effectively segment a wide range of concepts from images, leveraging any vision backbone like ConvNeXt or Swin.
The abstract from the paper is the following:
Humans recognize the visual world at multiple levels: we effortlessly categorize scenes and detect objects inside, while also identifying the textures and surfaces of the objects along with their different compositional parts. In this paper, we study a new task called Unified Perceptual Parsing, which requires the machine vision systems to recognize as many visual concepts as possible from a given image. A multi-task framework called UPerNet and a training strategy are developed to learn from heterogeneous image annotations. We benchmark our framework on Unified Perceptual Parsing and show that it is able to effectively segment a wide range of concepts from images. The trained networks are further applied to discover visual knowledge in natural scenes.
<small> UPerNet framework. Taken from the <a href="https://huggingface.co/papers/1807.10221">original paper</a>. </small>
This model was contributed by nielsr. The original code is based on OpenMMLab's mmsegmentation here.
UPerNet is a general framework for semantic segmentation. It can be used with any vision backbone, like so:
from transformers import SwinConfig, UperNetConfig, UperNetForSemanticSegmentation
backbone_config = SwinConfig(out_features=["stage1", "stage2", "stage3", "stage4"])
config = UperNetConfig(backbone_config=backbone_config)
model = UperNetForSemanticSegmentation(config)
To use another vision backbone, like ConvNeXt, simply instantiate the model with the appropriate backbone:
from transformers import ConvNextConfig, UperNetConfig, UperNetForSemanticSegmentation
backbone_config = ConvNextConfig(out_features=["stage1", "stage2", "stage3", "stage4"])
config = UperNetConfig(backbone_config=backbone_config)
model = UperNetForSemanticSegmentation(config)
Note that this will randomly initialize all the weights of the model.
A list of official Hugging Face and community (indicated by 🌎) resources to help you get started with UPerNet.
UperNetForSemanticSegmentation] is supported by this example script and notebook.If you're interested in submitting a resource to be included here, please feel free to open a Pull Request and we'll review it! The resource should ideally demonstrate something new instead of duplicating an existing resource.
[[autodoc]] UperNetConfig
[[autodoc]] UperNetForSemanticSegmentation - forward