# Detic: A Detector with image classes that can use image-level labels to easily train detectors
Detecting Twenty-thousand Classes using Image-level Supervision, Xingyi Zhou, Rohit Girdhar, Armand Joulin, Philipp Krähenbühl, Ishan Misra, ECCV 2022 (arXiv 2201.02605)
## Installation

Detic requires CLIP to be installed:
```shell
pip install git+https://github.com/openai/CLIP.git
```
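If you want to confirm the installation before moving on, a quick optional check (not part of the official instructions) is to import the package and list the available CLIP variants:

```python
# Optional sanity check that the CLIP package installed above is importable.
import clip

# Lists the model variants shipped with the openai/CLIP package, e.g. 'ViT-B/32'.
print(clip.available_models())
```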
## Demo

First, go to the Detic project folder.
```shell
cd projects/Detic
```
Then, download the pre-computed CLIP embeddings from the dataset metainfo to the `datasets/metadata` folder. These CLIP embeddings are loaded into the zero-shot classifier during inference.
For example, you can download LVIS's class name embeddings with the following command:
```shell
wget -P datasets/metadata https://raw.githubusercontent.com/facebookresearch/Detic/main/datasets/metadata/lvis_v1_clip_a%2Bcname.npy
```
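If you want to verify the download, the file is a plain NumPy array of per-class text embeddings, so a short optional check (not required by the demo) is:

```python
# Optional sanity check on the downloaded class-name embeddings.
import numpy as np

# Depending on your wget settings, the saved file may be named
# 'lvis_v1_clip_a+cname.npy' or keep the URL-encoded 'lvis_v1_clip_a%2Bcname.npy';
# adjust the path below to match what is in datasets/metadata.
embeddings = np.load('datasets/metadata/lvis_v1_clip_a+cname.npy')

# Expect a 2-D array with one CLIP text embedding per LVIS class name.
print(embeddings.shape, embeddings.dtype)
```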
You can run the demo like this:
```shell
python demo.py \
  ${IMAGE_PATH} \
  ${CONFIG_PATH} \
  ${MODEL_PATH} \
  --show \
  --score-thr 0.5 \
  --dataset lvis
```
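If you prefer calling the model from Python rather than through `demo.py`, a rough sketch using MMDetection's generic high-level APIs is shown below. The config and checkpoint paths are placeholders, and `demo.py` may perform additional Detic-specific setup (such as loading the dataset's CLIP class embeddings into the classifier), so treat this only as a starting point:

```python
# A rough sketch of programmatic inference with MMDetection's generic APIs.
# NOTE: the paths below are placeholders, and Detic's demo.py may do extra
# setup (e.g. loading CLIP class-name embeddings) that this sketch omits.
from mmdet.apis import init_detector, inference_detector

config_file = 'configs/your_detic_config.py'          # placeholder path
checkpoint_file = 'checkpoints/your_detic_model.pth'  # placeholder path

model = init_detector(config_file, checkpoint_file, device='cuda:0')
result = inference_detector(model, 'demo.jpg')
print(result)
```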
You can detect custom classes with the `--class-name` option:
```shell
python demo.py \
  ${IMAGE_PATH} \
  ${CONFIG_PATH} \
  ${MODEL_PATH} \
  --show \
  --score-thr 0.3 \
  --class-name headphone webcam paper coffe
```
Note that `headphone`, `paper`, and `coffe` (typo intended) are not LVIS classes. Despite the misspelled class name, Detic can still produce a reasonable detection for `coffe`.
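Arbitrary (even misspelled) class names work because the zero-shot classifier is built from CLIP text embeddings of the class names. A minimal sketch of how such embeddings can be produced with the CLIP package installed above is shown below; the prompt template and CLIP variant here are illustrative only and may differ from what Detic actually uses:

```python
# A minimal sketch of turning class names into CLIP text embeddings,
# i.e. the kind of vectors a zero-shot classifier consumes.
# ASSUMPTION: the prompt template 'a {name}' and the 'ViT-B/32' variant are
# illustrative; the exact choices used by Detic may differ.
import clip
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model, _ = clip.load('ViT-B/32', device=device)

class_names = ['headphone', 'webcam', 'paper', 'coffe']
prompts = [f'a {name}' for name in class_names]

with torch.no_grad():
    tokens = clip.tokenize(prompts).to(device)
    text_features = model.encode_text(tokens)
    # Normalize so class scores can be computed as cosine similarities.
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)

print(text_features.shape)  # (number of class names, embedding dimension)
```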
## Models

Here we only provide the Detic Swin-B model for the open-vocabulary demo. Multi-dataset training and open-vocabulary testing will be supported in the future.
To find more variants, please visit the official model zoo.
| Backbone | Training data | Config | Download |
|---|---|---|---|
| Swin-B | ImageNet-21K & LVIS & COCO | config | model |
## Citation

If you find Detic useful in your research or applications, please consider giving a star 🌟 to the official repository and citing Detic with the following BibTeX entry.
```BibTeX
@inproceedings{zhou2022detecting,
  title={Detecting Twenty-thousand Classes using Image-level Supervision},
  author={Zhou, Xingyi and Girdhar, Rohit and Joulin, Armand and Kr{\"a}henb{\"u}hl, Philipp and Misra, Ishan},
  booktitle={ECCV},
  year={2022}
}
```
## Checklist

- Milestone 1: PR-ready, and acceptable to be one of the `projects/`.

  - Finish the code

    <!-- The code's design shall follow existing interfaces and convention. For example, each model component should be registered into `mmdet.registry.MODELS` and configurable via a config file. -->

  - Basic docstrings & proper citation

    <!-- Each major object should contain a docstring, describing its functionality and arguments. If you have adapted the code from other open-source projects, don't forget to cite the source project in docstring and make sure your behavior is not against its license. Typically, we do not accept any code snippet under GPL license. [A Short Guide to Open Source Licenses](https://medium.com/nationwide-technology/a-short-guide-to-open-source-licenses-cf5b1c329edd) -->

  - Test-time correctness

    <!-- If you are reproducing the result from a paper, make sure your model's inference-time performance matches that in the original paper. The weights usually could be obtained by simply renaming the keys in the official pre-trained weights. This test could be skipped though, if you are able to prove the training-time correctness and check the second milestone. -->

  - A full README

    <!-- As this template does. -->

- Milestone 2: Indicates a successful model implementation.

  - Training-time correctness

    <!-- If you are reproducing the result from a paper, checking this item means that you should have trained your model from scratch based on the original paper's specification and verified that the final result matches the report within a minor error range. -->

- Milestone 3: Good to be a part of our core package!

  - Type hints and docstrings

    <!-- Ideally *all* the methods should have [type hints](https://www.pythontutorial.net/python-basics/python-type-hints/) and [docstrings](https://google.github.io/styleguide/pyguide.html#381-docstrings). [Example](https://github.com/open-mmlab/mmdetection/blob/5b0d5b40d5c6cfda906db7464ca22cbd4396728a/mmdet/datasets/transforms/transforms.py#L41-L169) -->

  - Unit tests

    <!-- Unit tests for each module are required. [Example](https://github.com/open-mmlab/mmdetection/blob/5b0d5b40d5c6cfda906db7464ca22cbd4396728a/tests/test_datasets/test_transforms/test_transforms.py#L35-L88) -->

  - Code polishing

    <!-- Refactor your code according to reviewer's comment. -->

  - Metafile.yml

    <!-- It will be parsed by MIM and Inferencer. [Example](https://github.com/open-mmlab/mmdetection/blob/main/configs/faster_rcnn/metafile.yml) -->

  - Move your modules into the core package following the codebase's file hierarchy structure.

    <!-- In particular, you may have to refactor this README into a standard one. [Example](https://github.com/open-mmlab/mmdetection/blob/main/configs/faster_rcnn/README.md) -->

  - Refactor your modules into the core package following the codebase's file hierarchy structure.