docs/en/notes/changelog_v2.x.md
- Rename config files of Mask2Former (#7571)
<table align="center">
  <thead>
    <tr align='center'>
      <td>before v2.25.0</td>
      <td>after v2.25.0</td>
    </tr>
  </thead>
  <tbody>
    <tr valign='top'>
      <th><code>mask2former_xxx_coco.py</code> represents config files for panoptic segmentation.</th>
      <th><code>mask2former_xxx_coco.py</code> represents config files for instance segmentation.<br><code>mask2former_xxx_coco-panoptic.py</code> represents config files for panoptic segmentation.</th>
    </tr>
  </tbody>
</table>

- `interval != 1` (#7784)
- Support dedicated `WandbLogger` hook (#7459)
Users can set
```python
cfg.log_config.hooks = [
    dict(type='MMDetWandbHook',
         init_kwargs={'project': 'MMDetection-tutorial'},
         interval=10,
         log_checkpoint=True,
         log_checkpoint_metadata=True,
         num_eval_images=10)
]
```
in the config to use `MMDetWandbHook`. An example can be found in this Colab tutorial.
Add AvoidOOM to avoid OOM (#7434, #8091)
Use `AvoidCUDAOOM` to avoid GPU out of memory. It first retries after calling `torch.cuda.empty_cache()`. If that still fails, it retries after converting the inputs to FP16. If that also fails, it tries to copy the inputs from GPU to CPU to continue computing. Use `AvoidCUDAOOM` in code to keep running when GPU memory runs out:
```python
from mmdet.utils import AvoidCUDAOOM

output = AvoidCUDAOOM.retry_if_cuda_oom(some_function)(input1, input2)
```
Users can also try AvoidCUDAOOM as a decorator to make the code continue to run when GPU memory runs out:
```python
from mmdet.utils import AvoidCUDAOOM

@AvoidCUDAOOM.retry_if_cuda_oom
def function(*args, **kwargs):
    ...
    return xxx
```
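The fallback chain described above (retry, then apply progressively cheaper representations of the inputs) can be sketched framework-agnostically. This is only an illustrative retry wrapper, not MMDetection's actual implementation; `retry_with_fallbacks` and the fallback callables are hypothetical names:

```python
# Illustrative sketch of the retry strategy described above. The real
# AvoidCUDAOOM (mmdet.utils) empties the CUDA cache, casts tensors to
# FP16, and finally moves tensors to CPU; here fallbacks are generic
# callables that transform (args, kwargs) before each retry.
def retry_with_fallbacks(func, fallbacks):
    """Call func(); on an out-of-memory RuntimeError, apply each
    fallback in order and retry."""
    def wrapped(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except RuntimeError as e:
            if 'out of memory' not in str(e):
                raise
        for fallback in fallbacks:
            args, kwargs = fallback(args, kwargs)
            try:
                return func(*args, **kwargs)
            except RuntimeError as e:
                if 'out of memory' not in str(e):
                    raise
        raise RuntimeError('all fallbacks exhausted')
    return wrapped
```

Each fallback only runs if the previous attempt still hit an out-of-memory error, mirroring the empty-cache → FP16 → CPU order above.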
- Support reading `gpu_collect` from `cfg.evaluation.gpu_collect` (#7672)
- Speed up video inference by accelerating the data-loading stage (#7832)
- Support replacing `${key}` with the value of `cfg.key` (#7492)
- Accelerate result analysis in `analyze_result.py`. The evaluation is sped up by 10-15 times and now takes only 10-15 minutes. (#7891)
- Support setting `block_dilations` in `DilatedEncoder` (#7812)
- Support panoptic segmentation result analysis (#7922)
- Release DyHead with Swin-Large backbone (#7733)
Documentation updates and additions
- `act_cfg` in SwinTransformer (#7794)
- Replace markdownlint with mdformat to avoid installing Ruby (#8009)

A total of 20 developers contributed to this release.
Thanks @ZwwWayne, @DarthThomas, @solyaH, @LutingWang, @chenxinfeng4, @Czm369, @Chenastron, @chhluo, @austinmw, @Shanyaliux @hellock, @Y-M-Y, @jbwang1997, @hhaAndroid, @Irvingao, @zhanggefan, @BIGWangYuDong, @Keiku, @PeterVennerstrom, @ayulockin
- Support *Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation*; see example configs (#7501)
- Support Class Aware Sampler. Users can set
```python
data = dict(train_dataloader=dict(class_aware_sampler=dict(num_sample_class=1)))
```
in the config to use ClassAwareSampler. Examples can be found in the configs of OpenImages Dataset. (#7436)
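Conceptually, a class-aware sampler first draws a class uniformly at random and then draws `num_sample_class` instances of that class, so rare classes appear in batches at roughly the same rate as frequent ones. The following is a minimal sketch of that idea under assumed names (`class_aware_indices`, `class_to_indices`), not the actual `ClassAwareSampler` implementation:

```python
import random

def class_aware_indices(class_to_indices, num_draws, num_sample_class=1, seed=0):
    """Yield dataset indices by first sampling a class uniformly, then
    sampling `num_sample_class` instances of that class.

    class_to_indices: dict mapping class name -> list of dataset indices.
    """
    rng = random.Random(seed)
    classes = list(class_to_indices)
    out = []
    for _ in range(num_draws):
        cls = rng.choice(classes)          # classes are equally likely...
        for _ in range(num_sample_class):  # ...regardless of their size
            out.append(rng.choice(class_to_indices[cls]))
    return out
```

With a plain random sampler, a class with 2 images out of 1000 is drawn 0.2% of the time; here it is drawn at 1/num_classes like every other class.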
Support automatically scaling LR according to GPU number and samples per GPU. (#7482) In each config, there is a corresponding config of auto-scaling LR as below,
```python
auto_scale_lr = dict(enable=True, base_batch_size=N)
```
where N is the batch size used for the current learning rate in the config (it also equals `samples_per_gpu` × the number of GPUs used to train this config).
By default, we set `enable=False` so that existing usages are not affected. Users can set `enable=True` in each config or append `--auto-scale-lr` to the command line to enable this feature, and should check the correctness of `base_batch_size` in customized configs.
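Under the linear scaling rule this feature follows, the learning rate is multiplied by the ratio of the actual total batch size to `base_batch_size`. A sketch of the computation (a hypothetical helper for illustration, not MMDetection's internal code):

```python
def auto_scale_lr(base_lr, samples_per_gpu, num_gpus, base_batch_size):
    """Linear scaling rule: lr scales with total batch size.

    base_lr: learning rate the config was tuned with at base_batch_size.
    """
    actual_batch_size = samples_per_gpu * num_gpus
    return base_lr * actual_batch_size / base_batch_size
```

For example, a config tuned with `lr=0.02` at `base_batch_size=16` would use `lr=0.04` when trained on 8 GPUs with 4 samples per GPU (total batch size 32).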
Support setting dataloader arguments in config and add functions to handle config compatibility. (#7668) The comparison between the old and new usages is as below.
<table align="center">
  <thead>
    <tr align='center'>
      <td>v2.23.0</td>
      <td>v2.24.0</td>
    </tr>
  </thead>
  <tbody>
    <tr valign='top'>
      <th>

```python
data = dict(
    samples_per_gpu=64, workers_per_gpu=4,
    train=dict(type='xxx', ...),
    val=dict(type='xxx', samples_per_gpu=4, ...),
    test=dict(type='xxx', ...),
)
```

</th>
      <th>

```python
# A recommended config that is clear
data = dict(
    train=dict(type='xxx', ...),
    val=dict(type='xxx', ...),
    test=dict(type='xxx', ...),
    # Use different batch size during inference.
    train_dataloader=dict(samples_per_gpu=64, workers_per_gpu=4),
    val_dataloader=dict(samples_per_gpu=8, workers_per_gpu=2),
    test_dataloader=dict(samples_per_gpu=8, workers_per_gpu=2),
)

# Old style still works but allows to set more arguments about data loaders
data = dict(
    samples_per_gpu=64,  # only works for train_dataloader
    workers_per_gpu=4,  # only works for train_dataloader
    train=dict(type='xxx', ...),
    val=dict(type='xxx', ...),
    test=dict(type='xxx', ...),
    # Use different batch size during inference.
    val_dataloader=dict(samples_per_gpu=8, workers_per_gpu=2),
    test_dataloader=dict(samples_per_gpu=8, workers_per_gpu=2),
)
```

</th>
    </tr>
  </tbody>
</table>
Support memory profiler hook. Users can use it to monitor memory usage during training as below (#7560)
```python
custom_hooks = [
    dict(type='MemoryProfilerHook', interval=50)
]
```
- Support running on PyTorch with MLU chips (#7578)
- Support re-splitting the data batch with tag (#7641)
- Support the `DiceCost` used by K-Net in `MaskHungarianAssigner` (#7716)
- Support splitting COCO data for semi-supervised object detection (#7431)
- Support `Pathlib` for `Config.fromfile` (#7685)
- Support using a file client in the OpenImages dataset (#7433)
- Add a probability parameter to the Mosaic transformation (#7371)
- Support specifying the interpolation mode in the Resize pipeline (#7585)
- `end_level` in necks, which should be the index of the end input backbone level (#7502)
- `mix_results` may be None in `MultiImageMixDataset` (#7530)
- `load_json_logs` of `analyze_logs.py` for resumed training logs (#7732)
- `out_file` in `image_demo.py` (#7676)
- `SimOTAAssigner` (#7516)

A total of 27 developers contributed to this release. Thanks @jovialio, @zhangsanfeng2022, @HarryZJ, @jamiechoi1995, @nestiank, @PeterH0323, @RangeKing, @Y-M-Y, @mattcasey02, @weiji14, @Yulv-git, @xiefeifeihu, @FANG-MING, @meng976537406, @nijkah, @sudz123, @CCODING04, @SheffieldCao, @Czm369, @BIGWangYuDong, @zytx121, @jbwang1997, @chhluo, @jshilong, @RangiLyu, @hhaAndroid, @ZwwWayne
- `MMDET_DATASETS`, users don't have to modify the corresponding path in config files anymore. (#7386)
- `dist_train.sh` so that the script can be used to launch multi-node training on machines without slurm (#7415)
- `get_classes` and `FileClient` (#7276)
- `get_bboxes` in `yolox_head` to float32 (#7324)
- `finetune.md` (#7178)
- `nproc` in `coco_panoptic.py` for panoptic quality computing (#7315)

A total of 27 developers contributed to this release. Thanks @ZwwWayne, @haofanwang, @shinya7y, @chhluo, @yangrisheng, @triple-Mu, @jbwang1997, @HikariTJU, @imflash217, @274869388, @zytx121, @matrixgame2018, @jamiechoi1995, @BIGWangYuDong, @JingweiZhang12, @Xiangxu-0103, @hhaAndroid, @jshilong, @osbm, @ceroytres, @bunge-bedstraw-herb, @Youth-Got, @daavoo, @jiangyitong, @RangiLyu, @CCODING04, @yarkable
To support visualization for panoptic segmentation, `num_classes` cannot be None when using the `get_palette` function to determine whether to use the panoptic palette.
- `key_score` is None (#7101)
- `docs_zh-CN/tutorials/init_cfg.md` (#7188)

A total of 20 developers contributed to this release. Thanks @ZwwWayne, @hhaAndroid, @RangiLyu, @AronLin, @BIGWangYuDong, @jbwang1997, @zytx121, @chhluo, @shinya7y, @LuooChen, @dvansa, @siatwangmin, @del-zhenwu, @vikashranjan26, @haofanwang, @jamiechoi1995, @HJoonKwon, @yarkable, @zhijian-liu, @RangeKing
To standardize the contents in config READMEs and meta files of OpenMMLab projects, the READMEs and meta files in each config directory have been significantly changed. The template will be released in the future; for now, you can refer to the examples of the README for an algorithm, a dataset, and a backbone. To align with the standard, the configs in dcn are put into two directories named dcn and dcnv2.
- `__repr__` of Compose (#6951)
- `SigmoidGeometricMean` (#7090)

A total of 26 developers contributed to this release. Thanks @del-zhenwu, @zimoqingfeng, @srishilesh, @imyhxy, @jenhaoyang, @jliu-ac, @kimnamu, @ShengliLiu, @garvan2021, @ciusji, @DIYer22, @kimnamu, @q3394101, @zhouzaida, @gaotongxiao, @topsy404, @AntoAndGar, @jbwang1997, @nijkah, @ZwwWayne, @Czm369, @jshilong, @RangiLyu, @BIGWangYuDong, @hhaAndroid, @AronLin
- `loss_weight` of the PAA head (#6744)
- `gt_semantic_seg` in batch collating (#6837)
- `classwise` (#6845)
- `get_local_path` (#6719)
- `sync_norm_hook` when the BN layer does not exist (#6852)

A total of 16 developers contributed to this release. Thanks @ZwwWayne, @Czm369, @jshilong, @RangiLyu, @BIGWangYuDong, @hhaAndroid, @jamiechoi1995, @AronLin, @Keiku, @gkagkos, @fcakyon, @www516717402, @vansin, @zactodd, @kimnamu, @jenhaoyang
- `get_ann_info` to dataset_wrappers (#6526)
- `bbox_clip_border` for the augmentations of YOLOX (#6730)
- `detect_anomalous_params` (#6697)

A total of 11 developers contributed to this release. Thanks @ZwwWayne, @LJoson, @Czm369, @jshilong, @ZCMax, @RangiLyu, @BIGWangYuDong, @hhaAndroid, @zhaoxin111, @GT9505, @shinya7y
- `persistent_workers` for PyTorch >= 1.7 (#6435)

A total of 11 developers contributed to this release. Thanks @FloydHsiu, @RangiLyu, @ZwwWayne, @AndreaPi, @st9007a, @hachreak, @BIGWangYuDong, @hhaAndroid, @AronLin, @chhluo, @vealocia, @HarborYuan, @jshilong
- `trunc_normal_init` in PVT and Swin-Transformer (#6432)

A total of 11 developers contributed to this release. Thanks @st9007a, @hachreak, @HarborYuan, @vealocia, @chhluo, @AndreaPi, @AronLin, @BIGWangYuDong, @hhaAndroid, @RangiLyu, @ZwwWayne
- `get_bboxes` and speed up inference (#5317, #6003, #6369, #6268, #6315)
- `forward_dummy` of YOLACT to enable `get_flops` (#6079)

A total of 18 developers contributed to this release. Thanks @Boyden, @onnkeat, @st9007a, @vealocia, @yhcao6, @DapangpangX, @yellowdolphin, @cclauss, @kennymckormick, @pingguokiller, @collinzrj, @AndreaPi, @AronLin, @BIGWangYuDong, @hhaAndroid, @jshilong, @RangiLyu, @ZwwWayne
- YOLOv3 inference (#5991)
- YOLOX (#5983)
- `val` workflow in YOLACT (#5986)
- torchserve (#5936)
- onnxsim with dynamic input shape (#6117)
- `model_wrappers` (#5975)
- `centernet_head` (#6016)
- `imshow_bboxes` (#6034)
- `aug_test` of HTC when the length of `det_bboxes` is 0 (#6088)
- `dynamic_axes` parameter error in ONNX dynamic shape export (#6104)
- `dynamic_shape` bug of SyncRandomSizeHook (#6144)
- Mosaic transform (#5897)
- `docs_zh-CN/tutorials/customize_dataset.md` (#5915)
- `conventions.md` (#5825)
- PanopticFPN (#5996)
- `extra_repr` for DropBlock layer to get details in the model printing (#6140)
- `opencv-python-headless` dependency by albumentations (#5868)

A total of 24 developers contributed to this release. Thanks @morkovka1337, @HarborYuan, @guillaumefrd, @guigarfr, @www516717402, @gaotongxiao, @ypwhs, @MartaYang, @shinya7y, @justiceeem, @zhaojinjian0000, @VVsssssk, @aravind-anantha, @wangbo-zhao, @czczup, @whai362, @marijnl, @AronLin, @BIGWangYuDong, @hhaAndroid, @jshilong, @RangiLyu, @ZwwWayne
- PatchEmbed and PatchMerging with AdaptivePadding (#5952)
- `imshow_det_bboxes` (#5845)
- `ImageToTensor` contiguous (#5756)
- `regress_by_class` in RoIHead in some cases (#5884)
- `multiscale_output` is defined but not used in HRNet (#5887)
- `data_pipeline` and (#5662)

A total of 19 developers contributed to this release. Thanks @ypwhs, @zywvvd, @collinzrj, @OceanPang, @ddonatien, @haotian-liu, @viibridges, @Muyun99, @guigarfr, @zhaojinjian0000, @jbwang1997, @wangbo-zhao, @xvjiarui, @RangiLyu, @jshilong, @AronLin, @BIGWangYuDong, @hhaAndroid, @ZwwWayne
- Rename `upsample_like` to `interpolate_as` for more general usage (#5788)

A total of 14 developers contributed to this release. Thanks @HAOCHENYE, @xiaohu2015, @HsLOL, @zhiqwang, @Adamdad, @shinya7y, @Johnson-Wang, @RangiLyu, @jshilong, @mmeendez8, @AronLin, @BIGWangYuDong, @hhaAndroid, @ZwwWayne
- `upsample_like` (#5732)
- `ignore_index` to CrossEntropyLoss (#5646)
- `NumClassCheckHook` when it is not used. (#5626)
- `multiclass_nms` that returns the global indices (#5592)
- `valid_mask` logic error in RPNHead (#5562)
- `get_root_logger` when `cfg.log_level` is not None (#5521)
- `IterBasedRunner` (#5490)
- `reduction_override` in all loss functions (#5515)
- `init_cfg` (#5273)

A total of 18 developers contributed to this release. Thanks @OceanPang, @AronLin, @hellock, @Outsider565, @RangiLyu, @ElectronicElephant, @likyoo, @BIGWangYuDong, @hhaAndroid, @noobying, @yyz561, @likyoo, @zeakey, @ZwwWayne, @ChenyangLiu, @johnson-magic, @qingswu, @BuxianChen
- `reduction_override` in MSELoss (#5437)
- `multiclass_nms` (#4980)
- `MultiScaleDeformableAttention` (#5338)
- `onnx_export` of bbox_head when setting `reg_class_agnostic` (#5468)
- `.md` (#5315)
- `simple_test` to dense heads to improve the consistency of single-stage and two-stage detectors (#5264)
- `test_mixins` to single image test to improve efficiency and readability (#5249)
- `anchor_generator` and `point_generator` (#5349)
- `mask_head` of the HTC algorithm (#5389)
- `model.pretrained` to `model.backbone.init_cfg` (#5370)
- `mask_soft` config option to allow non-binary masks (#4615)
- `det_bboxes` length is 0 (#5221)
- `iou_thr` variable naming errors in VOC recall calculation function (#5195)
- `min_bbox_size` (#5011)

MMDetection is going through a big refactoring for more general and convenient usage during the releases from v2.12.0 to v2.15.0 (maybe longer). In v2.12.0, MMDetection inevitably brings some BC-breaking changes, including the MMCV dependency, model initialization, model registry, and mask AP evaluation.
- `BaseModule` for unified parameter initialization, model registry, and the CUDA operator `MultiScaleDeformableAttn` for Deformable DETR. Note that MMCV 1.3.2 already contains all the features used by MMDet but has known issues. Therefore, we recommend users skip MMCV v1.3.2 and use v1.3.3, though v1.3.2 might work for most cases.
- `BaseModule` that accepts `init_cfg` to allow the modules' parameters to be initialized in a flexible and unified manner. Now users need to explicitly call `model.init_weights()` in the training script to initialize the model (as in here); previously this was handled by the detector. The models in MMDetection have been re-benchmarked to ensure accuracy based on PR #4750. Downstream projects should update their code accordingly to use MMDetection v2.12.0.
- `bbox` during mask AP calculation. This change does not affect the overall mask AP evaluation and aligns the mask AP of similar models in other projects like Detectron2.
- `bbox_overlaps` to save memory and keep speed (#4889)
- `__repr__` in custom dataset to count the number of instances (#4756)
- `MODEL_REGISTRY` (#5059)
- `test_robustness` in documentation (#4917)
- `pycocotools` instead of `mmpycocotools` to fully support Detectron2 and MMDetection in one environment (#4939)
- `meta` is not in old checkpoints (#4936)
- `reduce_mean` (#4923, #4978, #5058)
- `mask_head` when using CARAFE (#5062)
- `supplement_mask` bug when there are zero-size RoIs (#5065)

Highlights
New Features
Improvements
- `ann_ids` in COCODataset to ensure it is unique (#4789)

Bug Fixes
- `EvalHook` (#4582)
- `iou_thrs` bug in RPN evaluation (#4581)
- `import_modules_from_strings` (#4601)
- `CLASSES` is correctly initialized in the initialization of XMLDataset (#4555)
- `train_cfg` and `test_cfg` into model in configs
- `out_file` is not None and `show==False` (#4442)
- `score_factor` that will decrease the performance of YOLOv3 (#4473)
- `ImageToTensor` for batch inference (#4408)
- `imshow_det_bboxes` visualization backend from OpenCV to Matplotlib (#4389)
- `ImageToTensor` in `image_demo.py` (#4400)
- `reg_decoded_bbox` option in bbox heads (#4467)
- `multiclass_nms` (#4362)
- `img_norm_cfg` in FCOS-HRNet models with updated performance and models (#4250)
- `mmdet.ops` (#4325)
- `GFLHead` (#4210)
- `CrossEntropyLoss` (#4224)
- `gpu_id` in distributed training mode (#4163)
- `reduce_mean` function (#4056)
- `BboxOverlaps2D`, and re-implement `giou_loss` using `bbox_overlaps` (#3936)
- `--show-dir` option in test script (#4025)
- `mmdet.core.export` and use `generate_inputs_and_wrap_model` for pytorch2onnx (#3857, #3912)
- `tensor2imgs` (#4010)
- FP16 related methods are imported from mmcv instead of mmdet. (#3766, #3822)
Mixed precision training utils in `mmdet.core.fp16` are moved to `mmcv.runner`, including `force_fp32`, `auto_fp16`, `wrap_fp16_model`, and `Fp16OptimizerHook`. A deprecation warning will be raised if users attempt to import those methods from `mmdet.core.fp16`, and they will be removed in v2.10.0.
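The warn-then-forward pattern described above can be sketched generically. This is an illustrative shim, not the actual code in `mmdet.core.fp16`; `deprecated_lookup` and `_MOVED_TO_MMCV` are hypothetical names:

```python
import warnings

# Names relocated from the old module to mmcv.runner (per the note above).
_MOVED_TO_MMCV = {'force_fp32', 'auto_fp16', 'wrap_fp16_model',
                  'Fp16OptimizerHook'}

def deprecated_lookup(name):
    """Warn when a relocated utility is accessed via the old module."""
    if name in _MOVED_TO_MMCV:
        warnings.warn(
            f'Importing {name} from mmdet.core.fp16 is deprecated; '
            f'import it from mmcv.runner instead. It will be removed '
            f'in v2.10.0.', DeprecationWarning)
        return name  # the real shim would return the object from mmcv.runner
    raise AttributeError(name)
```

This keeps old imports working for one deprecation cycle while pointing users at the new location.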
`[0, N-1]` represents foreground classes and `N` indicates the background class for all models. (#3221)
Before v2.5.0, the background label is 0 for RPN and N for other heads. Now the behavior is consistent across all models. Thus `self.background_labels` in `dense_heads` is removed and all heads use `self.num_classes` to indicate the class index of background labels.
This change has no effect on the pre-trained models in the v2.x model zoo, but will affect the training of all models with RPN heads. Two-stage detectors whose RPN head uses softmax will be affected because the order of categories is changed.
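Under this convention, for a head with N classes, foreground labels occupy `[0, N-1]` and `N` itself marks background. A small illustrative check of the scheme (hypothetical helper, not MMDetection code):

```python
def is_background(label, num_classes):
    """Labels 0..num_classes-1 are foreground; num_classes is background,
    matching the unified convention described above."""
    return label == num_classes
```

For example, an 80-class COCO head uses labels 0-79 for foreground and 80 for background.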
Only call `get_subset_by_classes` when `test_mode=True` and `self.filter_empty_gt=True` (#3695)
The function `get_subset_by_classes` in the dataset is refactored and only filters out images when `test_mode=True` and `self.filter_empty_gt=True`.
In the original implementation, `get_subset_by_classes` was not related to the flag `self.filter_empty_gt` and was called whenever `classes` was set during initialization, regardless of whether `test_mode` was True or False. This brought ambiguous behavior and potential bugs in many cases. After v2.5.0, if `filter_empty_gt=False`, the dataset will use all the images in the annotations no matter whether `classes` are specified. If `filter_empty_gt=True` and `test_mode=True`, the dataset will call `get_subset_by_classes` to check the images and filter out images containing no GT boxes, no matter whether `classes` are specified. Therefore, users are responsible for the data filtering/cleaning process for the test dataset.
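The decision logic above reduces to: images with no GT boxes are filtered out only when both flags are set. A sketch of that behavior under assumed names (`select_images`, a `num_gts` field), mirroring the description rather than the actual dataset code:

```python
def should_filter_empty_gt(filter_empty_gt, test_mode):
    """Per the v2.5.0 behavior above: filter only when both flags are set."""
    return filter_empty_gt and test_mode

def select_images(images, filter_empty_gt, test_mode):
    """`images` is a list of dicts with a hypothetical 'num_gts' field."""
    if should_filter_empty_gt(filter_empty_gt, test_mode):
        return [img for img in images if img['num_gts'] > 0]
    return images  # all other flag combinations keep every image
```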
- Shear, Rotate, Translate augmentations (#3656, #3619, #3687)
- Contrast, Equalize, Color, and Brightness. (#3643)
- `num_pos` is 0 (#3702)
- `self.rpn_head.test_cfg` in RPNTestMixin by using `self.rpn_head` in rpn head (#3808)
- `Conv2d` from mmcv.ops (#3791)
- https://download.openmmlab.com/mmcv/dist/index.html for installing MMCV (#3840)
- `mmcv.utils.collect_env` for collecting environment information to avoid duplicate code (#3779)
- `simple_test_bboxes` in SABL (#3853)

Highlights
- mmdet package to PyPI since v2.3.0

Backwards Incompatible Changes
- `replace_ImageToTensor` (#3686) to convert legacy test data pipelines during dataset initialization.
- mmlvis and mmpycocotools for the COCO and LVIS datasets (#3727). The APIs are fully compatible with the original lvis and pycocotools. Users need to uninstall the existing pycocotools and lvis packages in their environment first and then install mmlvis & mmpycocotools.

Bug Fixes
New Features
Improvements
Highlights
mmcv.ops. For backward compatibility, `mmdet.ops` is kept as wrappers of `mmcv.ops`.

Bug Fixes
- `inside_flags.any()` is False in dense heads (#3242)
- `MultiScaleFlipAug` (#3262)
- `stem_channels` (#3333)
- `iou_thrs` is not actually used during evaluation in `coco.py` (#3407)

New Features
Improvements
- `RandomCrop` (#3153)
- `Accuracy` module to set threshold (#3155)
- `to_float32` and `norm_cfg` in RegNets configs (#3210)
- mmcv.ops and keep `mmdet.ops` as wrappers for backward compatibility (#3232)
- (#3457)

Highlights
Bug Fixes
- `register_module()` (#3092, #3161)
- `num_classes` in SSD (#3142)
- `rstrip` in `tools/publish_model.py`
- `flip_ratio` default value in RandomFlip pipeline (#3106)
- `num_classes` check (#2964)
- `iou_calculator` (#2975)

New Features
- `with_cp` for BasicBlock (#2891)
- `stem_channels` argument for ResNet (#2954)

Improvements
- `concat` mode in GRoI (#3098)
- autorescale-lr (#3080)
- `len(data['img_metas'])` to indicate `num_samples` (#3073, #3053)

Highlights
Bug Fixes
- `--validate` to `--no-validate` to enable validation after training epochs by default. (#2651)
- `RandomCrop`. Fix the bug that fails to handle `gt_bboxes_ignore`, `gt_label_ignore`, and `gt_masks_ignore` in the RandomCrop, MinIoURandomCrop and Expand modules. (#2810)
- `base_channels` of regnet (#2917)

New Features
- `mmcv.FileClient` to support different storage backends (#2712)

Improvements
- `imgs_per_gpu` is used. (#2700)
- `ori_filename` to img_metas and use it in test show-dir (#2612)
- `img_fields` to handle multiple images during image transform (#2800)
- `['img']` as default `img_fields` for backward compatibility (#2809)
- `open-mmlab://resnet50_caffe` and `open-mmlab://resnet50_caffe_bgr` to `open-mmlab://detectron/resnet50_caffe` and `open-mmlab://detectron2/resnet50_caffe`. (#2832)
- `c10::half` in CARAFE (#2890)
- `mmdet.core.optimizer` (#2947)

In this release, we made lots of major refactoring and modifications.
Faster speed. We optimize the training and inference speed for common models, achieving up to 30% speedup for training and 25% for inference. Please refer to model zoo for details.
Higher performance. We change some default hyperparameters with no additional cost, which leads to a gain of performance for most models. Please refer to compatibility for details.
More documentation and tutorials. We add a bunch of documentation and tutorials to help users get started more smoothly. Read it here.
Support PyTorch 1.5. The support for 1.1 and 1.2 is dropped, and we switch to some new APIs.
Better configuration system. Inheritance is supported to reduce the redundancy of configs.
Better modular design. Towards the goal of simplicity and flexibility, we simplify some encapsulation while adding more configurable modules like BBoxCoder, IoUCalculator, OptimizerConstructor, and RoIHead. Target computation is also included in heads and the call hierarchy is simpler.
Breaking Changes: Models trained with MMDetection 1.x are not fully compatible with 2.0; please refer to the compatibility doc for the details and how to migrate to the new version.
Improvements
- `train_cfg` and `test_cfg` as class members in all anchor heads. (#2422)
- `lr` an optional argument for optimizers. (#2509)

Bug Fixes
New Features
Highlights
Breaking Changes
- The `__init__` API is changed to be the same as the official DDP.
- The `mask_head` field in HTC config files is modified.

Bug Fixes
- `torch.uint8` in PyTorch 1.4. (#2105)
- `pad_val` is unused in Pad transform. (#2093)

Improvements
New Features
- `worker_init_fn()` in data_loader when seed is set. (#2066, #2111)

This release mainly improves the code quality and adds more docstrings.
Highlights
- `build_conv_layer` and `ConvModule` like the normal conv layer.

Bug Fixes
- `refine_bboxes()`. (#1962)
- `seg_prefix` is a list. (#1906)
- `ga_shape_target_single()`. (#1853)

Improvements
- `SSDHead`. (#1935)
- `SegResizeFlipPadRescale` into different existing transforms. (#1852)
- `init_dist()` to MMCV. (#1851)
- `keep_all_stages` in HTC and Cascade R-CNN. (#1806)

New Features
- `crop_mask` and `rle_mask_encode` for mask heads. (#2013)

The RC1 release mainly focuses on improving the user experience and fixing bugs.
Highlights
Breaking Changes
Bug Fixes
- `keep_ratio=False`. (#1730)
- `build_dataloader`. (#1693)
- `nms_cfg` in HTC. (#1573)
- `gt_bboxes_ignore` is None. (#1498)
- `img_prefix` is None. (#1497)
- `grid_anchors` and `valid_flags`. (#1478)

Improvements
- `Expand` transform. (#1651)
- `--validate` in non-distributed training. (#1624, #1651)
- `in_channels` to backbones. (#1475)
- `no_norm_on_lateral` in FPN. (#1240)
- `.scalar_type()` instead of `.type()` to suppress some warnings. (#1070)

New Features
- `--with_ap` to compute the AP for each class. (#1549)
- The `train_cfg` field in config files is restructured.
- `ConvFCRoIHead` / `SharedFCRoIHead` are renamed to `ConvFCBBoxHead` / `SharedFCBBoxHead` for consistency.