
# HParams

Over the years, many timm models have been trained with varying hyper-parameters as the library and models evolved. I don't have a record of every training run, but I have recorded many, and they can serve as a very good starting point.

## Tags

Most timm-trained models have an identifier in their pretrained tag that relates them (roughly) to a family / version of hparams I've used over the years.

| Tag(s) | Description | Optimizer | LR Schedule | Other Notes |
|---|---|---|---|---|
| `a1h` | Based on ResNet Strikes Back A1 recipe | LAMB | Cosine with warmup | Stronger dropout, stochastic depth, and RandAugment than paper A1 recipe |
| `ah` | Based on ResNet Strikes Back A1 recipe | LAMB | Cosine with warmup | No CutMix. Stronger dropout, stochastic depth, and RandAugment than paper A1 recipe |
| `a1`, `a2`, `a3` | ResNet Strikes Back A{1,2,3} recipe | LAMB with BCE loss | Cosine with warmup | |
| `b1`, `b2`, `b1k`, `b2k` | Based on ResNet Strikes Back B recipe (equivalent to timm RA2 recipes) | RMSProp (TF 1.0 behaviour) | Step (exponential decay w/ staircase) with warmup | |
| `c`, `c1`, `c2`, `c3` | Based on ResNet Strikes Back C recipes | SGD (Nesterov) with AGC | Cosine with warmup | |
| `ch` | Based on ResNet Strikes Back C recipes | SGD (Nesterov) with AGC | Cosine with warmup | Stronger dropout, stochastic depth, and RandAugment than paper C1/C2 recipes |
| `d`, `d1`, `d2` | Based on ResNet Strikes Back D recipe | AdamW with BCE loss | Cosine with warmup | |
| `sw` | Based on Swin Transformer train/pretrain recipe (basis of DeiT and ConvNeXt recipes) | AdamW with gradient clipping, EMA | Cosine with warmup | |
| `ra`, `ra2`, `ra3`, `racm`, `raa` | RandAugment recipes. Inspired by EfficientNet RandAugment recipes. Covered by B recipe in ResNet Strikes Back. | RMSProp (TF 1.0 behaviour), EMA | Step (exponential decay w/ staircase) with warmup | |
| `ra4` | RandAugment v4. Inspired by MobileNetV4 hparams. | - | | |
| `am` | AugMix recipe | SGD (Nesterov) with JSD loss | Cosine with warmup | |
| `ram` | AugMix (with RandAugment) recipe | SGD (Nesterov) with JSD loss | Cosine with warmup | |
| `bt` | Bag-of-Tricks recipe | SGD (Nesterov) | Cosine with warmup | |
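As a rough illustration of what these families translate to in code, the sketch below wires up an A-recipe-style combination (LAMB optimizer, cosine schedule with warmup) using timm's optimizer and scheduler factories. The specific values (model, LR, weight decay, epoch counts) are placeholders for illustration, not the recorded hparams of any particular model.

```python
import timm
from timm.optim import create_optimizer_v2
from timm.scheduler import CosineLRScheduler

# Placeholder model; any timm model works here.
model = timm.create_model('resnet50')

# LAMB optimizer, as used by the A-family recipes. The lr and
# weight_decay values are illustrative, not a recorded recipe.
optimizer = create_optimizer_v2(model, opt='lamb', lr=8e-3, weight_decay=0.02)

# Cosine decay with a linear warmup, stepped once per epoch.
num_epochs = 300
scheduler = CosineLRScheduler(
    optimizer,
    t_initial=num_epochs,  # length of the cosine cycle in epochs
    warmup_t=5,            # warmup epochs
    warmup_lr_init=1e-6,
)

for epoch in range(num_epochs):
    # ... one epoch of training here ...
    scheduler.step(epoch + 1)
```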

## Config File Gists

I've collected several of the hparam families in a series of gists. These can be downloaded and used with the `--config hparam.yaml` argument of the timm train script. Some adjustment is almost always required: the learning rate should be scaled to match your effective global batch size.
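For instance, assuming you've saved one of the gist YAMLs as `ra2_recipe.yaml` (a hypothetical filename), a training invocation might look like the sketch below. The `--lr` override illustrates rescaling when your effective global batch size differs from the config's:

```sh
# Filename and dataset path are hypothetical; substitute a YAML downloaded
# from one of the gists below and your own data directory.
# Linear-scaling rule of thumb: lr = config_lr * (your_global_batch / config_global_batch)
python train.py --data-dir /path/to/imagenet \
    --config ra2_recipe.yaml \
    --model resnet50 \
    --lr 0.25
```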

| Tag | Key Model Architectures | Gist Link |
|---|---|---|
| `ra2` | ResNet, EfficientNet, RegNet, NFNet | Link |
| `ra3` | RegNet | Link |
| `ra4` | MobileNetV4 | Link |
| `sw` | ViT, ConvNeXt, CoAtNet, MaxViT | Link |
| `sbb` | ViT | Link |
| | Tiny Test Models | Link |