Model card for convnextv2_huge.fcmae
A ConvNeXt-V2 self-supervised feature representation model, pretrained with a fully convolutional masked autoencoder framework (FCMAE). This model has no pretrained head and is only useful for fine-tuning or feature extraction. Explore the dataset and runtime metrics of this model in timm model results. All timing numbers from eager model PyTorch 1.13 on RTX 3090 w/ AMP.
Model Details
Model Usage
Image Classification
from urllib.request import urlopen
from PIL import Image
import timm
import torch

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model('convnextv2_huge.fcmae', pretrained=True)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
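The last line keeps the five highest-scoring classes. For readers unfamiliar with `torch.topk`, the same computation can be sketched in plain Python; the logits below are hypothetical, for illustration only:

```python
import math

def softmax_top5(logits, k=5):
    # Numerically stable softmax scaled to percentages, then the top-k
    # (index, probability) pairs -- mirrors
    # `torch.topk(output.softmax(dim=1) * 100, k=5)` for a single image.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [100.0 * e / total for e in exps]
    top = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    return [(i, probs[i]) for i in top]

# Hypothetical 8-class logits standing in for one (1, num_classes) output row.
print(softmax_top5([2.0, 0.5, 3.0, -1.0, 0.0, 1.5, 2.5, 0.2]))
```

Note that since `convnextv2_huge.fcmae` ships without a pretrained head, these class scores are only meaningful after fine-tuning.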
Feature Map Extraction
from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'convnextv2_huge.fcmae',
    pretrained=True,
    features_only=True,
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # unsqueeze single image into batch of 1

for o in output:
    # print shape of each feature map in output
    # e.g.:
    #  torch.Size([1, 352, 56, 56])
    #  torch.Size([1, 704, 28, 28])
    #  torch.Size([1, 1408, 14, 14])
    #  torch.Size([1, 2816, 7, 7])
    print(o.shape)
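The four shapes printed above follow a regular pattern: a stride-4 stem followed by three stride-2 downsamples, with channels doubling each stage from 352 up to 2816. A minimal sketch of that relationship (the helper name and the stem width of 352 are taken from the output above; this is not a timm API):

```python
def convnext_feature_shapes(img_size=224, stem_channels=352, num_stages=4):
    # Stage s has stride 4 * 2**s and stem_channels * 2**s channels, so the
    # spatial size halves while the channel count doubles, stage by stage.
    shapes = []
    for s in range(num_stages):
        stride = 4 * (2 ** s)                # 4, 8, 16, 32
        channels = stem_channels * (2 ** s)  # 352, 704, 1408, 2816
        shapes.append((1, channels, img_size // stride, img_size // stride))
    return shapes

print(convnext_feature_shapes())
```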
Image Embeddings
from urllib.request import urlopen
from PIL import Image
import timm

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

model = timm.create_model(
    'convnextv2_huge.fcmae',
    pretrained=True,
    num_classes=0,  # remove classifier nn.Linear
)
model = model.eval()

# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0))  # output is (batch_size, num_features) shaped tensor

# or equivalently (without needing to set num_classes=0)
output = model.forward_features(transforms(img).unsqueeze(0))
# output is unpooled, a (1, 2816, 7, 7) shaped tensor

output = model.forward_head(output, pre_logits=True)
# output is a (1, num_features) shaped tensor
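Pooled embeddings like the (1, num_features) tensor above are typically compared with cosine similarity, e.g. for image retrieval. A plain-Python sketch; the short vectors are hypothetical stand-ins for real 2816-dimensional embeddings:

```python
import math

def cosine_similarity(a, b):
    # Dot product of the two vectors divided by the product of their norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Identical vectors score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```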
Model Comparison

| model | top1 | top5 | img_size | param_count | gmacs | macts | samples_per_sec | batch_size |
|---|---|---|---|---|---|---|---|---|
| convnextv2_huge.fcmae_ft_in22k_in1k_512 | 88.848 | 98.742 | 512 | 660.29 | 600.81 | 413.07 | 28.58 | 48 |
| convnextv2_huge.fcmae_ft_in22k_in1k_384 | 88.668 | 98.738 | 384 | 660.29 | 337.96 | 232.35 | 50.56 | 64 |
| convnext_xxlarge.clip_laion2b_soup_ft_in1k | 88.612 | 98.704 | 256 | 846.47 | 198.09 | 124.45 | 122.45 | 256 |
| convnext_large_mlp.clip_laion2b_soup_ft_in12k_in1k_384 | 88.312 | 98.578 | 384 | 200.13 | 101.11 | 126.74 | 196.84 | 256 |
| convnextv2_large.fcmae_ft_in22k_in1k_384 | 88.196 | 98.532 | 384 | 197.96 | 101.1 | 126.74 | 128.94 | 128 |
| convnext_large_mlp.clip_laion2b_soup_ft_in12k_in1k_320 | 87.968 | 98.47 | 320 | 200.13 | 70.21 | 88.02 | 283.42 | 256 |
| convnext_xlarge.fb_in22k_ft_in1k_384 | 87.75 | 98.556 | 384 | 350.2 | 179.2 | 168.99 | 124.85 | 192 |
| convnextv2_base.fcmae_ft_in22k_in1k_384 | 87.646 | 98.422 | 384 | 88.72 | 45.21 | 84.49 | 209.51 | 256 |
| convnext_large.fb_in22k_ft_in1k_384 | 87.476 | 98.382 | 384 | 197.77 | 101.1 | 126.74 | 194.66 | 256 |
| convnext_large_mlp.clip_laion2b_augreg_ft_in1k | 87.344 | 98.218 | 256 | 200.13 | 44.94 | 56.33 | 438.08 | 256 |
| convnextv2_large.fcmae_ft_in22k_in1k | 87.26 | 98.248 | 224 | 197.96 | 34.4 | 43.13 | 376.84 | 256 |
| convnext_base.clip_laion2b_augreg_ft_in12k_in1k_384 | 87.138 | 98.212 | 384 | 88.59 | 45.21 | 84.49 | 365.47 | 256 |
| convnext_xlarge.fb_in22k_ft_in1k | 87.002 | 98.208 | 224 | 350.2 | 60.98 | 57.5 | 368.01 | 256 |
| convnext_base.fb_in22k_ft_in1k_384 | 86.796 | 98.264 | 384 | 88.59 | 45.21 | 84.49 | 366.54 | 256 |
| convnextv2_base.fcmae_ft_in22k_in1k | 86.74 | 98.022 | 224 | 88.72 | 15.38 | 28.75 | 624.23 | 256 |
| convnext_large.fb_in22k_ft_in1k | 86.636 | 98.028 | 224 | 197.77 | 34.4 | 43.13 | 581.43 | 256 |
| convnext_base.clip_laiona_augreg_ft_in1k_384 | 86.504 | 97.97 | 384 | 88.59 | 45.21 | 84.49 | 368.14 | 256 |
| convnext_base.clip_laion2b_augreg_ft_in12k_in1k | 86.344 | 97.97 | 256 | 88.59 | 20.09 | 37.55 | 816.14 | 256 |
| convnextv2_huge.fcmae_ft_in1k | 86.256 | 97.75 | 224 | 660.29 | 115.0 | 79.07 | 154.72 | 256 |
| convnext_small.in12k_ft_in1k_384 | 86.182 | 97.92 | 384 | 50.22 | 25.58 | 63.37 | 516.19 | 256 |
| convnext_base.clip_laion2b_augreg_ft_in1k | 86.154 | 97.68 | 256 | 88.59 | 20.09 | 37.55 | 819.86 | 256 |
| convnext_base.fb_in22k_ft_in1k | 85.822 | 97.866 | 224 | 88.59 | 15.38 | 28.75 | 1037.66 | 256 |
| convnext_small.fb_in22k_ft_in1k_384 | 85.778 | 97.886 | 384 | 50.22 | 25.58 | 63.37 | 518.95 | 256 |
| convnextv2_large.fcmae_ft_in1k | 85.742 | 97.584 | 224 | 197.96 | 34.4 | 43.13 | 375.23 | 256 |
| convnext_small.in12k_ft_in1k | 85.174 | 97.506 | 224 | 50.22 | 8.71 | 21.56 | 1474.31 | 256 |
| convnext_tiny.in12k_ft_in1k_384 | 85.118 | 97.608 | 384 | 28.59 | 13.14 | 39.48 | 856.76 | 256 |
| convnextv2_tiny.fcmae_ft_in22k_in1k_384 | 85.112 | 97.63 | 384 | 28.64 | 13.14 | 39.48 | 491.32 | 256 |
| convnextv2_base.fcmae_ft_in1k | 84.874 | 97.09 | 224 | 88.72 | 15.38 | 28.75 | 625.33 | 256 |
| convnext_small.fb_in22k_ft_in1k | 84.562 | 97.394 | 224 | 50.22 | 8.71 | 21.56 | 1478.29 | 256 |
| convnext_large.fb_in1k | 84.282 | 96.892 | 224 | 197.77 | 34.4 | 43.13 | 584.28 | 256 |
| convnext_tiny.in12k_ft_in1k | 84.186 | 97.124 | 224 | 28.59 | 4.47 | 13.44 | 2433.7 | 256 |
| convnext_tiny.fb_in22k_ft_in1k_384 | 84.084 | 97.14 | 384 | 28.59 | 13.14 | 39.48 | 862.95 | 256 |
| convnextv2_tiny.fcmae_ft_in22k_in1k | 83.894 | 96.964 | 224 | 28.64 | 4.47 | 13.44 | 1452.72 | 256 |
| convnext_base.fb_in1k | 83.82 | 96.746 | 224 | 88.59 | 15.38 | 28.75 | 1054.0 | 256 |
| convnextv2_nano.fcmae_ft_in22k_in1k_384 | 83.37 | 96.742 | 384 | 15.62 | 7.22 | 24.61 | 801.72 | 256 |
| convnext_small.fb_in1k | 83.142 | 96.434 | 224 | 50.22 | 8.71 | 21.56 | 1464.0 | 256 |
| convnextv2_tiny.fcmae_ft_in1k | 82.92 | 96.284 | 224 | 28.64 | 4.47 | 13.44 | 1425.62 | 256 |
| convnext_tiny.fb_in22k_ft_in1k | 82.898 | 96.616 | 224 | 28.59 | 4.47 | 13.44 | 2480.88 | 256 |
| convnext_nano.in12k_ft_in1k | 82.282 | 96.344 | 224 | 15.59 | 2.46 | 8.37 | 3926.52 | 256 |
| convnext_tiny_hnf.a2h_in1k | 82.216 | 95.852 | 224 | 28.59 | 4.47 | 13.44 | 2529.75 | 256 |
| convnext_tiny.fb_in1k | 82.066 | 95.854 | 224 | 28.59 | 4.47 | 13.44 | 2346.26 | 256 |
| convnextv2_nano.fcmae_ft_in22k_in1k | 82.03 | 96.166 | 224 | 15.62 | 2.46 | 8.37 | 2300.18 | 256 |
| convnextv2_nano.fcmae_ft_in1k | 81.83 | 95.738 | 224 | 15.62 | 2.46 | 8.37 | 2321.48 | 256 |
| convnext_nano_ols.d1h_in1k | 80.866 | 95.246 | 224 | 15.65 | 2.65 | 9.38 | 3523.85 | 256 |
| convnext_nano.d1h_in1k | 80.768 | 95.334 | 224 | 15.59 | 2.46 | 8.37 | 3915.58 | 256 |
| convnextv2_pico.fcmae_ft_in1k | 80.304 | 95.072 | 224 | 9.07 | 1.37 | 6.1 | 3274.57 | 256 |
| convnext_pico.d1_in1k | 79.526 | 94.558 | 224 | 9.05 | 1.37 | 6.1 | 5686.88 | 256 |
| convnext_pico_ols.d1_in1k | 79.522 | 94.692 | 224 | 9.06 | 1.43 | 6.5 | 5422.46 | 256 |
| convnextv2_femto.fcmae_ft_in1k | 78.488 | 93.98 | 224 | 5.23 | 0.79 | 4.57 | 4264.2 | 256 |
| convnext_femto_ols.d1_in1k | 77.86 | 93.83 | 224 | 5.23 | 0.82 | 4.87 | 6910.6 | 256 |
| convnext_femto.d1_in1k | 77.454 | 93.68 | 224 | 5.22 | 0.79 | 4.57 | 7189.92 | 256 |
| convnextv2_atto.fcmae_ft_in1k | 76.664 | 93.044 | 224 | 3.71 | 0.55 | 3.81 | 4728.91 | 256 |
| convnext_atto_ols.a2_in1k | 75.88 | 92.846 | 224 | 3.7 | 0.58 | 4.11 | 7963.16 | 256 |
| convnext_atto.d2_in1k | 75.664 | 92.9 | 224 | 3.7 | 0.55 | 3.81 | 8439.22 | 256 |
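To illustrate reading the comparison numbers, here is a small sketch that ranks a few rows by top-1 accuracy per GMAC (the values are copied from the table above; the row selection is arbitrary):

```python
# (model, top1, gmacs, samples_per_sec) -- values copied from the table above.
rows = [
    ("convnextv2_huge.fcmae_ft_in22k_in1k_512", 88.848, 600.81, 28.58),
    ("convnextv2_large.fcmae_ft_in22k_in1k", 87.26, 34.4, 376.84),
    ("convnextv2_tiny.fcmae_ft_in22k_in1k", 83.894, 4.47, 1452.72),
]

# Rank by top-1 per GMAC: the smaller models trade a few points of accuracy
# for an order of magnitude less compute and far higher throughput.
for name, top1, gmacs, sps in sorted(rows, key=lambda r: r[1] / r[2], reverse=True):
    print(f"{name}: {top1 / gmacs:.2f} top1/GMAC at {sps} samples/sec")
```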
Citation
@article{Woo2023ConvNeXtV2,
  title={ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders},
  author={Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon and Saining Xie},
  year={2023},
  journal={arXiv preprint arXiv:2301.00808},
}
@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}