Flickr30k Entities

Results

  • Results of Kosmos-2 on Flickr30K Entities val split.

| Recall@k | all | animals | bodyparts | clothing | instruments | other | people | scene | vehicles |
|---|---|---|---|---|---|---|---|---|---|
| Recall@1 | 0.778355158317744 | 0.9082217973231358 | 0.41404805914972276 | 0.708779443254818 | 0.7677419354838709 | 0.6470767356881851 | 0.8895578874935489 | 0.8062622309197651 | 0.8668639053254438 |
| Recall@5 | 0.7924201482713227 | 0.9158699808795411 | 0.42513863216266173 | 0.7220556745182013 | 0.7870967741935484 | 0.6632155907429963 | 0.9072767933941166 | 0.8075668623613829 | 0.8698224852071006 |
| Recall@10 | 0.7925587196009146 | 0.9158699808795411 | 0.42513863216266173 | 0.7220556745182013 | 0.7870967741935484 | 0.6635200974421437 | 0.9074488216067436 | 0.8075668623613829 | 0.8698224852071006 |

  • Results of Kosmos-2 on Flickr30K Entities test split.

| Recall@k | all | animals | bodyparts | clothing | instruments | other | people | scene | vehicles |
|---|---|---|---|---|---|---|---|---|---|
| Recall@1 | 0.7871693943788413 | 0.916988416988417 | 0.4091778202676864 | 0.7354726799653079 | 0.7654320987654321 | 0.6505631298162419 | 0.8951555869872702 | 0.8221124150710315 | 0.9125 |
| Recall@5 | 0.8011187072715973 | 0.9247104247104247 | 0.4130019120458891 | 0.7437120555073721 | 0.7777777777777778 | 0.6615293420272673 | 0.9192008486562943 | 0.8233477455219271 | 0.9125 |
| Recall@10 | 0.8013949312892756 | 0.9247104247104247 | 0.4130019120458891 | 0.7441457068516912 | 0.7777777777777778 | 0.6615293420272673 | 0.9197312588401697 | 0.8233477455219271 | 0.9125 |
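
For readers unfamiliar with the metric, the sketch below shows how Recall@k is commonly computed for phrase grounding: a phrase counts as a hit if any of its top-k predicted boxes overlaps a ground-truth box with IoU at or above a threshold. This is an illustration only, not the evaluator shipped in this repository. The function names `iou` and `recall_at_k`, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are assumptions based on common convention.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_k(
    predictions: List[List[Box]],
    ground_truths: List[List[Box]],
    k: int,
    iou_threshold: float = 0.5,  # assumed threshold, not taken from this README
) -> float:
    """Fraction of phrases with at least one top-k box matching a target box.

    predictions[i] holds the ranked boxes for phrase i;
    ground_truths[i] holds the target boxes for the same phrase.
    """
    if not predictions:
        return 0.0
    hits = 0
    for ranked, targets in zip(predictions, ground_truths):
        if any(iou(p, t) >= iou_threshold for p in ranked[:k] for t in targets):
            hits += 1
    return hits / len(predictions)


# Toy data: phrase 0 is only matched by its second-ranked box.
preds = [[(0, 0, 10, 10), (20, 20, 30, 30)], [(0, 0, 10, 10)]]
gts = [[(20, 20, 30, 30)], [(0, 0, 10, 12)]]
print(recall_at_k(preds, gts, k=1))  # 0.5
print(recall_at_k(preds, gts, k=5))  # 1.0
```

Recall@k can only stay level or rise as k grows, which matches the pattern in the tables above.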

Data preparation

1. Download images and annotations

2. Convert data format

You can run the following command to convert the data format from MDETR pre-processed annotations:

```bash
python cook_data.py /path/to/mdetr_annotations /path/to/flickr-images
```

Alternatively, you can download the pre-processed files (val split and test split). Remember to replace the image paths in the provided files with the image paths on your machine.
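
One way to update the image paths is a plain text substitution. The sketch below assumes the paths are stored as plain text in the pre-processed file and share a common directory prefix; that has not been checked against the actual file format, so inspect a few lines first. The helper name `replace_image_root` and all paths in the usage comment are hypothetical.

```python
from pathlib import Path


def replace_image_root(src: str, dst: str, old_root: str, new_root: str) -> int:
    """Copy src to dst, replacing every occurrence of old_root with new_root.

    Assumes image paths appear as plain text in the file.
    Returns the number of replacements made.
    """
    text = Path(src).read_text(encoding="utf-8")
    count = text.count(old_root)
    Path(dst).write_text(text.replace(old_root, new_root), encoding="utf-8")
    return count


# Hypothetical usage; substitute the prefix you actually find in the file:
# n = replace_image_root(
#     "/path/to/final_flickr_separateGT_val.json.inline.locout",
#     "/path/to/final_flickr_separateGT_val.local.locout",
#     "/original/image/root",
#     "/path/to/flickr-images",
# )
# print(f"replaced {n} occurrences")
```

Writing to a separate output file keeps the downloaded copy intact, and a return value of 0 signals that the assumed prefix did not match anything.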

Evaluation

You can run the following command to evaluate Kosmos-2 on Flickr30k Entities:

```bash
cd unilm/kosmos-2

# val split
bash evaluation/grd-zeroshot-flickr.sh 0 32 /path/to/kosmos-2.pt /path/to/final_flickr_separateGT_val.json.inline.locout /path/to/final_flickr_separateGT_val.json /path/to/flickr30k_entities

# test split
bash evaluation/grd-zeroshot-flickr.sh 0 32 /path/to/kosmos-2.pt /path/to/final_flickr_separateGT_test.json.inline.locout /path/to/final_flickr_separateGT_test.json /path/to/flickr30k_entities
```

Here, final_flickr_separateGT_val.json is available after downloading and uncompressing the MDETR annotations; the .inline.locout files (for example final_flickr_separateGT_test.json.inline.locout) can be downloaded or generated as described in Data preparation; and /path/to/flickr30k_entities is the directory where you cloned the official Flickr30k annotations.
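
Because the script takes several file paths, it can help to confirm they all exist before starting a run. The helper below is a small convenience sketch and not part of the repository; the name `missing_inputs` and the dictionary labels are hypothetical, and the paths are the same placeholders used in the commands above.

```python
from pathlib import Path
from typing import Dict, List


def missing_inputs(paths: Dict[str, str]) -> List[str]:
    """Return the labels of any paths that do not exist on disk."""
    return [label for label, p in paths.items() if not Path(p).exists()]


# Placeholder paths; replace them with your own before running.
inputs = {
    "checkpoint": "/path/to/kosmos-2.pt",
    "locout": "/path/to/final_flickr_separateGT_val.json.inline.locout",
    "annotations": "/path/to/final_flickr_separateGT_val.json",
    "flickr30k_entities": "/path/to/flickr30k_entities",
}
missing = missing_inputs(inputs)
if missing:
    print("Missing inputs:", ", ".join(missing))
```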

Alternatively, download our provided evaluation results (val_split_ouput and test_split_ouput) and then evaluate:

```bash
python evaluation/flickr/flickr_entities_evaluate.py /path/to/eval_result --annotation_file /path/to/final_flickr_separateGT_val.json --flickr_entities_path /path/to/flickr30k_entities
```

As above, final_flickr_separateGT_val.json comes from the downloaded and uncompressed MDETR annotations, and /path/to/flickr30k_entities is the directory where you cloned the official Flickr30k annotations.