#L: Context length          R1: Recall@1          R5: Recall@5          R10: Recall@10
By default, this leaderboard is sorted by R1 score (FIBER, Text to Video) in descending order.
T2V: Text to Video          V2T: Video to Text

# | Model | Organization | Params | #L | Date | FIBER T2V R1 | FIBER T2V R5 | FIBER T2V R10 | FIBER V2T R1 | FIBER V2T R5 | FIBER V2T R10 | FIBER-S T2V R1 | FIBER-S T2V R5 | FIBER-S T2V R10 | FIBER-S V2T R1 | FIBER-S V2T R5 | FIBER-S V2T R10 | FIBER-T T2V R1 | FIBER-T T2V R5 | FIBER-T T2V R10 | FIBER-T V2T R1 | FIBER-T V2T R5 | FIBER-T V2T R10
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
1 | InternVideo2 stage2 | Shanghai AI Lab | 1B | 512 | 2024/04/25 | 72.5 | 93.7 | 97.3 | 69.5 | 94.6 | 97.8 | 72.4 | 94.2 | 97.4 | 62.7 | 90.5 | 95.9 | 46.0 | 80.8 | 91.9 | 46.6 | 82.5 | 92.5
2 | InternVL2 (P) | Shanghai AI Lab | 8B | ∞ | 2024/07/04 | 71.6 | 92.2 | 97.0 | 71.6 | 92.8 | 97.0 | 76.1 | 94.1 | 97.6 | 74.3 | 94.5 | 97.6 | 46.8 | 76.8 | 89.1 | 46.1 | 77.5 | 89.5
3 | MiniCPM-V 2.6 (P) | OpenBMB | 8B | ∞ | 2024/08/06 | 71.0 | 92.2 | 97.0 | 69.3 | 92.8 | 97.1 | 71.7 | 93.6 | 98.0 | 67.6 | 92.3 | 97.7 | 50.5 | 82.9 | 92.1 | 46.1 | 80.9 | 93.3
4 | InternVL2 (A) | Shanghai AI Lab | 8B | ∞ | 2024/07/04 | 68.8 | 90.6 | 95.8 | 62.3 | 87.6 | 93.2 | 71.2 | 92.4 | 96.3 | 66.8 | 89.8 | 94.6 | 42.6 | 76.8 | 87.7 | 41.8 | 74.0 | 86.6
5 | LLaVA NeXT Video (P) | LLaVA NeXT Team | 7B | ∞ | 2024/05/10 | 66.9 | 89.4 | 96.0 | 62.7 | 89.2 | 95.4 | 68.0 | 92.0 | 96.2 | 65.0 | 90.0 | 95.9 | 43.3 | 76.9 | 88.9 | 40.1 | 75.4 | 88.7
6 | LanguageBind | Peking University | 528M | 77 | 2023/10/07 | 64.3 | 91.0 | 96.3 | 59.5 | 88.0 | 95.0 | 64.7 | 90.8 | 96.8 | 61.0 | 87.2 | 94.5 | 39.8 | 77.3 | 90.5 | 42.2 | 77.6 | 91.7
7 | Long-CLIP L/14 | Shanghai AI Lab | 428M | 248 | 2024/03/22 | 62.7 | 88.8 | 95.7 | 60.3 | 88.8 | 94.9 | 65.6 | 90.9 | 96.0 | 61.0 | 88.3 | 94.4 | 33.2 | 68.8 | 81.6 | 34.5 | 71.9 | 86.6
8 | MiniCPM-V 2.6 (A) | OpenBMB | 8B | ∞ | 2024/08/06 | 62.7 | 90.0 | 95.8 | 58.8 | 88.9 | 95.7 | 63.6 | 90.5 | 96.0 | 62.4 | 90.3 | 96.2 | 44.9 | 80.8 | 91.2 | 41.2 | 77.9 | 90.6
9 | Long-CLIP B/16 | Shanghai AI Lab | 150M | 248 | 2024/03/22 | 59.2 | 85.3 | 92.1 | 55.8 | 84.7 | 92.9 | 62.5 | 86.0 | 92.7 | 53.8 | 84.1 | 92.7 | 32.0 | 65.4 | 79.3 | 29.7 | 67.3 | 84.1
10 | LLaVA NeXT Video (A) | LLaVA NeXT Team | 7B | ∞ | 2024/05/10 | 52.2 | 84.3 | 91.3 | 53.7 | 82.9 | 90.5 | 57.0 | 86.1 | 94.1 | 55.0 | 83.5 | 91.9 | 36.1 | 70.6 | 84.7 | 34.7 | 67.5 | 83.1
11 | CLIP L/14 | OpenAI | 428M | 77 | 2021/02/26 | 51.2 | 83.4 | 90.6 | 54.7 | 86.9 | 93.6 | 49.0 | 81.9 | 91.4 | 55.4 | 85.6 | 93.0 | 33.5 | 70.3 | 84.0 | 39.7 | 76.2 | 87.9
12 | CLIP B/16 | OpenAI | 150M | 77 | 2021/02/26 | 45.7 | 79.6 | 89.1 | 48.4 | 82.4 | 90.8 | 45.6 | 79.0 | 89.2 | 47.6 | 80.9 | 90.8 | 30.3 | 65.1 | 79.8 | 35.8 | 71.0 | 85.8
Date indicates the release date of the open-source model.          ∞ indicates a very large context length.
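
For readers unfamiliar with the R1, R5, and R10 columns, the following is a minimal sketch of how Recall@K is commonly computed for retrieval. It is not the leaderboard's evaluation code. It assumes a square similarity matrix in which query `i` has exactly one correct candidate, also at index `i`. The names `recall_at_k`, `transpose`, and the toy matrix `sim` are hypothetical and exist only for illustration.

```python
def recall_at_k(similarity, k):
    """Percentage of queries whose correct candidate ranks in the top k.

    Assumption for illustration: similarity[i][j] scores query i against
    candidate j, and candidate i is the single correct match for query i.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Zero-based rank: how many candidates score strictly higher than
        # the correct one (ties are resolved in favor of the correct match).
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return 100.0 * hits / len(similarity)


def transpose(matrix):
    """Swap queries and candidates, e.g. to go from text-to-video to video-to-text."""
    return [list(column) for column in zip(*matrix)]


# Toy text-to-video similarity matrix: rows are texts, columns are videos.
sim = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.5, 0.6, 0.7],
]

t2v_r1 = recall_at_k(sim, 1)             # text-to-video Recall@1
v2t_r1 = recall_at_k(transpose(sim), 1)  # video-to-text Recall@1
print(f"T2V R1 = {t2v_r1:.1f}, V2T R1 = {v2t_r1:.1f}")
```

Transposing the matrix reuses the same function for the opposite retrieval direction, which is why each benchmark in the table reports two sets of R1, R5, and R10 values.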