@@ -1,8 +1,5 @@
---
license: apache-2.0
tags:
datasets:
- imagenet-21k
---
# Vision-and-Language Transformer (ViLT), fine-tuned on VQAv2