diff --git a/README.md b/README.md
index 74c0641..31b888a 100644
--- a/README.md
+++ b/README.md
@@ -20,7 +20,7 @@ Disclaimer: The team releasing DPT did not write a model card for this model so
 
 ## Model description
 
-DPT uses the Vision Transformer (ViT) as backbone and adds a neck + head on top for monocular depth estimation.
+DPT-Hybrid uses the Vision Transformer Hybrid (ViT-Hybrid) as backbone and adds a neck + head on top for monocular depth estimation.
 
 ![model image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/dpt_architecture.jpg)
 