diff --git a/README.md b/README.md
index 6383fd4..331f32e 100644
--- a/README.md
+++ b/README.md
@@ -18,6 +18,10 @@ SegFormer model fine-tuned on ADE20k at resolution 640x640. It was introduced in
 
 Disclaimer: The team releasing SegFormer did not write a model card for this model so this model card has been written by the Hugging Face team.
 
+## Model description
+
+SegFormer consists of a hierarchical Transformer encoder and a lightweight all-MLP decode head, a combination that achieves strong results on semantic segmentation benchmarks such as ADE20K and Cityscapes. The hierarchical Transformer is first pre-trained on ImageNet-1k, after which a decode head is added and the full model is fine-tuned on a downstream dataset.
+
 ## Intended uses & limitations
 
 You can use the raw model for semantic segmentation. See the [model hub](https://huggingface.co/models?other=segformer) to look for fine-tuned versions on a task that interests you.