Update README.md
parent 9b6b447418
commit 8036be3f78
@@ -18,8 +18,7 @@ Donut consists of a vision encoder (Swin Transformer) and a text decoder (BART).
## Intended uses & limitations
-You can use the raw model for image classification. See the [model hub](https://huggingface.co/models?search=google/vit) to look for
-fine-tuned versions on a task that interests you.
+This model is meant to be fine-tuned on a downstream task, like document image classification or document parsing. See the [model hub](https://huggingface.co/models?search=donut) to look for fine-tuned versions on a task that interests you.
### How to use