Update README.md

Niels Rogge 2022-05-02 12:56:22 +00:00 committed by huggingface-web
parent 471e3ab513
commit 64cbfbc5d5
1 changed file with 1 addition and 1 deletion

@@ -21,7 +21,7 @@ Disclaimer: The team releasing YOLOS did not write a model card for this model s
## Model description
-YOLOS is a Vision Transformer (ViT) trained using the DETR loss. Despite its simplicity, the model is able to achieve 42 AP on COCO validation 2017.
+YOLOS is a Vision Transformer (ViT) trained using the DETR loss. Despite its simplicity, a base-sized YOLOS model is able to achieve 42 AP on COCO validation 2017 (similar to DETR and more complex frameworks such as Faster R-CNN).
The model is trained using a "bipartite matching loss": the predicted classes and bounding boxes of each of the N = 100 object queries are compared to the ground truth annotations, padded up to the same length N (so if an image contains only 4 objects, the remaining 96 annotations have "no object" as class and "no bounding box" as bounding box). The Hungarian matching algorithm is used to create an optimal one-to-one mapping between the N queries and the N annotations. Next, standard cross-entropy (for the classes) and a linear combination of the L1 and generalized IoU losses (for the bounding boxes) are used to optimize the parameters of the model.
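
The padding and one-to-one matching step described above can be illustrated with a small sketch. This is not the YOLOS or DETR implementation. It assumes a toy setup with N = 3 queries instead of 100, finds the optimal assignment by brute force over permutations rather than with the Hungarian algorithm, and uses a simplified matching cost (negative class probability plus L1 box distance, with the generalized IoU term left out). All names (`NO_OBJECT`, `match_cost`, `best_matching`) are hypothetical and chosen for illustration only.

```python
from itertools import permutations

# Hypothetical marker for the padded "no object" annotations.
NO_OBJECT = None


def l1_box_cost(pred_box, gt_box):
    """L1 distance between two boxes given as (cx, cy, w, h)."""
    return sum(abs(p - g) for p, g in zip(pred_box, gt_box))


def match_cost(pred, target):
    """Simplified matching cost (assumption: the GIoU term is omitted).

    pred   -- (class_probs: dict, box: tuple)
    target -- (label, box), or NO_OBJECT for a padded annotation
    """
    if target is NO_OBJECT:
        return 0.0  # padded annotations contribute a constant cost
    label, gt_box = target
    class_probs, pred_box = pred
    return -class_probs.get(label, 0.0) + l1_box_cost(pred_box, gt_box)


def best_matching(preds, targets):
    """Return ({query index: padded target index}, total cost).

    Brute force over all permutations: only feasible for tiny N.
    The Hungarian algorithm finds the same optimum in polynomial time.
    """
    n = len(preds)
    # Pad the annotations up to the number of queries.
    padded = list(targets) + [NO_OBJECT] * (n - len(targets))
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        # perm[t] is the query assigned to padded target t.
        cost = sum(match_cost(preds[q], padded[t]) for t, q in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return {q: t for t, q in enumerate(best_perm)}, best_cost


# Toy example: 3 queries, 2 ground-truth objects, so 1 padded "no object".
preds = [
    ({"cat": 0.9, "dog": 0.1}, (0.5, 0.5, 0.2, 0.2)),
    ({"cat": 0.2, "dog": 0.7}, (0.1, 0.1, 0.1, 0.1)),
    ({"cat": 0.1, "dog": 0.1}, (0.9, 0.9, 0.1, 0.1)),
]
targets = [
    ("dog", (0.1, 0.1, 0.1, 0.1)),
    ("cat", (0.5, 0.5, 0.2, 0.2)),
]

assignment, total_cost = best_matching(preds, targets)
print(assignment, total_cost)
```

In this toy case, query 0 is matched to the cat annotation, query 1 to the dog annotation, and query 2 to the padded "no object" slot. Once the matching is fixed, the training loss (cross-entropy plus the box losses) is computed on the matched pairs.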