fix typos (#6)

- fix typos (c2a5e573587885ce23744cf330ee7c402f0df16f)


Co-authored-by: George Ogden <George-Ogden@users.noreply.huggingface.co>
Sylvain Gugger 2023-03-06 15:14:53 +00:00 committed by system
parent ff46155979
commit bc2764f8af
1 changed file with 3 additions and 3 deletions


@@ -42,7 +42,7 @@ interests you.
Note that this model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked)
to make decisions, such as sequence classification, token classification or question answering. For tasks such as text
-generation you should look at model like GPT2.
+generation you should look at a model like GPT2.
### How to use
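The usage section bordered by this hunk is not shown in the diff; as a point of reference, a minimal fill-mask sketch for a RoBERTa card looks like the following. The `roberta-base` checkpoint name and the example sentence are assumptions, not taken from this commit.

```python
# Hypothetical usage sketch (not part of this commit's diff); assumes the
# card describes the `roberta-base` checkpoint on the Hub.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="roberta-base")
print(unmasker("The goal of life is <mask>."))
# RoBERTa uses <mask> as its mask token (not BERT's [MASK]); the pipeline
# returns the top candidate fills together with their scores.
```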
@@ -166,14 +166,14 @@ The RoBERTa model was pretrained on the reunion of five datasets:
- [Stories](https://arxiv.org/abs/1806.02847) a dataset containing a subset of CommonCrawl data filtered to match the
story-like style of Winograd schemas.
-Together theses datasets weight 160GB of text.
+Together these datasets weigh 160GB of text.
## Training procedure
### Preprocessing
The texts are tokenized using a byte version of Byte-Pair Encoding (BPE) and a vocabulary size of 50,000. The inputs of
-the model take pieces of 512 contiguous token that may span over documents. The beginning of a new document is marked
+the model take pieces of 512 contiguous tokens that may span over documents. The beginning of a new document is marked
with `<s>` and the end of one by `</s>`
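To make the tokenizer description above concrete, here is an illustrative sketch, again assuming the `roberta-base` checkpoint (an assumption, since the diff does not name it); it inspects the roughly 50,000-entry byte-level BPE vocabulary and the `<s>`/`</s>` document markers directly:

```python
# Illustrative sketch (not from the commit); the checkpoint name is assumed.
from transformers import RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
print(tokenizer.vocab_size)  # ~50K byte-level BPE vocabulary
enc = tokenizer("Hello world")
print(tokenizer.convert_ids_to_tokens(enc["input_ids"]))
# ['<s>', 'Hello', 'Gworld', '</s>'] -- sequences open with <s> and close
# with </s>, matching the markers described above ('G' here stands in for
# the byte-level BPE space prefix character).
```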
The details of the masking procedure for each sentence are the following: