Update README.md
parent 1172dffaf8
commit b41a392439
@@ -23,7 +23,7 @@ GPT-Neo 2.7B was trained on the Pile, a large scale curated dataset created by E
 
 ## Training procedure
 
-This model was trained for 400,000 steps on the Pile. It was trained as a masked autoregressive language model, using cross-entropy loss.
+This model was trained for 420 billion tokens over 400,000 steps. It was trained as a masked autoregressive language model, using cross-entropy loss.
 
 ## Intended Use and Limitations
 
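A quick sanity check on the corrected figure: 420 billion tokens over 400,000 steps works out to about 1.05 million tokens per optimizer step. A minimal sketch of that arithmetic, assuming GPT-Neo's published 2048-token context length (the implied sequences-per-step count is derived here, not stated in the README):

```python
# Sanity check for the corrected training totals in this diff.
# The totals (420B tokens, 400,000 steps) come from the README change;
# the 2048-token context length is GPT-Neo's published setting.
total_tokens = 420_000_000_000
total_steps = 400_000
context_length = 2048

tokens_per_step = total_tokens / total_steps
print(f"{tokens_per_step:,.0f} tokens per step")  # 1,050,000

# Implied batch size if every sequence is packed to the full context.
print(f"~{tokens_per_step / context_length:.0f} sequences per step")  # ~513
```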