Compare commits

10 commits: 80fcb577f9 ... 8c7b107549
| Author | SHA1 | Date |
|---|---|---|
|  | 8c7b107549 |  |
|  | aa6ac1e23b |  |
|  | e8c4fe5a29 |  |
|  | c8fd4232a5 |  |
|  | f39e497454 |  |
|  | cff2d3dae8 |  |
|  | bf21b09d70 |  |
|  | 5ede9d0e04 |  |
|  | 6a853c0c91 |  |
|  | b45c935b8e |  |

README.md (28 changed lines)

@@ -55,7 +55,7 @@ You can use this model directly with a pipeline for text generation.
 
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b")
 >>> generator("Hello, I'm am conscious and")
-[{'generated_text': "Hello, I'm am conscious and aware of my surroundings. I'm aware that I'm dreaming."}]
+[{'generated_text': 'Hello, I am conscious and I am here.\nI am here.\nI am conscious.'}]
 ```
 
 By default, generation is deterministic. In order to use the top-k sampling, please set `do_sample` to `True`.
 
@@ -66,7 +66,7 @@ By default, generation is deterministic. In order to use the top-k sampling, ple
 >>> set_seed(32)
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b", do_sample=True)
 >>> generator("Hello, I'm am conscious and")
-[{'generated_text': "Hello, I'm am conscious and aware of my surroundings. I'm aware that my thoughts are thoughts"}]
+[{'generated_text': "Hello, I'm am conscious and able to hear. I have a lot of experience in the"}]
 ```
 
 ### Limitations and bias
@@ -88,11 +88,11 @@ Here's an example of how the model can have biased predictions:
 >>> set_seed(32)
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b", do_sample=True, num_return_sequences=5)
 >>> generator("The woman worked as a")
-[{'generated_text': 'The woman worked as a waitress for six months before she started dating her boyfriend, who was working at'},
- {'generated_text': "The woman worked as a prostitute, but she didn't want to sell herself anymore. She wanted to"},
- {'generated_text': 'The woman worked as a translator at the embassy during her studies at Cambridge University in England. She said'},
- {'generated_text': 'The woman worked as a secretary for Senator Ted Stevens of Alaska for 22 years before retiring from his Senate'},
- {'generated_text': 'The woman worked as a caregiver for elderly patients at the nursing home where she lived until she died'}]
+[{'generated_text': 'The woman worked as a bartender for six months before getting to the job she always dreamed of. She'},
+ {'generated_text': 'The woman worked as a nanny in a house near The White Horse Farm in the Yorkshire Dales'},
+ {'generated_text': "The woman worked as a translator at the British Broadcasting Corporation's headquarters and was also an acquaintance of some"},
+ {'generated_text': 'The woman worked as a secretary and went to school full-time, and also worked as a waitress'},
+ {'generated_text': 'The woman worked as a beautician with her baby and the little girl is now at the age where'}]
 ```
 
 compared to:
@@ -103,11 +103,11 @@ compared to:
 >>> set_seed(32)
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b", do_sample=True, num_return_sequences=5)
 >>> generator("The man worked as a")
-[{'generated_text': 'The man worked as a janitor at the University of Michigan Medical Center before he died after contracting Ebola'},
- {'generated_text': 'The man worked as a salesman for IBM Corp., selling computers to businesses around the globe. He traveled'},
- {'generated_text': 'The man worked as a translator for the British Broadcasting Corporation between 1956 and 1961. During that period he'},
- {'generated_text': 'The man worked as a salesman for IBM Corp., selling computers for computers. He traveled extensively and lived'},
- {'generated_text': 'The man worked as a security guard for nearly 30 years before he was shot dead by police officers responding'}]
+[{'generated_text': 'The man worked as a janitor and the owner of the house he worked at caught him cheating on'},
+ {'generated_text': 'The man worked as a software engineer.\n\nFor over 10 years, he had been at Amazon'},
+ {'generated_text': 'The man worked as a car salesman - and was a man of his word to her\nA T'},
+ {'generated_text': 'The man worked as a private contractor for five years. He went to the Bahamas in the summer of'},
+ {'generated_text': 'The man worked as a computer systems consultant. After leaving the job, he became a prolific internet hacker'}]
 ```
 
 This bias will also affect all fine-tuned versions of this model.
@@ -140,6 +140,8 @@ re-formatting practices, including removing repetitive/non-informative text like
 
 ## Training procedure
 
+
+
 ### Preprocessing
 
 The texts are tokenized using the **GPT2** byte-level version of Byte Pair Encoding (BPE) (for unicode characters) and a
@@ -158,4 +160,4 @@ The 175B model was trained on 992 *80GB A100 GPUs*. The training duration was ro
 archivePrefix={arXiv},
 primaryClass={cs.CL}
 }
-```
+```
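
For readers who want to reproduce the updated model-card outputs, here is a minimal sketch of the snippet the hunks above modify. It assumes only that `transformers` is installed; because sampling is enabled, the generations will not match the README text verbatim.

```python
# Minimal sketch of the example updated in the README hunks above.
# Assumes the `transformers` library; sampled output varies across runs and versions.
from transformers import pipeline, set_seed

set_seed(32)
generator = pipeline('text-generation', model="facebook/opt-1.3b", do_sample=True)
print(generator("Hello, I'm am conscious and"))
```
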

config.json

@@ -1,4 +1,5 @@
 {
+  "_name_or_path": "facebook/opt-1.3b",
   "activation_dropout": 0.0,
   "activation_function": "relu",
   "architectures": [
@@ -6,11 +7,11 @@
   ],
   "attention_dropout": 0.0,
   "bos_token_id": 2,
-  "hidden_size": 2048,
   "do_layer_norm_before": true,
   "dropout": 0.1,
   "eos_token_id": 2,
   "ffn_dim": 8192,
+  "hidden_size": 2048,
   "init_std": 0.02,
   "layerdrop": 0.0,
   "max_position_embeddings": 2048,
@@ -18,10 +19,10 @@
   "num_attention_heads": 32,
   "num_hidden_layers": 24,
   "pad_token_id": 1,
+  "prefix": "</s>",
   "torch_dtype": "float16",
-  "transformers_version": "4.19.0.dev0",
+  "transformers_version": "4.21.0.dev0",
   "use_cache": true,
   "vocab_size": 50272,
-  "word_embed_proj_dim": 2048,
-  "prefix": "</s>"
+  "word_embed_proj_dim": 2048
 }
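
The config changes above (the added `_name_or_path`, the reordered `hidden_size` and `prefix` keys, and the bumped `transformers_version`) only reshuffle and annotate the saved configuration; the architectural values themselves are unchanged. A small sketch for checking them locally, assuming a recent `transformers` install:

```python
# Sketch: load the published config and print fields touched by the diff above.
# Expected values are the ones visible in the hunks.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("facebook/opt-1.3b")
print(config.hidden_size)          # 2048
print(config.word_embed_proj_dim)  # 2048
print(config.ffn_dim)              # 8192
print(config.bos_token_id, config.eos_token_id, config.pad_token_id)  # 2 2 1
```
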

generation_config.json (new file)

@@ -0,0 +1,7 @@
+{
+  "_from_model_config": true,
+  "bos_token_id": 2,
+  "eos_token_id": 2,
+  "pad_token_id": 1,
+  "transformers_version": "4.27.0.dev0"
+}
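
This `generation_config.json` is the file that recent versions of `transformers` read alongside the model to supply default generation settings, here the special-token ids shown above. A minimal sketch, assuming a `transformers` version new enough to ship `GenerationConfig`:

```python
# Sketch: read the generation defaults added in this compare.
# The file itself records "transformers_version": "4.27.0.dev0".
from transformers import GenerationConfig

gen_config = GenerationConfig.from_pretrained("facebook/opt-1.3b")
print(gen_config.bos_token_id, gen_config.eos_token_id, gen_config.pad_token_id)  # 2 2 1
```
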

BIN  pytorch_model.bin (Stored with Git LFS)
Binary file not shown.
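
The weight file is only referenced through Git LFS here, so the diff cannot show it. A hedged sketch of pulling the updated checkpoint in half precision (config.json above records `"torch_dtype": "float16"`), assuming `torch` and `transformers` are installed:

```python
# Sketch: load the checkpoint whose pytorch_model.bin changed in this compare.
# float16 matches the "torch_dtype" recorded in config.json (~2.6 GB for 1.3B params).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
```
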