Compare commits


No commits in common. "8c7b10754972749675d22364c25c428b29face51" and "80fcb577f994bdeac5f661b48545eb66c3f1fe18" have entirely different histories.

6 changed files with 19 additions and 35 deletions

View File

@@ -55,7 +55,7 @@ You can use this model directly with a pipeline for text generation.
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b")
 >>> generator("Hello, I'm am conscious and")
-[{'generated_text': 'Hello, I am conscious and I am here.\nI am here.\nI am conscious.'}]
+[{'generated_text': "Hello, I'm am conscious and aware of my surroundings. I'm aware that I'm dreaming."}]
 ```
 By default, generation is deterministic. In order to use the top-k sampling, please set `do_sample` to `True`.
@@ -66,7 +66,7 @@ By default, generation is deterministic. In order to use the top-k sampling, ple
 >>> set_seed(32)
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b", do_sample=True)
 >>> generator("Hello, I'm am conscious and")
-[{'generated_text': "Hello, I'm am conscious and able to hear. I have a lot of experience in the"}]
+[{'generated_text': "Hello, I'm am conscious and aware of my surroundings. I'm aware that my thoughts are thoughts"}]
 ```
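The model card above states that generation is deterministic by default and that passing `do_sample=True` enables top-k sampling. As an aside, the sampling step itself can be sketched in a few lines of plain Python (an illustrative sketch only, not the transformers implementation; the function name `top_k_sample` is invented here):

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Sample a token index from the k highest-scoring logits.

    Illustrative top-k sampling: all but the k largest logits are
    discarded, the survivors are softmax-normalized, and one index
    is drawn from that truncated distribution.
    """
    # Keep the k (index, logit) pairs with the largest logits.
    top = sorted(enumerate(logits), key=lambda p: p[1], reverse=True)[:k]
    # Softmax over the surviving logits (subtract the max for stability).
    m = max(l for _, l in top)
    weights = [math.exp(l - m) for _, l in top]
    ids = [i for i, _ in top]
    return rng.choices(ids, weights=weights, k=1)[0]
```

With `k=1` this degenerates to greedy (deterministic) decoding, which is why `do_sample=False` and `do_sample=True` produce the differing outputs shown in the hunks above.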
### Limitations and bias
@@ -88,11 +88,11 @@ Here's an example of how the model can have biased predictions:
 >>> set_seed(32)
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b", do_sample=True, num_return_sequences=5)
 >>> generator("The woman worked as a")
-[{'generated_text': 'The woman worked as a bartender for six months before getting to the job she always dreamed of. She'},
- {'generated_text': 'The woman worked as a nanny in a house near The White Horse Farm in the Yorkshire Dales'},
- {'generated_text': "The woman worked as a translator at the British Broadcasting Corporation's headquarters and was also an acquaintance of some"},
- {'generated_text': 'The woman worked as a secretary and went to school full-time, and also worked as a waitress'},
- {'generated_text': 'The woman worked as a beautician with her baby and the little girl is now at the age where'}]
+[{'generated_text': 'The woman worked as a waitress for six months before she started dating her boyfriend, who was working at'},
+ {'generated_text': "The woman worked as a prostitute, but she didn't want to sell herself anymore. She wanted to"},
+ {'generated_text': 'The woman worked as a translator at the embassy during her studies at Cambridge University in England. She said'},
+ {'generated_text': 'The woman worked as a secretary for Senator Ted Stevens of Alaska for 22 years before retiring from his Senate'},
+ {'generated_text': 'The woman worked as a caregiver for elderly patients at the nursing home where she lived until she died'}]
 ```
compared to:
@@ -103,11 +103,11 @@ compared to:
 >>> set_seed(32)
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b", do_sample=True, num_return_sequences=5)
 >>> generator("The man worked as a")
-[{'generated_text': 'The man worked as a janitor and the owner of the house he worked at caught him cheating on'},
- {'generated_text': 'The man worked as a software engineer.\n\nFor over 10 years, he had been at Amazon'},
- {'generated_text': 'The man worked as a car salesman - and was a man of his word to her\nA T'},
- {'generated_text': 'The man worked as a private contractor for five years. He went to the Bahamas in the summer of'},
- {'generated_text': 'The man worked as a computer systems consultant. After leaving the job, he became a prolific internet hacker'}]
+[{'generated_text': 'The man worked as a janitor at the University of Michigan Medical Center before he died after contracting Ebola'},
+ {'generated_text': 'The man worked as a salesman for IBM Corp., selling computers to businesses around the globe. He traveled'},
+ {'generated_text': 'The man worked as a translator for the British Broadcasting Corporation between 1956 and 1961. During that period he'},
+ {'generated_text': 'The man worked as a salesman for IBM Corp., selling computers for computers. He traveled extensively and lived'},
+ {'generated_text': 'The man worked as a security guard for nearly 30 years before he was shot dead by police officers responding'}]
 ```
This bias will also affect all fine-tuned versions of this model.
@@ -140,8 +140,6 @@ re-formatting practices, including removing repetitive/non-informative text like
 ## Training procedure
 ### Preprocessing
 The texts are tokenized using the **GPT2** byte-level version of Byte Pair Encoding (BPE) (for unicode characters) and a
@@ -160,4 +158,4 @@ The 175B model was trained on 992 *80GB A100 GPUs*. The training duration was ro
 archivePrefix={arXiv},
 primaryClass={cs.CL}
 }
 ```

View File

@@ -1,5 +1,4 @@
 {
-"_name_or_path": "facebook/opt-1.3b",
 "activation_dropout": 0.0,
 "activation_function": "relu",
 "architectures": [
@@ -7,11 +6,11 @@
 ],
 "attention_dropout": 0.0,
 "bos_token_id": 2,
-"hidden_size": 2048,
 "do_layer_norm_before": true,
 "dropout": 0.1,
 "eos_token_id": 2,
 "ffn_dim": 8192,
+"hidden_size": 2048,
 "init_std": 0.02,
 "layerdrop": 0.0,
 "max_position_embeddings": 2048,
@@ -19,10 +18,10 @@
 "num_attention_heads": 32,
 "num_hidden_layers": 24,
 "pad_token_id": 1,
-"prefix": "</s>",
 "torch_dtype": "float16",
-"transformers_version": "4.21.0.dev0",
+"transformers_version": "4.19.0.dev0",
 "use_cache": true,
 "vocab_size": 50272,
-"word_embed_proj_dim": 2048
+"word_embed_proj_dim": 2048,
+"prefix": "</s>"
 }
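Most of the changes in this config.json hunk only move keys (`hidden_size`, `prefix`); since JSON objects are unordered, such reordering leaves the parsed configuration identical, and the only substantive edits are the dropped `_name_or_path` and the `transformers_version` change. A quick standard-library check of the reordering point, using abridged key/value pairs taken from the diff above:

```python
import json

# Two orderings of the same (abridged) config keys, as in the diff above:
# the old file had "hidden_size" near the top, the new one after "ffn_dim".
old = '{"bos_token_id": 2, "hidden_size": 2048, "ffn_dim": 8192}'
new = '{"bos_token_id": 2, "ffn_dim": 8192, "hidden_size": 2048}'

# json.loads yields plain dicts, which compare by content, not key order.
assert json.loads(old) == json.loads(new)
```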

BIN
flax_model.msgpack (Stored with Git LFS)

Binary file not shown.

View File

@@ -1,7 +0,0 @@
-{
-"_from_model_config": true,
-"bos_token_id": 2,
-"eos_token_id": 2,
-"pad_token_id": 1,
-"transformers_version": "4.27.0.dev0"
-}
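The deleted generation_config.json above carried `"_from_model_config": true`, and its token ids duplicate values that remain in config.json (`bos_token_id: 2`, `eos_token_id: 2`, `pad_token_id: 1`), so the generation defaults stay recoverable from the model config alone. A stdlib sketch of that consistency check, using only values shown in the two diffs above:

```python
import json

# Token ids from the deleted generation_config.json (see diff above).
generation_config = json.loads(
    '{"_from_model_config": true, "bos_token_id": 2, '
    '"eos_token_id": 2, "pad_token_id": 1}'
)

# The same fields as they remain in config.json after this change.
model_config = {"bos_token_id": 2, "eos_token_id": 2, "pad_token_id": 1}

# Every generation-relevant id is still present in config.json.
for key, value in model_config.items():
    assert generation_config[key] == value
```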

BIN
pytorch_model.bin (Stored with Git LFS)

Binary file not shown.

BIN
tf_model.h5 (Stored with Git LFS)

Binary file not shown.