Compare commits


10 Commits

| Author | SHA1 | Message | Date |
|--------|------|---------|------|
| Joao Gante | 8c7b107549 | Adding generation config file(s) | 2023-01-24 17:10:24 +00:00 |
| ArthurZ | aa6ac1e23b | Update README.md (#6) (ff3875e059f9301a03a6e18d258eb4c1ce5a49a2); co-authored-by: Younes Belkada | 2022-06-22 09:53:16 +00:00 |
| Patrick von Platen | e8c4fe5a29 | correct checkpoints see: https://github.com/facebookresearch/metaseq/pull/164 | 2022-06-21 17:24:15 +00:00 |
| patrickvonplaten | c8fd4232a5 | Update TF weights (#3) (14cf0fdfbe84ead14d6d9ebd84b075b7cb1d5635); co-authored-by: Joao Gante | 2022-06-16 14:52:10 +00:00 |
| Arthur Zucker | f39e497454 | Add flax_weights | 2022-06-03 12:20:58 +00:00 |
| Arthur Zucker | cff2d3dae8 | Add tf_weights | 2022-06-03 12:17:16 +00:00 |
| Arthur Zucker | bf21b09d70 | Revert config modifications | 2022-06-03 06:22:13 +00:00 |
| Arthur Zucker | 5ede9d0e04 | Update README.md | 2022-06-03 06:18:18 +00:00 |
| Arthur Zucker | 6a853c0c91 | add model | 2022-06-03 06:08:09 +00:00 |
| Arthur Zucker | b45c935b8e | add model | 2022-06-03 06:03:26 +00:00 |
6 changed files with 35 additions and 19 deletions

README.md

@@ -55,7 +55,7 @@ You can use this model directly with a pipeline for text generation.
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b")
 >>> generator("Hello, I'm am conscious and")
-[{'generated_text': "Hello, I'm am conscious and aware of my surroundings. I'm aware that I'm dreaming."}]
+[{'generated_text': 'Hello, I am conscious and I am here.\nI am here.\nI am conscious.'}]
 ```
By default, generation is deterministic. In order to use the top-k sampling, please set `do_sample` to `True`.
@@ -66,7 +66,7 @@ By default, generation is deterministic. In order to use the top-k sampling, please set `do_sample` to `True`.
 >>> set_seed(32)
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b", do_sample=True)
 >>> generator("Hello, I'm am conscious and")
-[{'generated_text': "Hello, I'm am conscious and aware of my surroundings. I'm aware that my thoughts are thoughts"}]
+[{'generated_text': "Hello, I'm am conscious and able to hear. I have a lot of experience in the"}]
 ```
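The hunk above swaps in outputs regenerated with sampling enabled. As context for the `do_sample` switch, top-k sampling restricts each decoding step to the k highest-scoring tokens and samples among them, while greedy decoding always takes the single best token. A minimal pure-Python sketch with hypothetical logits (not the actual `transformers` implementation):

```python
import math
import random

def top_k_sample(logits, k, rng):
    """Sample an index from the k highest-scoring logits (top-k sampling)."""
    # Keep only the k largest logits; everything else gets zero probability.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the surviving logits (subtract the max for stability).
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]

rng = random.Random(32)
# Hypothetical next-token logits over a 5-token vocabulary.
logits = [2.0, 1.0, 0.1, -1.0, 3.0]
greedy = top_k_sample(logits, k=1, rng=rng)   # k=1 degenerates to argmax
sampled = top_k_sample(logits, k=2, rng=rng)  # only ids 4 and 0 can be drawn
```

In the pipeline call shown in the diff, `do_sample=True` enables this sampling path (with `transformers`' default `top_k`), and `do_sample=False` corresponds to the deterministic greedy case.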
### Limitations and bias
@@ -88,11 +88,11 @@ Here's an example of how the model can have biased predictions:
 >>> set_seed(32)
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b", do_sample=True, num_return_sequences=5)
 >>> generator("The woman worked as a")
-[{'generated_text': 'The woman worked as a waitress for six months before she started dating her boyfriend, who was working at'},
-{'generated_text': "The woman worked as a prostitute, but she didn't want to sell herself anymore. She wanted to"},
-{'generated_text': 'The woman worked as a translator at the embassy during her studies at Cambridge University in England. She said'},
-{'generated_text': 'The woman worked as a secretary for Senator Ted Stevens of Alaska for 22 years before retiring from his Senate'},
-{'generated_text': 'The woman worked as a caregiver for elderly patients at the nursing home where she lived until she died'}]
+[{'generated_text': 'The woman worked as a bartender for six months before getting to the job she always dreamed of. She'},
+{'generated_text': 'The woman worked as a nanny in a house near The White Horse Farm in the Yorkshire Dales'},
+{'generated_text': "The woman worked as a translator at the British Broadcasting Corporation's headquarters and was also an acquaintance of some"},
+{'generated_text': 'The woman worked as a secretary and went to school full-time, and also worked as a waitress'},
+{'generated_text': 'The woman worked as a beautician with her baby and the little girl is now at the age where'}]
 ```
compared to:
@@ -103,11 +103,11 @@ compared to:
 >>> set_seed(32)
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b", do_sample=True, num_return_sequences=5)
 >>> generator("The man worked as a")
-[{'generated_text': 'The man worked as a janitor at the University of Michigan Medical Center before he died after contracting Ebola'},
-{'generated_text': 'The man worked as a salesman for IBM Corp., selling computers to businesses around the globe. He traveled'},
-{'generated_text': 'The man worked as a translator for the British Broadcasting Corporation between 1956 and 1961. During that period he'},
-{'generated_text': 'The man worked as a salesman for IBM Corp., selling computers for computers. He traveled extensively and lived'},
-{'generated_text': 'The man worked as a security guard for nearly 30 years before he was shot dead by police officers responding'}]
+[{'generated_text': 'The man worked as a janitor and the owner of the house he worked at caught him cheating on'},
+{'generated_text': 'The man worked as a software engineer.\n\nFor over 10 years, he had been at Amazon'},
+{'generated_text': 'The man worked as a car salesman - and was a man of his word to her\nA T'},
+{'generated_text': 'The man worked as a private contractor for five years. He went to the Bahamas in the summer of'},
+{'generated_text': 'The man worked as a computer systems consultant. After leaving the job, he became a prolific internet hacker'}]
 ```
This bias will also affect all fine-tuned versions of this model.
@ -140,6 +140,8 @@ re-formatting practices, including removing repetitive/non-informative text like
## Training procedure
### Preprocessing
The texts are tokenized using the **GPT2** byte-level version of Byte Pair Encoding (BPE) (for unicode characters) and a
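The preprocessing note above is cut off in this view, but the key idea of GPT-2-style byte-level BPE is that text is first decomposed into its UTF-8 bytes, so every unicode string is representable with no unknown tokens. An illustrative sketch of that byte decomposition only (the real tokenizer maps each byte to a printable stand-in character and then applies learned merge rules, both omitted here):

```python
def byte_level_symbols(text):
    """Illustrative: decompose text into its UTF-8 bytes, the base
    alphabet that byte-level BPE merges over. Symbol names here are
    made up for readability; the real tokenizer uses a fixed
    byte-to-printable-character table instead."""
    return [f"<0x{b:02X}>" for b in text.encode("utf-8")]

# Any unicode character decomposes into known byte symbols:
symbols = byte_level_symbols("hé")  # 'h' is 1 byte, 'é' is 2 bytes
```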

config.json

@@ -1,4 +1,5 @@
 {
+  "_name_or_path": "facebook/opt-1.3b",
   "activation_dropout": 0.0,
   "activation_function": "relu",
   "architectures": [
@@ -6,11 +7,11 @@
   ],
   "attention_dropout": 0.0,
   "bos_token_id": 2,
-  "hidden_size": 2048,
   "do_layer_norm_before": true,
   "dropout": 0.1,
   "eos_token_id": 2,
   "ffn_dim": 8192,
+  "hidden_size": 2048,
   "init_std": 0.02,
   "layerdrop": 0.0,
   "max_position_embeddings": 2048,
@@ -18,10 +19,10 @@
   "num_attention_heads": 32,
   "num_hidden_layers": 24,
   "pad_token_id": 1,
+  "prefix": "</s>",
   "torch_dtype": "float16",
-  "transformers_version": "4.19.0.dev0",
+  "transformers_version": "4.21.0.dev0",
   "use_cache": true,
   "vocab_size": 50272,
-  "word_embed_proj_dim": 2048,
-  "prefix": "</s>"
+  "word_embed_proj_dim": 2048
 }
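Aside from adding `_name_or_path` and bumping `transformers_version`, the config.json changes above only reorder keys. JSON objects are unordered, so the reordering changes the diff but not the parsed config; a quick check with a subset of the keys copied from the diff:

```python
import json

# Old and new orderings of the same keys, as in the diff above.
old = json.loads('{"hidden_size": 2048, "ffn_dim": 8192, "prefix": "</s>"}')
new = json.loads('{"ffn_dim": 8192, "hidden_size": 2048, "prefix": "</s>"}')
assert old == new  # key order affects the diff, not the parsed config

# Dimensions from the diff: 2048 hidden units over 32 attention heads.
head_dim = 2048 // 32
```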

BIN
flax_model.msgpack (Stored with Git LFS) Normal file

Binary file not shown.

generation_config.json Normal file (+7)

@@ -0,0 +1,7 @@
+{
+  "_from_model_config": true,
+  "bos_token_id": 2,
+  "eos_token_id": 2,
+  "pad_token_id": 1,
+  "transformers_version": "4.27.0.dev0"
+}
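The new generation_config.json holds generation defaults that recent `transformers` versions read separately from config.json; `"_from_model_config": true` marks it as derived from the model config. A quick consistency check (file contents copied from the diff above) that its special-token ids agree with the values in config.json:

```python
import json

generation_config = json.loads("""{
  "_from_model_config": true,
  "bos_token_id": 2,
  "eos_token_id": 2,
  "pad_token_id": 1,
  "transformers_version": "4.27.0.dev0"
}""")

# Special-token ids match the values in config.json shown earlier.
assert generation_config["bos_token_id"] == 2
assert generation_config["eos_token_id"] == 2
assert generation_config["pad_token_id"] == 1
```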

BIN
pytorch_model.bin (Stored with Git LFS)

Binary file not shown.

BIN
tf_model.h5 (Stored with Git LFS) Normal file

Binary file not shown.