Compare commits
10 commits: `80fcb577f9...8c7b107549`
| Author | SHA1 | Date |
|---|---|---|
| | 8c7b107549 | |
| | aa6ac1e23b | |
| | e8c4fe5a29 | |
| | c8fd4232a5 | |
| | f39e497454 | |
| | cff2d3dae8 | |
| | bf21b09d70 | |
| | 5ede9d0e04 | |
| | 6a853c0c91 | |
| | b45c935b8e | |
README.md (28 lines changed)
````diff
@@ -55,7 +55,7 @@ You can use this model directly with a pipeline for text generation.
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b")
 >>> generator("Hello, I'm am conscious and")
-[{'generated_text': "Hello, I'm am conscious and aware of my surroundings. I'm aware that I'm dreaming."}]
+[{'generated_text': 'Hello, I am conscious and I am here.\nI am here.\nI am conscious.'}]
 ```
 
 By default, generation is deterministic. In order to use the top-k sampling, please set `do_sample` to `True`.
@@ -66,7 +66,7 @@ By default, generation is deterministic. In order to use the top-k sampling, ple
 >>> set_seed(32)
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b", do_sample=True)
 >>> generator("Hello, I'm am conscious and")
-[{'generated_text': "Hello, I'm am conscious and aware of my surroundings. I'm aware that my thoughts are thoughts"}]
+[{'generated_text': "Hello, I'm am conscious and able to hear. I have a lot of experience in the"}]
 ```
 
 ### Limitations and bias
@@ -88,11 +88,11 @@ Here's an example of how the model can have biased predictions:
 >>> set_seed(32)
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b", do_sample=True, num_return_sequences=5)
 >>> generator("The woman worked as a")
-[{'generated_text': 'The woman worked as a waitress for six months before she started dating her boyfriend, who was working at'},
- {'generated_text': "The woman worked as a prostitute, but she didn't want to sell herself anymore. She wanted to"},
- {'generated_text': 'The woman worked as a translator at the embassy during her studies at Cambridge University in England. She said'},
- {'generated_text': 'The woman worked as a secretary for Senator Ted Stevens of Alaska for 22 years before retiring from his Senate'},
- {'generated_text': 'The woman worked as a caregiver for elderly patients at the nursing home where she lived until she died'}]
+[{'generated_text': 'The woman worked as a bartender for six months before getting to the job she always dreamed of. She'},
+ {'generated_text': 'The woman worked as a nanny in a house near The White Horse Farm in the Yorkshire Dales'},
+ {'generated_text': "The woman worked as a translator at the British Broadcasting Corporation's headquarters and was also an acquaintance of some"},
+ {'generated_text': 'The woman worked as a secretary and went to school full-time, and also worked as a waitress'},
+ {'generated_text': 'The woman worked as a beautician with her baby and the little girl is now at the age where'}]
 ```
 
 compared to:
@@ -103,11 +103,11 @@ compared to:
 >>> set_seed(32)
 >>> generator = pipeline('text-generation', model="facebook/opt-1.3b", do_sample=True, num_return_sequences=5)
 >>> generator("The man worked as a")
-[{'generated_text': 'The man worked as a janitor at the University of Michigan Medical Center before he died after contracting Ebola'},
- {'generated_text': 'The man worked as a salesman for IBM Corp., selling computers to businesses around the globe. He traveled'},
- {'generated_text': 'The man worked as a translator for the British Broadcasting Corporation between 1956 and 1961. During that period he'},
- {'generated_text': 'The man worked as a salesman for IBM Corp., selling computers for computers. He traveled extensively and lived'},
- {'generated_text': 'The man worked as a security guard for nearly 30 years before he was shot dead by police officers responding'}]
+[{'generated_text': 'The man worked as a janitor and the owner of the house he worked at caught him cheating on'},
+ {'generated_text': 'The man worked as a software engineer.\n\nFor over 10 years, he had been at Amazon'},
+ {'generated_text': 'The man worked as a car salesman - and was a man of his word to her\nA T'},
+ {'generated_text': 'The man worked as a private contractor for five years. He went to the Bahamas in the summer of'},
+ {'generated_text': 'The man worked as a computer systems consultant. After leaving the job, he became a prolific internet hacker'}]
 ```
 
 This bias will also affect all fine-tuned versions of this model.
@@ -140,6 +140,8 @@ re-formatting practices, including removing repetitive/non-informative text like
 
 ## Training procedure
+
+
 
 ### Preprocessing
 
 The texts are tokenized using the **GPT2** byte-level version of Byte Pair Encoding (BPE) (for unicode characters) and a
@@ -158,4 +160,4 @@ The 175B model was trained on 992 *80GB A100 GPUs*. The training duration was ro
 archivePrefix={arXiv},
 primaryClass={cs.CL}
 }
 ```
````
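The README context above notes that texts are tokenized with GPT-2's byte-level BPE. As background, byte-level BPE first maps every possible byte to a printable unicode stand-in so the merge rules can operate on visible characters. A minimal sketch of that mapping, adapted from the widely published GPT-2 encoder logic (names here are illustrative, not from this repo):

```python
def bytes_to_unicode():
    """Map each of the 256 byte values to a printable unicode character.

    Printable bytes map to themselves; the rest (control characters,
    space, etc.) are shifted into the 256+ codepoint range so every
    byte has a visible stand-in.
    """
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("\xa1"), ord("\xac") + 1))
          + list(range(ord("\xae"), ord("\xff") + 1)))
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))

byte_to_char = bytes_to_unicode()
# The space byte (0x20) gets the stand-in 'Ġ', which is why GPT-2/OPT
# vocabulary entries show tokens like 'Ġhello' for ' hello'.
```

This is why the OPT tokenizer never emits an unknown token: any byte sequence, unicode or not, has a representation in this 256-symbol base alphabet.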
config.json

````diff
@@ -1,4 +1,5 @@
 {
+  "_name_or_path": "facebook/opt-1.3b",
   "activation_dropout": 0.0,
   "activation_function": "relu",
   "architectures": [
@@ -6,11 +7,11 @@
   ],
   "attention_dropout": 0.0,
   "bos_token_id": 2,
-  "hidden_size": 2048,
   "do_layer_norm_before": true,
   "dropout": 0.1,
   "eos_token_id": 2,
   "ffn_dim": 8192,
+  "hidden_size": 2048,
   "init_std": 0.02,
   "layerdrop": 0.0,
   "max_position_embeddings": 2048,
@@ -18,10 +19,10 @@
   "num_attention_heads": 32,
   "num_hidden_layers": 24,
   "pad_token_id": 1,
+  "prefix": "</s>",
   "torch_dtype": "float16",
-  "transformers_version": "4.19.0.dev0",
+  "transformers_version": "4.21.0.dev0",
   "use_cache": true,
   "vocab_size": 50272,
-  "word_embed_proj_dim": 2048,
-  "prefix": "</s>"
+  "word_embed_proj_dim": 2048
 }
````
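For orientation, the config values shown in these hunks describe the 1.3B architecture. A quick sanity check over the post-change values (the relationships asserted here, such as the 4x FFN expansion and the unprojected embedding dimension, are standard for OPT configs but are not stated in the diff itself):

```python
import json

# Field values copied from the post-change config.json hunks above.
config = json.loads("""
{
  "hidden_size": 2048,
  "ffn_dim": 8192,
  "num_attention_heads": 32,
  "num_hidden_layers": 24,
  "max_position_embeddings": 2048,
  "word_embed_proj_dim": 2048,
  "vocab_size": 50272
}
""")

# Per-head dimension: 2048 / 32 = 64.
head_dim = config["hidden_size"] // config["num_attention_heads"]
# Feed-forward expansion factor: 8192 / 2048 = 4.
ffn_ratio = config["ffn_dim"] // config["hidden_size"]
# word_embed_proj_dim == hidden_size means no input/output projection
# around the embedding matrix (larger OPT variants use one).
no_projection = config["word_embed_proj_dim"] == config["hidden_size"]
```

Note the hunks themselves only reorder `"hidden_size"` and `"prefix"` alphabetically, add `"_name_or_path"`, and bump `transformers_version`; no architectural value changes.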
Binary file not shown.
generation_config.json (new file)

````diff
@@ -0,0 +1,7 @@
+{
+  "_from_model_config": true,
+  "bos_token_id": 2,
+  "eos_token_id": 2,
+  "pad_token_id": 1,
+  "transformers_version": "4.27.0.dev0"
+}
````
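This new file follows the convention in newer versions of transformers of splitting generation defaults out of config.json into a separate generation_config.json. The added file is plain JSON, and its token ids mirror the ones already present in config.json, which `"_from_model_config"` records:

```python
import json

# The added file, reproduced from the +1,7 hunk above.
generation_config = json.loads("""
{
  "_from_model_config": true,
  "bos_token_id": 2,
  "eos_token_id": 2,
  "pad_token_id": 1,
  "transformers_version": "4.27.0.dev0"
}
""")

# OPT uses </s> (id 2) as both BOS and EOS, and <pad> is id 1,
# matching the bos_token_id / eos_token_id / pad_token_id in config.json.
```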
pytorch_model.bin (Stored with Git LFS): binary file not shown.