Compare commits


10 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Joao Gante | a0d0ef7fe2 | Adding generation config file(s) | 2023-01-24 16:29:06 +00:00 |
| Lysandre | c468b2376f | Update tokenizer_config.json | 2021-09-21 19:45:52 +00:00 |
| Jeff Boudier | d20330e908 | Add conversational tag to README.md | 2020-12-04 01:45:43 +00:00 |
| Thomas Wolf | b57f64c66c | Update tokenizer configuration | 2020-11-25 12:57:33 +01:00 |
| system | 8218a29866 | Update README.md | 2020-11-01 15:20:52 +00:00 |
| system | 4552a075dc | Update README.md | 2020-11-01 15:20:42 +00:00 |
| system | 567aa82429 | Update README.md | 2020-10-29 21:48:10 +00:00 |
| system | 0b5aa125dc | Update config.json | 2020-09-28 04:25:44 +00:00 |
| system | 1ca1d53c55 | Update pytorch_model.bin | 2020-09-28 04:24:34 +00:00 |
| system | 9fbc81fa3f | Update README.md | 2020-09-25 18:04:24 +00:00 |
5 changed files with 48 additions and 4 deletions

README.md (new file)

@@ -0,0 +1,26 @@
---
language:
- en
thumbnail:
tags:
- convAI
- conversational
- facebook
license: apache-2.0
datasets:
- blended_skill_talk
metrics:
- perplexity
---
## Model description
+ Paper: [Recipes for building an open-domain chatbot](https://arxiv.org/abs/2004.13637)
+ [Original ParlAI Code](https://parl.ai/projects/recipes/)
### Abstract
Building open-domain chatbots is a challenging area for machine learning research. While prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results, we show that other ingredients are important for a high-performing chatbot. Good conversation requires a number of skills that an expert conversationalist blends in a seamless way: providing engaging talking points and listening to their partners, both asking and answering questions, and displaying knowledge, empathy and personality appropriately, depending on the situation. We show that large scale models can learn these skills when given appropriate training data and choice of generation strategy. We build variants of these recipes with 90M, 2.7B and 9.4B parameter neural models, and make our models and code publicly available. Human evaluations show our best models are superior to existing approaches in multi-turn dialogue in terms of engagingness and humanness measurements. We then discuss the limitations of this work by analyzing failure cases of our models.

config.json

@@ -2,7 +2,7 @@
"activation_dropout": 0.0,
"activation_function": "gelu",
"add_bias_logits": false,
"add_final_layer_norm": false,
"add_final_layer_norm": true,
"architectures": [
"BlenderbotForConditionalGeneration"
],
@@ -42,13 +42,13 @@
   "model_type": "blenderbot",
   "no_repeat_ngram_size": 3,
   "normalize_before": true,
-  "normalize_embedding": true,
+  "normalize_embedding": false,
   "num_beams": 10,
   "num_hidden_layers": 2,
   "pad_token_id": 0,
   "scale_embedding": true,
   "static_position_embeddings": false,
   "unk_token_id": 3,
-  "variant": "prelayernorm",
+  "layernorm_variant": "prelayernorm",
   "vocab_size": 8008
 }

generation_config.json (new file)

@@ -0,0 +1,15 @@
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "decoder_start_token_id": 1,
  "encoder_no_repeat_ngram_size": 3,
  "eos_token_id": 2,
  "forced_eos_token_id": 2,
  "length_penalty": 0.65,
  "max_length": 60,
  "min_length": 20,
  "no_repeat_ngram_size": 3,
  "num_beams": 10,
  "pad_token_id": 0,
  "transformers_version": "4.27.0.dev0"
}
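Since `transformers` v4.26, generation defaults such as beam count and length penalty live in a separate `generation_config.json` rather than in `config.json`; `"_from_model_config": true` indicates the file was derived automatically from the existing model config. A short sketch of reading it directly, again with a hypothetical repo id:

```python
from transformers import GenerationConfig

# Placeholder repo id -- substitute the actual model repository.
gen = GenerationConfig.from_pretrained("your-namespace/blenderbot-repo")

# These defaults are picked up by model.generate() when no overrides are passed.
print(gen.num_beams)       # 10
print(gen.min_length)      # 20
print(gen.max_length)      # 60
print(gen.length_penalty)  # 0.65
```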

pytorch_model.bin (binary, stored with Git LFS)

Binary file not shown.

tokenizer_config.json

@@ -1 +1 @@
{"add_prefix_space": "true"}
{"add_prefix_space": true, "tokenizer_class_name": "BlenderbotTokenizer"}