Compare commits


10 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Joao Gante | a0d0ef7fe2 | Adding generation config file(s) | 2023-01-24 16:29:06 +00:00 |
| Lysandre | c468b2376f | Update tokenizer_config.json | 2021-09-21 19:45:52 +00:00 |
| Jeff Boudier | d20330e908 | Add conversational tag to README.md | 2020-12-04 01:45:43 +00:00 |
| Thomas Wolf | b57f64c66c | Update tokenizer configuration | 2020-11-25 12:57:33 +01:00 |
| system | 8218a29866 | Update README.md | 2020-11-01 15:20:52 +00:00 |
| system | 4552a075dc | Update README.md | 2020-11-01 15:20:42 +00:00 |
| system | 567aa82429 | Update README.md | 2020-10-29 21:48:10 +00:00 |
| system | 0b5aa125dc | Update config.json | 2020-09-28 04:25:44 +00:00 |
| system | 1ca1d53c55 | Update pytorch_model.bin | 2020-09-28 04:24:34 +00:00 |
| system | 9fbc81fa3f | Update README.md | 2020-09-25 18:04:24 +00:00 |
5 changed files with 48 additions and 4 deletions

README.md (new file)

@@ -0,0 +1,26 @@
---
language:
- en
thumbnail:
tags:
- convAI
- conversational
- facebook
license: apache-2.0
datasets:
- blended_skill_talk
metrics:
- perplexity
---
## Model description
+ Paper: [Recipes for building an open-domain chatbot](https://arxiv.org/abs/2004.13637)
+ [Original ParlAI Code](https://parl.ai/projects/recipes/)
### Abstract
Building open-domain chatbots is a challenging area for machine learning research. While prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results, we show that other ingredients are important for a high-performing chatbot. Good conversation requires a number of skills that an expert conversationalist blends in a seamless way: providing engaging talking points and listening to their partners, both asking and answering questions, and displaying knowledge, empathy and personality appropriately, depending on the situation. We show that large scale models can learn these skills when given appropriate training data and choice of generation strategy. We build variants of these recipes with 90M, 2.7B and 9.4B parameter neural models, and make our models and code publicly available. Human evaluations show our best models are superior to existing approaches in multi-turn dialogue in terms of engagingness and humanness measurements. We then discuss the limitations of this work by analyzing failure cases of our models.

config.json

@@ -2,7 +2,7 @@
"activation_dropout": 0.0,
"activation_function": "gelu",
"add_bias_logits": false,
"add_final_layer_norm": false,
"add_final_layer_norm": true,
"architectures": [
"BlenderbotForConditionalGeneration"
],
@@ -42,13 +42,13 @@
   "model_type": "blenderbot",
   "no_repeat_ngram_size": 3,
   "normalize_before": true,
-  "normalize_embedding": true,
+  "normalize_embedding": false,
   "num_beams": 10,
   "num_hidden_layers": 2,
   "pad_token_id": 0,
   "scale_embedding": true,
   "static_position_embeddings": false,
   "unk_token_id": 3,
-  "variant": "prelayernorm",
+  "layernorm_variant": "prelayernorm",
   "vocab_size": 8008
 }

generation_config.json (new file)

@@ -0,0 +1,15 @@
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "decoder_start_token_id": 1,
  "encoder_no_repeat_ngram_size": 3,
  "eos_token_id": 2,
  "forced_eos_token_id": 2,
  "length_penalty": 0.65,
  "max_length": 60,
  "min_length": 20,
  "no_repeat_ngram_size": 3,
  "num_beams": 10,
  "pad_token_id": 0,
  "transformers_version": "4.27.0.dev0"
}
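Since `transformers` v4.26, generation defaults such as beam count and length penalty live in a separate `generation_config.json` rather than in `config.json`; `"_from_model_config": true` indicates the file was derived automatically from the existing model config. A short sketch of reading it directly, again with a hypothetical repo id:

```python
from transformers import GenerationConfig

# Placeholder repo id -- substitute the actual model repository.
gen = GenerationConfig.from_pretrained("your-namespace/blenderbot-repo")

# These defaults are picked up by model.generate() when no overrides are passed.
print(gen.num_beams)       # 10
print(gen.min_length)      # 20
print(gen.max_length)      # 60
print(gen.length_penalty)  # 0.65
```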

pytorch_model.bin (binary, stored with Git LFS)

Binary file not shown.

tokenizer_config.json

@@ -1 +1 @@
{"add_prefix_space": "true"}
{"add_prefix_space": true, "tokenizer_class_name": "BlenderbotTokenizer"}