Compare commits


10 Commits

| Author | SHA1 | Message | Date |
|--------|------|---------|------|
| Joao Gante | 9d5c5fadcc | Adding generation config file(s) | 2023-01-24 16:55:09 +00:00 |
| Patrick von Platen | 8bada3b953 | upload flax model | 2021-05-23 09:11:45 +00:00 |
| Patrick von Platen | bb398736a4 | allow flax | 2021-05-23 09:11:06 +00:00 |
| Lysandre | e84a3e0adc | Create tokenizer_config.json | 2021-02-23 15:17:21 +00:00 |
| Julien Chaumond | adda56196f | Migrate model card from transformers-repo. Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755; original file history: https://github.com/huggingface/transformers/commits/master/model_cards/microsoft/DialoGPT-medium/README.md | 2020-12-11 22:50:26 +01:00 |
| system | 7cb5298862 | Update config.json | 2020-10-14 11:57:08 +00:00 |
| system | 8e2e228516 | Update rust_model.ot | 2020-06-28 11:05:41 +00:00 |
| system | a901f37c2a | Update config.json | 2020-04-24 16:02:37 +00:00 |
| system | b86761883d | Update tf_model.h5 | 2020-04-11 18:00:20 +00:00 |
| system | 451955c798 | Update config.json | 2020-03-23 21:50:29 +00:00 |
9 changed files with 107 additions and 18 deletions

.gitattributes vendored (1 line changed)

@@ -6,3 +6,4 @@
 *.tar.gz filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text

README.md (new file, 54 lines)

@@ -0,0 +1,54 @@
---
thumbnail: https://huggingface.co/front/thumbnails/dialogpt.png
tags:
- conversational
license: mit
---
## A State-of-the-Art Large-scale Pretrained Response Generation Model (DialoGPT)
DialoGPT is a SOTA large-scale pretrained dialogue response generation model for multi-turn conversations.
The [human evaluation results](https://github.com/dreasysnail/Dialogpt_dev#human-evaluation) indicate that the responses generated by DialoGPT are comparable to human responses in quality under a single-turn conversation Turing test.
The model is trained on 147M multi-turn dialogues from Reddit discussion threads.
* Multi-turn generation examples from an interactive environment:
|Role | Response |
|---------|--------|
|User | Does money buy happiness? |
| Bot | Depends how much money you spend on it .|
|User | What is the best way to buy happiness ? |
| Bot | You just have to be a millionaire by your early 20s, then you can be happy . |
|User |This is so difficult ! |
| Bot | You have no idea how hard it is to be a millionaire and happy . There is a reason the rich have a lot of money |
Please find information about preprocessing, training, and full details of DialoGPT in the [original DialoGPT repository](https://github.com/microsoft/DialoGPT).
ArXiv paper: [https://arxiv.org/abs/1911.00536](https://arxiv.org/abs/1911.00536)
### How to use
Now we are ready to try out how the model works as a chatting partner!
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# Let's chat for 5 lines
for step in range(5):
    # encode the new user input, add the eos_token and return a tensor in PyTorch
    new_user_input_ids = tokenizer.encode(input(">> User:") + tokenizer.eos_token, return_tensors='pt')

    # append the new user input tokens to the chat history
    bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids

    # generate a response while limiting the total chat history to 1000 tokens
    chat_history_ids = model.generate(bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id)

    # pretty-print the last output tokens from the bot
    print("DialoGPT: {}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))
```

config.json (modified)

@@ -1,11 +1,12 @@
 {
+  "activation_function": "gelu_new",
   "architectures": [
     "GPT2LMHeadModel"
   ],
+  "attn_pdrop": 0.1,
   "bos_token_id": 50256,
-  "eos_token_ids": [
-    50256
-  ],
+  "embd_pdrop": 0.1,
+  "eos_token_id": 50256,
   "initializer_range": 0.02,
   "layer_norm_epsilon": 1e-05,
   "model_type": "gpt2",
@@ -14,6 +15,16 @@
   "n_head": 16,
   "n_layer": 24,
   "n_positions": 1024,
-  "pad_token_id": 50256,
+  "resid_pdrop": 0.1,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "task_specific_params": {
+    "conversational": {
+      "max_length": 1000
+    }
+  },
   "vocab_size": 50257
 }
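The config.json diff above renames the list-valued `eos_token_ids` to a single `eos_token_id`, adds the standard GPT-2 dropout fields, and records the conversational `max_length` of 1000 under `task_specific_params`. A minimal sketch of reading these fields back with the `transformers` library (an illustration, not part of the repo):

```python
# Illustrative sketch: load the updated config and inspect the fields touched in this diff.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("microsoft/DialoGPT-medium")

print(config.eos_token_id)                                       # 50256 (replaces the old eos_token_ids list)
print(config.attn_pdrop, config.embd_pdrop, config.resid_pdrop)  # 0.1 0.1 0.1
print(config.task_specific_params)                               # {'conversational': {'max_length': 1000}}
```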

flax_model.msgpack (new binary file, stored with Git LFS; contents not shown)
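With flax_model.msgpack in place (and the matching `*.msgpack` LFS rule above), the checkpoint should also be loadable through the Flax classes in `transformers`. A minimal sketch, assuming `transformers` is installed with Flax/JAX support; the prompt is just an example:

```python
# Illustrative sketch: load the Flax weights (flax_model.msgpack) and generate one reply.
from transformers import AutoTokenizer, FlaxAutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = FlaxAutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# Encode a single user turn, terminated by the EOS token as in the model card example.
inputs = tokenizer("Does money buy happiness?" + tokenizer.eos_token, return_tensors="np")
outputs = model.generate(inputs["input_ids"], max_length=100, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs.sequences[0], skip_special_tokens=True))
```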

generation_config.json (new file, 6 lines)

@@ -0,0 +1,6 @@
{
"_from_model_config": true,
"bos_token_id": 50256,
"eos_token_id": 50256,
"transformers_version": "4.27.0.dev0"
}
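In recent versions of `transformers` (v4.26+), `generate()` takes its defaults from generation_config.json when one is present, rather than from config.json. A minimal sketch of inspecting those defaults (illustrative, not from the repo):

```python
# Illustrative sketch: inspect the generation defaults shipped in generation_config.json.
from transformers import AutoModelForCausalLM, GenerationConfig

gen_config = GenerationConfig.from_pretrained("microsoft/DialoGPT-medium")
print(gen_config.bos_token_id, gen_config.eos_token_id)  # 50256 50256

# The same defaults are attached to a loaded model as model.generation_config;
# per-call arguments to generate() (e.g. max_length) override them.
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")
print(model.generation_config)
```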

Second generation config file (new file, 7 lines)

@@ -0,0 +1,7 @@
{
"_from_model_config": true,
"bos_token_id": 50256,
"eos_token_id": 50256,
"max_length": 1000,
"transformers_version": "4.27.0.dev0"
}

rust_model.ot (new binary file, stored with Git LFS; contents not shown)

tf_model.h5 (new binary file, stored with Git LFS; contents not shown)

tokenizer_config.json (new file, 1 line)

@@ -0,0 +1 @@
{"model_max_length": 1024}