typo: encoder-encoder -> encoder-decoder (#1)

- typo: encoder-encoder -> encoder-decoder (58509d68aee6f8131fbafbf0d4881c71cbe457d3)

Co-authored-by: Daniel Levenson <dleve123@users.noreply.huggingface.co>
Update config.json
8 changed files with 100 additions and 11 deletions
--- a/.gitattributes
+++ b/.gitattributes
@@ -6,3 +6,4 @@
 *.tar.gz filter=lfs diff=lfs merge=lfs -text
 *.ot filter=lfs diff=lfs merge=lfs -text
 *.onnx filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
--- a/README.md
+++ b/README.md
@@ -0,0 +1,62 @@
+---
+license: apache-2.0
+language: en
+---
+
+# BART (large-sized model) 
+
+BART model pre-trained on the English language. It was introduced in the paper [BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461) by Lewis et al. and first released in [this repository](https://github.com/pytorch/fairseq/tree/master/examples/bart).
+
+Disclaimer: The team releasing BART did not write a model card for this model, so this one has been written by the Hugging Face team.
+
+## Model description
+
+BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. BART is pre-trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text.
+
+BART is particularly effective when fine-tuned for text generation (e.g. summarization, translation) but also works well for comprehension tasks (e.g. text classification, question answering).
+
+## Intended uses & limitations
+
+You can use the raw model for text infilling. However, the model is mostly meant to be fine-tuned on a supervised dataset. See the [model hub](https://huggingface.co/models?search=bart) to look for fine-tuned versions on a task that interests you.
+
+### How to use
+
+Here is how to use this model in PyTorch:
+
+```python
+from transformers import BartTokenizer, BartModel
+
+tokenizer = BartTokenizer.from_pretrained('facebook/bart-large')
+model = BartModel.from_pretrained('facebook/bart-large')
+
+inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
+outputs = model(**inputs)
+
+last_hidden_states = outputs.last_hidden_state
+```
+
+### BibTeX entry and citation info
+
+```bibtex
+@article{DBLP:journals/corr/abs-1910-13461,
+  author    = {Mike Lewis and
+               Yinhan Liu and
+               Naman Goyal and
+               Marjan Ghazvininejad and
+               Abdelrahman Mohamed and
+               Omer Levy and
+               Veselin Stoyanov and
+               Luke Zettlemoyer},
+  title     = {{BART:} Denoising Sequence-to-Sequence Pre-training for Natural Language
+               Generation, Translation, and Comprehension},
+  journal   = {CoRR},
+  volume    = {abs/1910.13461},
+  year      = {2019},
+  url       = {http://arxiv.org/abs/1910.13461},
+  eprinttype = {arXiv},
+  eprint    = {1910.13461},
+  timestamp = {Thu, 31 Oct 2019 14:02:26 +0100},
+  biburl    = {https://dblp.org/rec/journals/corr/abs-1910-13461.bib},
+  bibsource = {dblp computer science bibliography, https://dblp.org}
+}
+```
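The model description in the README above summarizes pre-training as two steps: corrupt the text with a noising function, then learn to reconstruct the original. The sketch below illustrates one such noising function, a toy version of text infilling in which a span of tokens is replaced by a single `<mask>` token. It is not the fairseq implementation: whitespace tokenization, the helper names `infill_noise` and `make_training_pair`, and the hand-picked span positions are all assumptions made for illustration.

```python
MASK = "<mask>"  # mask token string used by the BART tokenizer


def infill_noise(tokens, span_start, span_len):
    """Replace tokens[span_start:span_start + span_len] with one mask token.

    Hypothetical helper: a span of length 0 simply inserts a mask token.
    """
    return tokens[:span_start] + [MASK] + tokens[span_start + span_len:]


def make_training_pair(text, span_start, span_len):
    """Return (corrupted_input, reconstruction_target) for a denoising objective."""
    tokens = text.split()  # assumption: whitespace tokens instead of BPE
    corrupted = infill_noise(tokens, span_start, span_len)
    return " ".join(corrupted), text


if __name__ == "__main__":
    source, target = make_training_pair("Hello, my dog is cute", 1, 2)
    print(source)  # Hello, <mask> is cute
    print(target)  # Hello, my dog is cute
```

The encoder would see the corrupted string and the decoder would be trained to emit the original, which is why the raw checkpoint can be used for text infilling without fine-tuning.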
--- a/config.json
+++ b/config.json
@@ -1,16 +1,15 @@
 {
-  "_num_labels": 3,
-  "activation_dropout": 0.0,
+  "activation_dropout": 0.1,
  "activation_function": "gelu",
+  "add_bias_logits": false,
  "add_final_layer_norm": false,
  "architectures": [
-    "BartModel",
-    "BartForMaskedLM",
-    "BartForSequenceClassification"
+    "BartModel"
  ],
-  "attention_dropout": 0.0,
+  "attention_dropout": 0.1,
  "bos_token_id": 0,
-  "classif_dropout": 0.0,
+  "classif_dropout": 0.1,
+  "classifier_dropout": 0.0,
  "d_model": 1024,
  "decoder_attention_heads": 16,
  "decoder_ffn_dim": 4096,
@@ -18,11 +17,15 @@
  "decoder_layers": 12,
  "decoder_start_token_id": 2,
  "dropout": 0.1,
+  "early_stopping": true,
  "encoder_attention_heads": 16,
  "encoder_ffn_dim": 4096,
  "encoder_layerdrop": 0.0,
  "encoder_layers": 12,
  "eos_token_id": 2,
+  "forced_eos_token_id": 2,
+  "forced_bos_token_id": 0,
+  "gradient_checkpointing": false,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
@@ -37,21 +40,33 @@
  },
  "max_position_embeddings": 1024,
  "model_type": "bart",
+  "no_repeat_ngram_size": 3,
  "normalize_before": false,
+  "num_beams": 4,
  "num_hidden_layers": 12,
-  "output_past": false,
  "pad_token_id": 1,
-  "prefix": " ",
  "scale_embedding": false,
  "task_specific_params": {
    "summarization": {
-      "early_stopping": true,
+      "length_penalty": 1.0,
+      "max_length": 128,
+      "min_length": 12,
+      "num_beams": 4
+    },
+    "summarization_cnn": {
      "length_penalty": 2.0,
      "max_length": 142,
      "min_length": 56,
-      "no_repeat_ngram_size": 3,
      "num_beams": 4
+    },
+    "summarization_xsum": {
+      "length_penalty": 1.0,
+      "max_length": 62,
+      "min_length": 11,
+      "num_beams": 6
    }
  },
+  "transformers_version": "4.7.0.dev0",
+  "use_cache": true,
  "vocab_size": 50265
 }
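The config.json change above moves `early_stopping`, `no_repeat_ngram_size` and `num_beams` to the top level and adds per-dataset entries under `task_specific_params`. One plausible way to read the result is that the top-level values act as defaults and a task entry overrides them. The sketch below shows that overlay with plain dictionaries. The helper name `resolve_generation_params` and the list of keys treated as generation settings are assumptions for illustration, not the Transformers implementation.

```python
# Excerpt of the post-change config.json, limited to generation-related keys.
CONFIG = {
    "early_stopping": True,
    "no_repeat_ngram_size": 3,
    "num_beams": 4,
    "task_specific_params": {
        "summarization": {
            "length_penalty": 1.0, "max_length": 128, "min_length": 12, "num_beams": 4,
        },
        "summarization_cnn": {
            "length_penalty": 2.0, "max_length": 142, "min_length": 56, "num_beams": 4,
        },
        "summarization_xsum": {
            "length_penalty": 1.0, "max_length": 62, "min_length": 11, "num_beams": 6,
        },
    },
}

# Assumption: the keys we treat as generation settings for this sketch.
GENERATION_KEYS = (
    "early_stopping", "length_penalty", "max_length",
    "min_length", "no_repeat_ngram_size", "num_beams",
)


def resolve_generation_params(config, task):
    """Start from top-level defaults, then overlay the chosen task's entry."""
    params = {key: config[key] for key in GENERATION_KEYS if key in config}
    params.update(config.get("task_specific_params", {}).get(task, {}))
    return params


if __name__ == "__main__":
    print(resolve_generation_params(CONFIG, "summarization_xsum"))
```

Under this reading, `summarization_xsum` would raise `num_beams` from 4 to 6 while inheriting `early_stopping` and `no_repeat_ngram_size` from the top level.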
--- a/flax_model.msgpack
+++ b/flax_model.msgpack
--- a/pytorch_model.bin
+++ b/pytorch_model.bin
--- a/tf_model.h5
+++ b/tf_model.h5
--- a/tokenizer.json
+++ b/tokenizer.json
--- a/tokenizer_config.json
+++ b/tokenizer_config.json
@@ -0,0 +1 @@
+{"model_max_length": 1024}
Author	SHA1	Message	Date
patrickvonplaten	cb48c1365b	typo: encoder-encoder -> encoder-decoder (#1) - typo: encoder-encoder -> encoder-decoder (58509d68aee6f8131fbafbf0d4881c71cbe457d3) Co-authored-by: Daniel Levenson <dleve123@users.noreply.huggingface.co>	2022-06-03 10:00:20 +00:00
Patrick von Platen	030bb1bda8	Update config.json	2022-03-09 16:01:15 +00:00
Niels Rogge	9e1698384a	Add model card	2021-09-16 09:55:32 +00:00
patil-suraj	76041a4d55	add flax model	2021-06-14 07:44:06 +00:00
Sylvain Gugger	22fa33834d	Move tokenizer.json from roberta-large	2021-03-09 17:02:30 -05:00
Julien Chaumond	b51e601345	Migrate model card from transformers-repo Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755 Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/facebook/bart-large/README.md	2020-12-11 22:39:34 +01:00
system	f8275ef803	Update config.json	2020-10-26 21:24:10 +00:00
system	6000698fa9	Update tf_model.h5	2020-10-15 17:42:18 +00:00
system	b47d1df81e	Update pytorch_model.bin	2020-09-10 15:28:18 +00:00
system	f15be2839d	Update tokenizer_config.json	2020-08-25 05:10:46 +00:00