facebook/bart-large is a repository forked from huggingface. License: apache-2.0
patil-suraj 76041a4d55 add flax model 2021-06-14 07:44:06 +00:00
.gitattributes add flax model 2021-06-14 07:44:06 +00:00
README.md Migrate model card from transformers-repo 2020-12-11 22:39:34 +01:00
config.json add flax model 2021-06-14 07:44:06 +00:00
flax_model.msgpack add flax model 2021-06-14 07:44:06 +00:00
merges.txt Update merges.txt 2020-08-25 05:10:45 +00:00
pytorch_model.bin Update pytorch_model.bin 2020-09-10 15:28:18 +00:00
rust_model.ot Update rust_model.ot 2020-04-25 15:33:01 +00:00
tf_model.h5 Update tf_model.h5 2020-10-15 17:42:18 +00:00
tokenizer.json Move tokenizer.json from roberta-large 2021-03-09 17:02:30 -05:00
tokenizer_config.json Update tokenizer_config.json 2020-08-25 05:10:46 +00:00
vocab.json Update vocab.json 2020-08-25 05:10:46 +00:00

README.md

license: mit
thumbnail: https://huggingface.co/front/thumbnails/facebook.png

The BART model was proposed by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer on 29 October 2019. According to the abstract,

BART uses a standard seq2seq/machine translation architecture with a bidirectional encoder (like BERT) and a left-to-right decoder (like GPT).

The pretraining task involves randomly shuffling the order of the original sentences and a novel in-filling scheme, where spans of text are replaced with a single mask token.

BART is particularly effective when fine-tuned for text generation but also works well for comprehension tasks. It matches the performance of RoBERTa with comparable training resources on GLUE and SQuAD, and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains of up to 6 ROUGE.

The authors' code can be found here: https://github.com/pytorch/fairseq/tree/master/examples/bart