license: mit
thumbnail: https://huggingface.co/front/thumbnails/facebook.png
pipeline_tag: zero-shot-classification
datasets: multi_nli

bart-large-mnli

This is the checkpoint for bart-large after being fine-tuned on the MultiNLI (MNLI) dataset.

Additional information about this model can be found in the original BART paper (Lewis et al., 2019).

NLI-based Zero Shot Text Classification

Yin et al. proposed a method for using pre-trained NLI models as ready-made zero-shot sequence classifiers. The method works by posing the sequence to be classified as the NLI premise and constructing a hypothesis from each candidate label. For example, if we want to evaluate whether a sequence belongs to the class "politics", we could construct the hypothesis "This text is about politics." The probabilities for entailment and contradiction are then converted to label probabilities.
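
As a minimal sketch of this construction (the premise text, the candidate labels, and the template string below are purely illustrative):

premise = "The parliament passed the new budget after a long debate."  # illustrative text
candidate_labels = ["politics", "economy", "sports"]                   # illustrative labels
# one NLI hypothesis per candidate label
hypotheses = [f"This text is about {label}." for label in candidate_labels]
# each (premise, hypothesis) pair is then scored by the NLI model, and the
# entailment probability is read off as the probability of that label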

This method is surprisingly effective in many cases, particularly when used with larger pre-trained models like BART and RoBERTa. See this blog post for a more expansive introduction to this and other zero-shot methods, and see the code snippets below for examples of using this model for zero-shot classification, both with Hugging Face's built-in pipeline and with native Transformers/PyTorch code.

With the zero-shot classification pipeline

The model can be loaded with the zero-shot-classification pipeline like so:

from transformers import pipeline
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

You can then use this pipeline to classify sequences into any of the class names you specify.

sequence_to_classify = "one day I will see the world"
candidate_labels = ['travel', 'cooking', 'dancing']
classifier(sequence_to_classify, candidate_labels)
#{'labels': ['travel', 'dancing', 'cooking'],
# 'scores': [0.9938651323318481, 0.0032737774308770895, 0.002861034357920289],
# 'sequence': 'one day I will see the world'}

If more than one candidate label can be correct, pass multi_class=True to calculate each class independently (in recent versions of transformers this argument is named multi_label):

candidate_labels = ['travel', 'cooking', 'dancing', 'exploration']
classifier(sequence_to_classify, candidate_labels, multi_class=True)
#{'labels': ['travel', 'exploration', 'dancing', 'cooking'],
# 'scores': [0.9945111274719238,
#  0.9383890628814697,
#  0.0057061901316046715,
#  0.0018193122232332826],
# 'sequence': 'one day I will see the world'}
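
How each candidate label is turned into an NLI hypothesis can be customized with the pipeline's hypothesis_template argument (the default template is "This example is {}."); the alternative template below is only an illustration:

# use a custom template for building the NLI hypothesis from each label
classifier(sequence_to_classify,
           candidate_labels,
           hypothesis_template="This text is about {}.")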

With manual PyTorch

# pose sequence as an NLI premise and label as a hypothesis
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

device = 'cuda' if torch.cuda.is_available() else 'cpu'
nli_model = AutoModelForSequenceClassification.from_pretrained('facebook/bart-large-mnli').to(device)
tokenizer = AutoTokenizer.from_pretrained('facebook/bart-large-mnli')

sequence = 'one day I will see the world'  # text to classify
label = 'travel'                           # candidate label
premise = sequence
hypothesis = f'This example is {label}.'

# run through model pre-trained on MNLI
x = tokenizer.encode(premise, hypothesis, return_tensors='pt',
                     truncation='only_first')
logits = nli_model(x.to(device))[0]

# we throw away "neutral" (dim 1) and take the probability of
# "entailment" (2) as the probability of the label being true
entail_contradiction_logits = logits[:, [0, 2]]
probs = entail_contradiction_logits.softmax(dim=1)
prob_label_is_true = probs[:, 1]
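
To score several candidate labels at once with this manual approach, the same premise can be paired with one hypothesis per label and the pairs batched through the model. The following is a minimal sketch under that assumption, reusing nli_model, tokenizer, and device from above; the premise and labels are the illustrative ones used earlier:

import torch

premise = 'one day I will see the world'
labels = ['travel', 'cooking', 'dancing']

# batch one (premise, hypothesis) pair per candidate label
inputs = tokenizer([premise] * len(labels),
                   [f'This example is {label}.' for label in labels],
                   return_tensors='pt', padding=True, truncation='only_first')

with torch.no_grad():
    logits = nli_model(**{k: v.to(device) for k, v in inputs.items()}).logits

# keep contradiction (0) and entailment (2), softmax, take the entailment column
label_probs = logits[:, [0, 2]].softmax(dim=1)[:, 1]
for label, p in zip(labels, label_probs.tolist()):
    print(f'{label}: {p:.3f}')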