typeform/distilbert-base-uncased-mnli is a repository forked from huggingface. License: None
Latest commit: Update README.md by David Chu (3a1c032540) on 2021-05-27 09:53:20 +00:00

| File | Last commit message | Date |
|------|---------------------|------|
| .gitattributes | initial commit | 2021-02-13 10:51:22 +00:00 |
| README.md | Update README.md | 2021-05-27 09:53:20 +00:00 |
| config.json | Update config.json | 2021-02-13 18:34:54 +00:00 |
| eval_results_mnli-mm.txt | add model | 2021-02-13 14:58:05 +00:00 |
| eval_results_mnli.txt | add model | 2021-02-13 14:58:05 +00:00 |
| pytorch_model.bin | add model | 2021-02-13 14:58:05 +00:00 |
| special_tokens_map.json | add model | 2021-02-13 14:58:05 +00:00 |
| tokenizer_config.json | add model | 2021-02-13 14:58:05 +00:00 |
| train_results.txt | add model | 2021-02-13 14:58:05 +00:00 |
| trainer_state.json | add model | 2021-02-13 14:58:05 +00:00 |
| training_args.bin | add model | 2021-02-13 14:58:05 +00:00 |
| vocab.txt | add model | 2021-02-13 14:58:05 +00:00 |

README.md

language: en
pipeline_tag: zero-shot-classification
tags: distilbert
datasets: multi_nli
metrics: accuracy

DistilBERT base model (uncased)

This is the uncased DistilBERT model fine-tuned on the Multi-Genre Natural Language Inference (MNLI) dataset for the zero-shot classification task. The model is not case-sensitive: it makes no distinction between "english" and "English".
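
For zero-shot classification, the checkpoint can be loaded through the transformers zero-shot-classification pipeline. The sketch below is only illustrative: the input sentence and candidate labels are made up, not taken from this repository.

```python
from transformers import pipeline

# Load the zero-shot classification pipeline backed by this checkpoint.
classifier = pipeline(
    "zero-shot-classification",
    model="typeform/distilbert-base-uncased-mnli",
)

# The sentence and candidate labels below are only illustrative.
result = classifier(
    "I have a problem with my iphone that needs to be resolved asap!",
    candidate_labels=["urgent", "not urgent", "phone", "computer"],
)
print(result["labels"][0], result["scores"][0])  # highest-scoring label and its score
```

The pipeline turns each candidate label into an MNLI hypothesis and ranks labels by the model's entailment score, so no further fine-tuning is needed for new label sets.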

Training

Training was done on a p3.2xlarge AWS EC2 instance (one NVIDIA Tesla V100 GPU), with the following hyperparameters:

$ python run_glue.py \
    --model_name_or_path distilbert-base-uncased \
    --task_name mnli \
    --do_train \
    --do_eval \
    --max_seq_length 128 \
    --per_device_train_batch_size 16 \
    --learning_rate 2e-5 \
    --num_train_epochs 5 \
    --output_dir /tmp/distilbert-base-uncased_mnli/

Evaluation results

| Task | MNLI | MNLI-mm |
|------|------|---------|
| Accuracy | 82.0 | 82.0 |
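
Because the checkpoint was fine-tuned on MNLI, it can also be used directly as an NLI classifier. A minimal sketch follows; the premise/hypothesis pair is made up for illustration, and the label names are read from the checkpoint's id2label config rather than hard-coded.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "typeform/distilbert-base-uncased-mnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Illustrative premise/hypothesis pair, not taken from this repository.
premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."

# Encode the sentence pair and run a forward pass without tracking gradients.
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)[0]

# Map probabilities back to the class names stored in the checkpoint's config.
for idx, p in enumerate(probs):
    print(model.config.id2label[idx], round(p.item(), 3))
```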