Addition of Rust model

metadata: add license
remove .ipynb
8 changed files with 50096 additions and 0 deletions
--- a/24
+++ b/24
@ -0,0 +1,24 @@
+MIT License
+-----------
+
+Copyright (c) 2020 Suraj Patil (https://huggingface.co/valhalla/longformer-base-4096-finetuned-squadv1)
+Permission is hereby granted, free of charge, to any person
+obtaining a copy of this software and associated documentation
+files (the "Software"), to deal in the Software without
+restriction, including without limitation the rights to use,
+copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the
+Software is furnished to do so, subject to the following
+conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+OTHER DEALINGS IN THE SOFTWARE
--- a/README.md
+++ b/README.md
@ -0,0 +1,60 @@
+---
+datasets:
+- squad_v1
+license: mit
+---
+
+# LONGFORMER-BASE-4096 fine-tuned on SQuAD v1
+This is the longformer-base-4096 model fine-tuned on the SQuAD v1 dataset for the question answering task.
+
+The [Longformer](https://arxiv.org/abs/2004.05150) model was created by Iz Beltagy, Matthew E. Peters, and Arman Cohan of AllenAI. As the paper explains:
+
+> `Longformer` is a BERT-like model for long documents. 
+
+The pre-trained model can handle sequences of up to 4096 tokens.
+
+
+## Model Training
+This model was trained on a Google Colab V100 GPU. You can find the fine-tuning Colab notebook here: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1zEl5D-DdkBKva-DdreVOmN0hrAfzKG1o?usp=sharing).
+
+A few things to keep in mind when training Longformer for the QA task.
+By default, Longformer uses sliding-window local attention on all tokens, but for QA all question tokens should have global attention. For more details, please refer to the paper. The `LongformerForQuestionAnswering` model sets this up automatically, provided that:
+1. The input sequence has three sep tokens, i.e. the sequence is encoded like this:
+   ` <s> question</s></s> context</s>`. If you encode the question and context as an input pair, the tokenizer already takes care of this, so you don't need to worry about it.
+2. `input_ids` is always a batch of examples.
+
+## Results
+| Metric      | Value   |
+|-------------|---------|
+| Exact Match | 85.1466 |
+| F1          | 91.5415 |
+
+## Model in Action  🚀
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForQuestionAnswering
+
+tokenizer = AutoTokenizer.from_pretrained("valhalla/longformer-base-4096-finetuned-squadv1")
+model = AutoModelForQuestionAnswering.from_pretrained("valhalla/longformer-base-4096-finetuned-squadv1")
+
+text = "Huggingface has democratized NLP. Huge thanks to Huggingface for this."
+question = "What has Huggingface done ?"
+encoding = tokenizer(question, text, return_tensors="pt")
+input_ids = encoding["input_ids"]
+
+# default is local attention everywhere
+# the forward method will automatically set global attention on question tokens
+attention_mask = encoding["attention_mask"]
+
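+# recent transformers versions return an output object rather than a plain tuple
+outputs = model(input_ids, attention_mask=attention_mask)
+start_scores, end_scores = outputs.start_logits, outputs.end_logits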
+all_tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
+
+answer_tokens = all_tokens[torch.argmax(start_scores) : torch.argmax(end_scores) + 1]
+answer = tokenizer.decode(tokenizer.convert_tokens_to_ids(answer_tokens))
+# output => democratized NLP
+```
+
+`LongformerForQuestionAnswering` isn't yet supported in `pipeline`. I'll update this card once support has been added.
+
+> Created with ❤️ by Suraj Patil [![Github icon](https://cdn0.iconfinder.com/data/icons/octicons/1024/mark-github-32.png)](https://github.com/patil-suraj/)
+[![Twitter icon](https://cdn0.iconfinder.com/data/icons/shift-logotypes/32/Twitter-32.png)](https://twitter.com/psuraj28)
--- a/config.json
+++ b/config.json
@ -20,6 +20,7 @@
  ],
  "bos_token_id": 0,
  "eos_token_id": 2,
+  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
--- a/merges.txt
+++ b/merges.txt
--- a/pytorch_model.bin
+++ b/pytorch_model.bin
--- a/rust_model.ot
+++ b/rust_model.ot
--- a/special_tokens_map.json
+++ b/special_tokens_map.json
@ -0,0 +1 @@
+{"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>", "sep_token": "</s>", "pad_token": "<pad>", "cls_token": "<s>", "mask_token": "<mask>"}
--- a/tf_model.h5
+++ b/tf_model.h5
Author	SHA1	Message	Date
Guillaume B	159b620576	Addition of Rust model	2021-02-10 17:35:40 +01:00
Julien Chaumond	1ad74ed178	metadata: add license	2021-02-10 07:45:12 +00:00
patil-suraj	4c9ee04765	remove .ipynb	2021-02-10 07:14:20 +00:00
patil-suraj	af9e1eb637	add license	2021-02-10 07:13:43 +00:00
Julien Chaumond	f08f52d924	Migrate model card from transformers-repo Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755 Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/valhalla/longformer-base-4096-finetuned-squadv1/README.md	2020-12-11 23:03:37 +01:00
system	156076fd73	Update config.json	2020-08-07 11:24:23 +00:00
system	32232adffe	Update tf_model.h5	2020-08-07 11:24:23 +00:00
system	00fb022966	Update special_tokens_map.json	2020-05-27 05:05:18 +00:00
system	db3a8725c0	Update merges.txt	2020-05-27 05:05:17 +00:00
system	f869dbd247	Update pytorch_model.bin	2020-05-27 05:05:06 +00:00
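
The input layout that the README describes for question answering (`<s> question</s></s> context</s>`, with global attention on the question tokens) can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: token strings stand in for real token ids, `build_qa_input` and `global_attention_mask` are hypothetical helper names, and "question tokens" is assumed to mean everything before the first `</s>`.

```python
# Illustrative sketch only: token strings stand in for real token ids, and
# build_qa_input / global_attention_mask are hypothetical helper names.
BOS, SEP = "<s>", "</s>"  # values taken from special_tokens_map.json


def build_qa_input(question_tokens, context_tokens):
    """Lay out a question/context pair as <s> question </s></s> context </s>."""
    return [BOS, *question_tokens, SEP, SEP, *context_tokens, SEP]


def global_attention_mask(tokens):
    """Return 1 for positions before the first separator, 0 elsewhere.

    Assumption for illustration: "question tokens" means everything that
    precedes the first </s>, including the leading <s>.
    """
    if tokens.count(SEP) != 3:
        raise ValueError("expected exactly three separator tokens")
    first_sep = tokens.index(SEP)
    return [1 if i < first_sep else 0 for i in range(len(tokens))]


tokens = build_qa_input(["What", "is", "it", "?"], ["It", "is", "a", "model", "."])
mask = global_attention_mask(tokens)
print(tokens)
print(mask)
```

The separator count check mirrors the README's requirement of three sep tokens; an input that was not encoded as a question/context pair is rejected instead of silently receiving a wrong mask.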