This is longformer-base-4096 model fine-tuned on SQuAD v1 dataset for question answering task.
[Longformer](https://arxiv.org/abs/2004.05150) model created by Iz Beltagy, Matthew E. Peters, Arman Coha from AllenAI. As the paper explains it
> `Longformer` is a BERT-like model for long documents.
The pre-trained model can handle sequences with upto 4096 tokens.
## Model Training
This model was trained on google colab v100 GPU. You can find the fine-tuning colab here [](https://colab.research.google.com/drive/1zEl5D-DdkBKva-DdreVOmN0hrAfzKG1o?usp=sharing).
Few things to keep in mind while training longformer for QA task,
by default longformer uses sliding-window local attention on all tokens. But For QA, all question tokens should have global attention. For more details on this please refer the paper. The `LongformerForQuestionAnswering` model automatically does that for you. To allow it to do that
1. The input sequence must have three sep tokens, i.e the sequence should be encoded like this
` <s> question</s></s> context</s>`. If you encode the question and answer as a input pair, then the tokenizer already takes care of that, you shouldn't worry about it.
2.`input_ids` should always be a batch of examples.
## Results
|Metric | # Value |
|-------------|---------|
| Exact Match | 85.1466 |
| F1 | 91.5415 |
## Model in Action 🚀
```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering,
The `LongformerForQuestionAnswering` isn't yet supported in `pipeline` . I'll update this card once the support has been added.
> Created with ❤️ by Suraj Patil [](https://github.com/patil-suraj/)