impira/layoutlm-invoices is a repository forked from Hugging Face. License: cc-by-nc-sa-4.0
Ankur Goyal 1a67f8ef3f Initial commit 2022-09-06 10:50:42 -07:00
.gitattributes initial commit 2022-09-06 17:49:13 +00:00
README.md Initial commit 2022-09-06 10:50:42 -07:00
config.json Initial commit 2022-09-06 10:50:42 -07:00
demo.png Initial commit 2022-09-06 10:50:42 -07:00
merges.txt Initial commit 2022-09-06 10:50:42 -07:00
pyproject.toml Initial commit 2022-09-06 10:50:42 -07:00
pytorch_model.bin Initial commit 2022-09-06 10:50:42 -07:00
setup.cfg Initial commit 2022-09-06 10:50:42 -07:00
special_tokens_map.json Initial commit 2022-09-06 10:50:42 -07:00
tokenizer.json Initial commit 2022-09-06 10:50:42 -07:00
tokenizer_config.json Initial commit 2022-09-06 10:50:42 -07:00
vocab.json Initial commit 2022-09-06 10:50:42 -07:00

README.md

language: en
license: cc-by-nc-sa-4.0
tags: layoutlm, document-question-answering, pdf, invoices

LayoutLM for Invoices

This is a fine-tuned version of the multi-modal LayoutLM model for the task of question answering on invoices and other documents. It has been fine-tuned on a proprietary dataset of invoices as well as both SQuAD2.0 and DocVQA for general comprehension.

Non-consecutive tokens

Unlike other QA models, which can only extract consecutive tokens (because they predict the start and end of a sequence), this model can predict longer-range, non-consecutive sequences with an additional classifier head. For example, it can extract a two-line address, as shown below:

[Figure: Two-line Address]
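
The difference between the two decoding styles can be illustrated with a small, self-contained sketch. This is not the model's actual head or decoding code: the function names, the scores, and the 0.5 threshold are all hypothetical, chosen only to show why a start/end predictor is forced to return a contiguous run while a per-token classifier can skip tokens in between.

```python
from typing import List


def extract_span(tokens: List[str], start_scores: List[float],
                 end_scores: List[float]) -> List[str]:
    """Start/end style decoding: returns one contiguous run of tokens."""
    # Pick the most likely start, then the most likely end at or after it.
    start = max(range(len(tokens)), key=lambda i: start_scores[i])
    end = max(range(start, len(tokens)), key=lambda i: end_scores[i])
    return tokens[start:end + 1]


def extract_by_token_classifier(tokens: List[str], keep_probs: List[float],
                                threshold: float = 0.5) -> List[str]:
    """Per-token decoding (hypothetical): keep each token independently."""
    return [tok for tok, p in zip(tokens, keep_probs) if p >= threshold]


# Reading order interleaves a two-line address with an unrelated field.
tokens = ["123", "Main", "St", "Invoice", "#42", "Springfield", "IL"]

# Made-up scores for illustration only.
start_scores = [0.90, 0.02, 0.02, 0.02, 0.02, 0.01, 0.01]
end_scores = [0.01, 0.01, 0.10, 0.02, 0.02, 0.05, 0.80]
keep_probs = [0.90, 0.90, 0.80, 0.10, 0.05, 0.90, 0.85]

span_answer = extract_span(tokens, start_scores, end_scores)
token_answer = extract_by_token_classifier(tokens, keep_probs)

print(" ".join(span_answer))
print(" ".join(token_answer))
```

In this toy input, the span decoder has to include "Invoice #42" to reach the second address line, whereas the per-token decoder drops those two tokens.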

Getting started with the model

The best way to use this model is via DocQuery.
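
A minimal sketch of what that might look like in Python follows. It assumes the `docquery` package is installed and that its `pipeline` and `document.load_document` helpers behave as in the DocQuery README; passing `model="impira/layoutlm-invoices"` to select this checkpoint is an assumption, and `ask_all`, `build_docquery_asker`, and the file path are hypothetical names introduced for illustration.

```python
from typing import Any, Callable, Dict, Iterable

MODEL_ID = "impira/layoutlm-invoices"


def ask_all(ask: Callable[[str], Any], questions: Iterable[str]) -> Dict[str, Any]:
    """Run every question through `ask` and collect the answers by question."""
    return {q: ask(q) for q in questions}


def build_docquery_asker(path: str, model: str = MODEL_ID) -> Callable[[str], Any]:
    """Return a function that answers questions about the document at `path`.

    The import is deferred so this module loads even without docquery installed.
    """
    from docquery import document, pipeline  # assumed API, per the DocQuery README

    # Assumption: `model` selects the checkpoint, as in transformers pipelines.
    qa = pipeline("document-question-answering", model=model)
    doc = document.load_document(path)
    return lambda question: qa(question=question, **doc.context)


if __name__ == "__main__":
    # Hypothetical path; replace with a real invoice PDF or image.
    ask = build_docquery_asker("/path/to/invoice.pdf")
    questions = ["What is the invoice number?", "What is the invoice total?"]
    for question, answer in ask_all(ask, questions).items():
        print(question, answer)
```

Keeping `ask_all` separate from the pipeline construction means the question loop can be exercised with a stub before any model weights are downloaded.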

About us

This model was created by the team at Impira.