kha-white/manga-ocr-base is a repository forked from Hugging Face. License: Apache-2.0


kha-white aa6573bd10 Fix dead link (#1 ) - Fix dead link (dae5dcb5cf328ca65281b360ed95099d76dcebdc) Co-authored-by: Abdulrazzaq Alhendi <aalhendi@users.noreply.huggingface.co>		2022-06-22 15:34:05 +00:00
.gitattributes	initial commit	2022-01-15 17:39:06 +00:00
README.md	Fix dead link (#1 )	2022-06-22 15:34:05 +00:00
config.json	manga-ocr-base	2022-01-15 20:18:35 +01:00
preprocessor_config.json	manga-ocr-base	2022-01-15 20:18:35 +01:00
pytorch_model.bin	manga-ocr-base	2022-01-15 20:18:35 +01:00
special_tokens_map.json	manga-ocr-base	2022-01-15 20:18:35 +01:00
tokenizer_config.json	manga-ocr-base	2022-01-15 20:18:35 +01:00
vocab.txt	manga-ocr-base	2022-01-15 20:18:35 +01:00

README.md


Manga OCR

Optical character recognition for Japanese text, with the main focus being Japanese manga.

It uses the Vision Encoder Decoder framework.
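As a rough illustration of what loading a Vision Encoder Decoder checkpoint can look like, here is a minimal sketch using the Hugging Face `transformers` library. It is not taken from the project's own code. The helper names `join_tokens`, `load_components`, and `recognize` are hypothetical, as is the `max_length=300` default. The sketch assumes that `AutoImageProcessor` and `AutoTokenizer` can resolve the config files listed in this repository, and the tokenizer may need extra Japanese tokenization packages installed.

```python
# Minimal sketch, not the project's official code. Helper names are hypothetical.
MODEL_ID = "kha-white/manga-ocr-base"


def join_tokens(decoded: str) -> str:
    """Remove all whitespace, including the spaces a subword tokenizer
    leaves between decoded tokens (Japanese text normally has none)."""
    return "".join(decoded.split())


def load_components(model_id: str = MODEL_ID):
    """Load the image processor, tokenizer, and encoder-decoder model."""
    # Imported lazily so join_tokens stays usable without transformers installed.
    from transformers import (
        AutoImageProcessor,
        AutoTokenizer,
        VisionEncoderDecoderModel,
    )

    # Assumption: the Auto* classes pick the right classes from the repo's
    # preprocessor_config.json and tokenizer_config.json.
    processor = AutoImageProcessor.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()
    return processor, tokenizer, model


def recognize(image_path, processor, tokenizer, model, max_length=300):
    """Run OCR on one cropped text region and return the decoded string."""
    import torch
    from PIL import Image

    image = Image.open(image_path).convert("RGB")
    # The vision encoder consumes pixel values; the text decoder then
    # generates token ids autoregressively.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    with torch.no_grad():
        token_ids = model.generate(pixel_values, max_length=max_length)
    decoded = tokenizer.decode(token_ids[0], skip_special_tokens=True)
    return join_tokens(decoded)


# Usage (downloads the weights on first call; "bubble.png" is a placeholder path):
#   processor, tokenizer, model = load_components()
#   print(recognize("bubble.png", processor, tokenizer, model))
```

The split into a loader and a per-image function keeps the expensive model load out of the recognition loop. Stripping all whitespace in `join_tokens` is a simplification that suits Japanese text but would also remove any legitimate spaces.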

Manga OCR can be used as a general-purpose OCR for printed Japanese text, but its main goal is to provide high-quality text recognition that is robust against various scenarios specific to manga:

both vertical and horizontal text
text with furigana
text overlaid on images
wide variety of fonts and font styles
low-quality images

Code is available here.