kha-white/manga-ocr-base is a forked repo from huggingface. License: apache-2-0
kha-white aa6573bd10 Fix dead link (#1)
- Fix dead link (dae5dcb5cf328ca65281b360ed95099d76dcebdc)


Co-authored-by: Abdulrazzaq Alhendi <aalhendi@users.noreply.huggingface.co>
2022-06-22 15:34:05 +00:00
.gitattributes initial commit 2022-01-15 17:39:06 +00:00
README.md Fix dead link (#1) 2022-06-22 15:34:05 +00:00
config.json manga-ocr-base 2022-01-15 20:18:35 +01:00
preprocessor_config.json manga-ocr-base 2022-01-15 20:18:35 +01:00
pytorch_model.bin manga-ocr-base 2022-01-15 20:18:35 +01:00
special_tokens_map.json manga-ocr-base 2022-01-15 20:18:35 +01:00
tokenizer_config.json manga-ocr-base 2022-01-15 20:18:35 +01:00
vocab.txt manga-ocr-base 2022-01-15 20:18:35 +01:00

README.md

language: ja
tags: image-to-text
license: apache-2.0
datasets: manga109s

Manga OCR

Optical character recognition for Japanese text, with the main focus being Japanese manga.

It uses the Vision Encoder Decoder framework.
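
As a rough illustration of what this means in practice, the checkpoint can probably be loaded with the Hugging Face `transformers` library. The sketch below is not the author's code: the helper names (`load_ocr`, `clean_text`, `recognize`) and the `max_length=300` limit are assumptions made for illustration, and it assumes that `transformers`, `torch`, and `Pillow` are installed, along with any extra packages the Japanese tokenizer may need (likely `fugashi` and `unidic-lite`). On older `transformers` versions, `AutoFeatureExtractor` may be needed in place of `AutoImageProcessor`.

```python
# Hypothetical sketch: function names and defaults are illustrative only.
from typing import Any, Tuple

MODEL_ID = "kha-white/manga-ocr-base"


def load_ocr(model_id: str = MODEL_ID) -> Tuple[Any, Any, Any]:
    """Load the image processor, tokenizer, and encoder-decoder model."""
    # Imported lazily so this module can be imported without transformers.
    from transformers import (
        AutoImageProcessor,
        AutoTokenizer,
        VisionEncoderDecoderModel,
    )

    processor = AutoImageProcessor.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()  # inference only, so disable dropout
    return processor, tokenizer, model


def clean_text(decoded: str) -> str:
    """Drop whitespace, which the decoder may leave between tokens."""
    return "".join(decoded.split())


def recognize(image_path: str, processor: Any, tokenizer: Any, model: Any,
              max_length: int = 300) -> str:
    """Run OCR on one cropped text region and return the recognized string."""
    import torch
    from PIL import Image

    image = Image.open(image_path).convert("RGB")
    # The vision encoder consumes pixel values; the text decoder generates IDs.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    with torch.no_grad():
        token_ids = model.generate(pixel_values, max_length=max_length)[0]
    return clean_text(tokenizer.decode(token_ids, skip_special_tokens=True))


if __name__ == "__main__":
    import sys

    processor, tokenizer, model = load_ocr()
    print(recognize(sys.argv[1], processor, tokenizer, model))
```

The script expects the path to an image containing a single text region (for example, one speech bubble) as its first argument. Loading the model once and reusing it across calls avoids repeating the slow checkpoint load.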

Manga OCR can be used as a general-purpose OCR for printed Japanese, but its main goal is to provide high-quality text recognition that is robust against scenarios specific to manga:

  • both vertical and horizontal text
  • text with furigana
  • text overlaid on images
  • a wide variety of fonts and font styles
  • low-quality images

Code is available here.