update readme

Maciej Budyś
2022-01-20 01:02:42 +01:00
parent 514fb7707c
commit d61db20340
13 changed files with 78 additions and 2 deletions

@@ -1,4 +1,18 @@
OCR for Japanese manga
# Manga OCR
Optical character recognition for Japanese text, with the main focus being Japanese manga.
It uses a custom end-to-end model built with Transformers' [Vision Encoder Decoder](https://huggingface.co/docs/transformers/model_doc/visionencoderdecoder) framework.
Manga OCR can be used as a general-purpose OCR for printed Japanese, but its main goal is high-quality
text recognition that is robust against various scenarios specific to manga:
- both vertical and horizontal text
- text with furigana
- text overlaid on images
- wide variety of fonts and font styles
- low-quality images
Unlike many OCR models, Manga OCR supports recognizing multi-line text in a single forward pass,
so that text bubbles found in manga can be processed at once, without splitting them into lines.
# Installation
@@ -9,12 +23,38 @@ otherwise this step can be skipped.
Run:
```commandline
pip install manga-ocr
```
# Usage
## Running in the background
Manga OCR can run in the background, processing new images as they appear.
You might then use a tool like [ShareX](https://getsharex.com/) to manually capture a region of the screen and let the
OCR read it from either the system clipboard or a specified directory.
For example:
- To read images from the clipboard and write the recognized text back to the clipboard, run:
```commandline
manga_ocr
```
- To read images from ShareX's screenshot folder, run:
```commandline
manga_ocr "/path/to/sharex/screenshot/folder"
```
- To see other options, run:
```commandline
manga_ocr --help
```
If `manga_ocr` doesn't work, you might also try replacing it with `python -m manga_ocr`.
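Conceptually, the directory mode boils down to watching a folder for new images and running OCR on each one. The sketch below illustrates that idea only; it is not the actual `manga_ocr` implementation. The names `find_new_images` and `watch_folder`, the extension set, and the polling interval are all assumptions made for illustration, and `ocr` stands for any callable that takes an image path and returns text.

```python
import time
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your screenshot tool writes.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def find_new_images(folder, seen):
    """Return image paths in `folder` that are not in `seen`, oldest first.

    `seen` is a set of Path objects and is updated in place.
    """
    candidates = [
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES and p not in seen
    ]
    candidates.sort(key=lambda p: (p.stat().st_mtime, p.name))
    seen.update(candidates)
    return candidates


def watch_folder(folder, ocr, interval=0.5, max_polls=None):
    """Poll `folder` and print the recognized text for each new image.

    `ocr` is any callable that takes a path string and returns text.
    `max_polls=None` means "poll until interrupted".
    """
    seen = set(Path(folder).iterdir())  # ignore files that already exist
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in find_new_images(folder, seen):
            print(ocr(str(path)))
        time.sleep(interval)
        polls += 1
```

With the Python API, `ocr` could be something like `lambda p: mocr(PIL.Image.open(p))`, where `mocr` is a `MangaOcr` instance.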
## Python API
```python
from manga_ocr import MangaOcr
@@ -33,3 +73,39 @@ mocr = MangaOcr()
img = PIL.Image.open('/path/to/img')
text = mocr(img)
```
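When several images need to be processed, it is usually worth creating the model once and reusing it for every call. The helper below is a small sketch of that pattern, not part of the `manga_ocr` package: `ocr_many` is a hypothetical name, and `ocr` is any callable that takes a path string and returns text.

```python
from pathlib import Path


def ocr_many(paths, ocr):
    """Run `ocr` on every path and return {path string: recognized text}.

    The same `ocr` callable is reused for all images, so an expensive
    model only has to be loaded once by the caller.
    """
    results = {}
    for path in paths:
        key = str(Path(path))
        results[key] = ocr(key)
    return results


# Possible usage (assumes manga-ocr and Pillow are installed):
#
#   import PIL.Image
#   from manga_ocr import MangaOcr
#
#   mocr = MangaOcr()
#   texts = ocr_many(
#       sorted(Path("examples").glob("*.jpg")),
#       lambda p: mocr(PIL.Image.open(p)),
#   )
```

Passing the recognizer in as an argument also makes the helper easy to try out with a stand-in function before the real model is downloaded.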
## Usage tips
- OCR supports multi-line text, but the longer the text, the more likely errors become.
If recognition fails for some part of a longer text, try running it on a smaller portion of the image.
- The model was trained specifically to handle manga well, but should do a decent job on other types of printed text,
such as novels or video games. It probably won't be able to handle handwritten text though.
- The model always attempts to recognize some text in the image, even if there is none.
Because it uses a transformer decoder (and therefore has some understanding of the Japanese language),
it might even "dream up" some realistic-looking sentences! This shouldn't be a problem for most use cases,
but it might be improved in the next version.
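The first tip, running the OCR on a smaller portion of the image, can be sketched as follows. This is only an illustration: `ocr_regions` is a hypothetical helper, the crop boxes must be chosen by you (by hand or with a separate text detector), and `img` is assumed to be a PIL image, whose `crop` method takes a `(left, upper, right, lower)` box in pixels.

```python
def ocr_regions(img, boxes, ocr):
    """Run `ocr` on each cropped region of `img` and return the texts in order.

    `img` is expected to be a PIL image (anything with a `crop(box)` method works).
    Each box is a (left, upper, right, lower) tuple in pixels.
    `ocr` is a callable that takes an image and returns text, e.g. a MangaOcr instance.
    """
    return [ocr(img.crop(box)) for box in boxes]
```

For example, `ocr_regions(PIL.Image.open('/path/to/img'), [(0, 0, 200, 300), (200, 0, 400, 300)], mocr)` would recognize two side-by-side regions separately; the coordinates here are placeholders.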
# Examples
Here are some cherry-picked examples showing the capability of the model.
| image | Manga OCR result |
|----------------------|------------------|
| ![](examples/00.jpg) | 素直にあやまるしか |
| ![](examples/01.jpg) | 立川で見た〝穴〟の下の巨大な眼は: |
| ![](examples/02.jpg) | 実戦剣術も一流です |
| ![](examples/03.jpg) | 第30話重苦しい闇の奥で静かに呼吸づきながら |
| ![](examples/04.jpg) | よかったじゃないわよ!何逃げてるのよ!!早くあいつを退治してよ! |
| ![](examples/05.jpg) | ぎゃっ |
| ![](examples/06.jpg) | ピンポーーン |
| ![](examples/07.jpg) | LINK!私達7人の力でガノンの塔の結界をやぶります |
| ![](examples/08.jpg) | ファイアパンチ |
| ![](examples/09.jpg) | 少し黙っている |
| ![](examples/10.jpg) | わかるかな〜? |
| ![](examples/11.jpg) | 警察にも先生にも町中の人達に!! |
# Acknowledgments
This project was made using the [Manga109-s](http://www.manga109.org/en/download_s.html) dataset.

Binary files added (not shown):
examples/00.jpg (9.2 KiB)
examples/01.jpg (34 KiB)
examples/02.jpg (2.8 KiB)
examples/03.jpg (18 KiB)
examples/04.jpg (10 KiB)
examples/05.jpg (3.8 KiB)
examples/06.jpg (9.2 KiB)
examples/07.jpg (15 KiB)
examples/08.jpg (6.9 KiB)
examples/09.jpg (6.2 KiB)
examples/10.jpg (3.4 KiB)
examples/11.jpg (15 KiB)