Reading the documentation of PDF::OCR2, I get the impression that it converts the PDF pages into separate image files using PDF::GetImages and then uses Image::OCR::Tesseract to get the text from the image.
I would change that by adding a cropping step in between, which selects only the "interesting" part of each image before it is handed to the OCR.
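
A rough, untested sketch of what I mean, calling the underlying modules directly instead of going through PDF::OCR2. The names crop_box and ocr_cropped_pdf, the fractional margins and the ".cropped.ppm" suffix are all made up for illustration, and using Imager for the cropping is my own assumption (any image library would do). I am also assuming that pdfimages() from PDF::GetImages returns an array reference of image paths and that get_ocr() from Image::OCR::Tesseract takes an image path, as I remember from their documentation.

```perl
use strict;
use warnings;

# Hypothetical helper: turn fractional margins into a pixel crop box.
# The fractions are illustrative; tune them for the actual page layout.
sub crop_box {
    my ($width, $height, %margin) = @_;
    my $left   = int($width  * ($margin{left}   // 0));
    my $top    = int($height * ($margin{top}    // 0));
    my $right  = int($width  * ($margin{right}  // 0));
    my $bottom = int($height * ($margin{bottom} // 0));
    return (
        left   => $left,
        top    => $top,
        width  => $width  - $left - $right,
        height => $height - $top  - $bottom,
    );
}

# Hypothetical pipeline: extract images, crop each one, OCR the cropped copy.
sub ocr_cropped_pdf {
    my ($pdf, %margin) = @_;

    # Loaded at run time so crop_box() can be tried without these modules.
    require PDF::GetImages;
    require Imager;
    require Image::OCR::Tesseract;

    my @texts;
    for my $file (@{ PDF::GetImages::pdfimages($pdf) }) {
        my $img = Imager->new(file => $file)
            or die "Cannot read $file: " . Imager->errstr;
        my $cropped = $img->crop(crop_box($img->getwidth, $img->getheight, %margin))
            or die "Cannot crop $file: " . $img->errstr;

        # Write the cropped copy next to the original (assumed naming scheme).
        my $out = "$file.cropped.ppm";
        $cropped->write(file => $out)
            or die "Cannot write $out: " . $cropped->errstr;

        push @texts, Image::OCR::Tesseract::get_ocr($out);
    }
    return @texts;
}
```

Something like `ocr_cropped_pdf('/abs/path/to/file.pdf', top => 0.1, bottom => 0.1)` would then cut off the top and bottom tenth of every page image before running the OCR. Which margins count as "interesting" depends entirely on your documents.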
In reply to Re^3: PDF::OCR2 results not what I was hoping for
by Corion
in thread PDF::OCR2 results not what I was hoping for
by nysus