I've got something far better and faster. A lot of these scanners can save a scan as a PDF first. I use that to scan as many checks as I want into one document that I can send somewhere. That way I don't have to name and save each scan.

Then I use pdftohtml to extract the images, and ImageMagick to do any transforms. ImageMagick's command-line tools, like convert and mogrify, are awesome.

Check out what I do (this is a plain sh script):

#!/bin/sh
rm -rf ./tmp; # clean previous run

# this assumes that the checks are scanned upright, bottom towards left
# margin on pdf, one per page
# this assumes you have scans in ./scans
# in pdf format each may have multiple checks, one per page.

mkdir ./tmp
mkdir ./tmp/scans; # incoming pdf with multiple checks
mkdir ./tmp/pngs; # documents for storage prepped

# copy incoming to staging area
cp ./scans/*pdf ./tmp/scans/;

# rip images and resize , crop etc
find ./tmp/scans/ -iname "*pdf" -exec pdftohtml -q -c -zoom 10 '{}' \;
mv ./tmp/scans/*png ./tmp/pngs/;
rm -rf ./tmp/scans
# format the pngs for storing
mogrify -rotate -90 -chop 0x450 ./tmp/pngs/*png;

# Basically all this does is take the pdfs
# and put images in tmp/pngs that are sized properly and right side up.

# STEP 2
# get images from pngs and turn into micr strip files
# ready for gocr

# copy to micr
rm -rf  ./tmp/micr;
mkdir ./tmp/micr;
cp ./tmp/pngs/*png ./tmp/micr/;

mogrify -chop 0x660 ./tmp/micr/*png;

# convert the isolated micr strips to pbm for gocr, then drop the pngs
# (a plain loop avoids depending on which `rename` variant is installed)
for png in ./tmp/micr/*png; do
    convert "$png" "${png%.png}.pbm" && rm "$png"
done

Then I use Perl and gocr to rename, sort, and archive the files, or whatever else is needed.
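
Here is a rough sketch of what that last step could look like, written in sh to match the script above rather than in Perl. It assumes gocr is on the PATH and that ./tmp/micr holds the .pbm strips. The micr_digits and name_strips helpers, the ./tmp/named directory, and the digits-only file names are made up for illustration, and gocr will not necessarily read the MICR font cleanly, so treat the result as a starting point.

```shell
#!/bin/sh
# Sketch only: OCR each MICR strip with gocr and copy it to a file
# named after the digits that were recognized.

# micr_digits (hypothetical helper): keep only the digits from stdin
micr_digits() {
    tr -cd '0-9'
}

# name_strips (hypothetical helper): $1 = dir of .pbm strips, $2 = output dir
name_strips() {
    mkdir -p "$2"
    for pbm in "$1"/*.pbm; do
        [ -e "$pbm" ] || continue    # glob matched nothing
        digits=$(gocr -i "$pbm" | micr_digits)
        # fall back to the original name when nothing was recognized
        [ -n "$digits" ] || digits="unread_$(basename "$pbm" .pbm)"
        cp "$pbm" "$2/$digits.pbm"
    done
}

# run only when gocr is installed and the strips exist
if command -v gocr >/dev/null 2>&1 && [ -d ./tmp/micr ]; then
    name_strips ./tmp/micr ./tmp/named
fi
```

Copying instead of moving keeps the original strips around, so a bad OCR pass can be rerun without redoing the whole pipeline.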

In reply to Re^2: (OT) MICR check scanning by leocharre
in thread (OT) MICR check scanning by leocharre
