in reply to (OT) MICR check scanning

I don't know anything about MICR, but gscan2pdf was released recently and it works, producing PDF scans. It's Perl/Gtk2, so you could probably manipulate the PDFs after scanning with some PDF module.

I'm not really a human, but I play one on earth. Cogito ergo sum a bum

Re^2: (OT) MICR check scanning
by leocharre (Priest) on Nov 03, 2006 at 16:58 UTC

    I've got something a million times better and faster. A lot of these scanners will turn a scan into a PDF first. I use that to scan as many pages as I want into one document to send somewhere. That way I don't have to name and save each scan.

    Then I use pdftohtml to extract the images, and ImageMagick to do any transforms. The ImageMagick command-line tools, like convert, are awesome.

    Check out what I do (this is a shell script):

    #!/bin/sh
    rm -rf ./tmp; # clean previous run

    # this assumes that the checks are scanned upright, bottom towards left margin on pdf, one per page
    # this assumes you have scans in ./scans
    # in pdf format each may have multiple checks, one per page.

    mkdir ./tmp
    mkdir ./tmp/scans; # incoming pdf with multiple checks
    mkdir ./tmp/pngs; # documents for storage prepped

    # copy incoming to staging area
    cp ./scans/*pdf ./tmp/scans/;

    # rip images and resize, crop etc
    find ./tmp/scans/ -iname "*pdf" -exec pdftohtml -q -c -zoom 10 '{}' \;
    mv ./tmp/scans/*png ./tmp/pngs/;
    rm -rf ./tmp/scans

    # format the pngs for storing
    mogrify -rotate -90 -chop 0x450 ./tmp/pngs/*png;

    # Basically all this does is take the pdfs
    # and put images in tmp/pngs that are sized properly and right side up.

    # STEP 2
    # get images from pngs and turn into micr strip files
    # ready for gocr

    # copy to micr
    rm -rf ./tmp/micr;
    mkdir ./tmp/micr;
    cp ./tmp/pngs/*png ./tmp/micr/;
    mogrify -chop 0x660 ./tmp/micr/*png;

    # convert the isolated micr for gocr
    find ./tmp/micr/*png -exec convert '{}' '{}'.pbm \;
    rename png.pbm pbm ./tmp/micr/*pbm
    rm ./tmp/micr/*png

    Then I use Perl and gocr to rename the files, sort, archive, whatever.
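
For anyone wondering what that last step might look like, here is a rough sketch in plain sh rather than Perl. It is not the original code. It assumes gocr is installed and that the strips from the script above are in ./tmp/micr. The function name micr_digits and the ./tmp/named directory are made up for the example. gocr is a general-purpose OCR tool, so it may well misread the MICR font; treat the extracted digits as a guess to be checked.

```shell
#!/bin/sh
# Sketch only: copy each MICR strip to a file named after the digits
# gocr reads from it. micr_digits and ./tmp/named are hypothetical names.

# Keep only the digits from the OCR text; MICR symbols, spaces and
# anything misread as a letter are dropped.
micr_digits() {
    printf '%s' "$1" | tr -cd '0-9'
}

for f in ./tmp/micr/*.pbm; do
    [ -e "$f" ] || continue              # no strips found, nothing to do
    text=$(gocr -i "$f" 2>/dev/null)     # raw OCR output for this strip
    digits=$(micr_digits "$text")
    if [ -n "$digits" ]; then
        mkdir -p ./tmp/named
        cp "$f" "./tmp/named/$digits.pbm"
    else
        echo "could not read $f" >&2     # leave unreadable strips in place
    fi
done
```

Copying instead of renaming keeps the originals around, so a bad read does not lose a scan.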