in reply to Image Character Recognition

Funny, I was actually thinking about doing something like this the other day. Here is what I think is a workable strategy for 'intelligent' character recognition. I am quite confident that this technique can recognise printed characters in the noisy-ish images generated by sites like PayPal.

Core Component
A multi-layer neural network trained by back-propagation. I would probably use the AI::NeuralNet::BackProp module for this. The network is pre-trained on the fonts to be recognised.
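
To make the idea concrete, here is a minimal sketch of such a network in plain Python. It deliberately does not use AI::NeuralNet::BackProp, so nothing below should be read as that module's API. The class name TinyNet, the 3x3 glyphs, the layer sizes, and the learning rate are all made up for illustration; a real font would need much larger bitmaps and far more training samples.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNet:
    """One hidden layer of sigmoid units, trained by plain back-propagation."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)  # fixed seed so runs are repeatable
        # One row of weights per unit; the last entry of each row is the bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, inputs):
        x = list(inputs) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)))
                  for row in self.w_hidden]
        h = hidden + [1.0]
        output = [sigmoid(sum(w * v for w, v in zip(row, h)))
                  for row in self.w_out]
        return hidden, output

    def train_one(self, inputs, target, rate=0.5):
        hidden, output = self.forward(inputs)
        x = list(inputs) + [1.0]
        h = hidden + [1.0]
        # Output deltas: error times the sigmoid derivative.
        d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, output)]
        # Hidden deltas: push the output deltas back through w_out
        # (computed before w_out is updated).
        d_hid = [hj * (1.0 - hj)
                 * sum(d_out[k] * self.w_out[k][j] for k in range(len(d_out)))
                 for j, hj in enumerate(hidden)]
        for k, row in enumerate(self.w_out):
            for j in range(len(row)):
                row[j] += rate * d_out[k] * h[j]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(row)):
                row[i] += rate * d_hid[j] * x[i]

    def predict(self, inputs):
        _, output = self.forward(inputs)
        return output.index(max(output))


# Hypothetical 3x3 "font", flattened row by row (1 = ink).
GLYPHS = {
    "I": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "L": [1, 0, 0, 1, 0, 0, 1, 1, 1],
    "T": [1, 1, 1, 0, 1, 0, 0, 1, 0],
}
LABELS = sorted(GLYPHS)


def train_demo(epochs=2000):
    """Pre-train a network on the toy font, one output unit per character."""
    net = TinyNet(9, 5, len(LABELS))
    for _ in range(epochs):
        for idx, label in enumerate(LABELS):
            target = [1.0 if k == idx else 0.0 for k in range(len(LABELS))]
            net.train_one(GLYPHS[label], target)
    return net
```

After `net = train_demo()`, a call like `LABELS[net.predict(GLYPHS["T"])]` maps a bitmap back to a character. One output unit per character is only one possible encoding; it has the side benefit that the winning unit's activation can serve as a rough confidence value.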

Image Processing
You will definitely need to clean up the image before feeding it into the character recognition engine. The pre-processing would involve:
  • color image -> black/white conversion (to simplify the recognition)
  • noise reduction, including removal of lines that run across the image
  • a pixel density count and statistics collection, to determine text/character boundaries
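
The pre-processing steps above could look roughly like the following Python sketch. It assumes the image has already been loaded as a grayscale grid (a list of rows of 0..255 values); the function names, the threshold of 128, and the "isolated pixel" noise rule are my own illustrative choices. Removing lines that run across the image is not handled here and would need something smarter.

```python
def to_black_white(gray, threshold=128):
    """Map a grayscale grid (rows of 0..255) to 1 = ink, 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def remove_specks(bw):
    """Clear ink pixels that have no ink among their 8 neighbours."""
    height, width = len(bw), len(bw[0])
    out = [row[:] for row in bw]
    for y in range(height):
        for x in range(width):
            if not bw[y][x]:
                continue
            # Count ink in the 3x3 window, minus the pixel itself.
            neighbours = sum(
                bw[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ) - 1
            if neighbours == 0:
                out[y][x] = 0
    return out


def column_density(bw):
    """Number of ink pixels in each column."""
    return [sum(row[x] for row in bw) for x in range(len(bw[0]))]


def split_characters(bw, min_ink=1):
    """Return (start, end) column spans, end exclusive, whose density
    is at least min_ink; each span is taken to be one character."""
    spans, start = [], None
    for x, density in enumerate(column_density(bw)):
        if density >= min_ink and start is None:
            start = x
        elif density < min_ink and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, len(bw[0])))
    return spans


def crop(bw, span):
    """Cut one character bitmap out of the image by column span."""
    start, end = span
    return [row[start:end] for row in bw]
```

Typical use would be `clean = remove_specks(to_black_white(gray))` followed by `[crop(clean, s) for s in split_characters(clean)]` to get one bitmap per character. Splitting on empty columns only works when characters do not touch or overlap, so treat it as a starting point rather than a complete segmenter.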

Character Recognition Process
  • Input is an array of character bitmaps captured by the pre-processing steps
  • Feed each bitmap into the neural network, and take the best estimate of the character it contains
  • Output the characters recognised
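
The recognition loop itself is simple, as this Python sketch tries to show. The names `recognise`, `flatten`, and `template_scores` are hypothetical, and `template_scores` is only a placeholder that counts matching pixels against made-up 3x3 templates. It stands in for the trained network's output layer so the example runs on its own; it is not a neural network.

```python
def flatten(bitmap):
    """Turn a 2-D bitmap (rows of 0/1) into a flat input vector."""
    return [float(px) for row in bitmap for px in row]


def recognise(bitmaps, score_fn, labels):
    """Score each bitmap, keep the best-scoring label, and return the
    recognised string together with the winning score per character."""
    text, confidences = [], []
    for bitmap in bitmaps:
        scores = score_fn(flatten(bitmap))
        best = max(range(len(labels)), key=lambda k: scores[k])
        text.append(labels[best])
        confidences.append(scores[best])
    return "".join(text), confidences


# Placeholder scorer: made-up 3x3 templates, flattened row by row.
TEMPLATES = {
    "I": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "L": [1, 0, 0, 1, 0, 0, 1, 1, 1],
    "T": [1, 1, 1, 0, 1, 0, 0, 1, 0],
}
LABELS = sorted(TEMPLATES)


def template_scores(vector):
    """Fraction of pixels agreeing with each template, in LABELS order.
    In the real thing this would be the network's output activations."""
    return [
        sum(1 for a, b in zip(vector, TEMPLATES[label]) if a == b) / len(vector)
        for label in LABELS
    ]
```

Calling `recognise(bitmaps, template_scores, LABELS)` returns the text plus one score per character. Keeping the scores around is useful because a low winning score is a cheap hint that a character was segmented badly or is not in the trained font.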