in reply to Re^2: Perl variant of linux tool strings
in thread Perl variant of linux tool strings
For collecting words from PDF documents, you can use the ps2ascii utility that comes with Ghostscript. It runs the document through Ghostscript using a special output device that emits only ASCII text. Since Ghostscript can handle PDFs too, ps2ascii works fine on them (although I did run into compatibility problems with some PDFs, depending on the generating program and the version of Ghostscript).
This doesn't work for Word documents, of course.
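A rough sketch of how that could look from Perl, assuming ps2ascii is on your PATH and is called with a single input file so that the text goes to stdout. The sub names count_words and pdf_text are made up for this example, and treating a "word" as a run of three or more ASCII letters is an arbitrary choice you may want to change:

```perl
use strict;
use warnings;

# Count words in a chunk of text. A "word" here is a run of 3 or more
# ASCII letters, lowercased (an assumption, adjust to taste).
sub count_words {
    my ($text) = @_;
    my %count;
    $count{lc $1}++ while $text =~ /([A-Za-z]{3,})/g;
    return \%count;
}

# Run ps2ascii on a file and return everything it prints.
# List-form pipe open, so the file name never passes through a shell.
sub pdf_text {
    my ($file) = @_;
    open my $fh, '-|', 'ps2ascii', $file
        or die "cannot run ps2ascii: $!";
    local $/;                      # slurp the whole output at once
    my $text = <$fh>;
    close $fh
        or die "ps2ascii failed on $file (exit status $?)";
    return defined $text ? $text : '';
}

# Only touch ps2ascii when a file name is given on the command line.
if (@ARGV) {
    my $count = count_words(pdf_text($ARGV[0]));
    print "$_\t$count->{$_}\n" for sort keys %$count;
}
```

Save it under any name you like and pass it the path to a PDF or PostScript file. Whether the output is usable still depends on the PDF and the Ghostscript version, as mentioned above.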
Re^4 perl variant of linux tool 'strings'
by Random_Walk (Prior) on Mar 23, 2005 at 21:25 UTC | |
by ambrus (Abbot) on Mar 23, 2005 at 21:30 UTC | |
by Joost (Canon) on Mar 23, 2005 at 21:51 UTC | |
by jeanluca (Deacon) on Mar 24, 2005 at 07:23 UTC |