I'd probably set up some type of conversion process (pdftotext) and then just grep those converted files.
The way-too-simplistic approach:
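#!/usr/bin/perl
use strict;
use warnings;

die "usage: $0 <pdffile> <searchterm>\n" unless @ARGV == 2;
my ( $pdf, $term ) = @ARGV;

# List-form pipe open: the file name never passes through the shell.
open( my $cmd, '-|', '/usr/bin/pdftotext', $pdf, '-' )
    || die "cannot run pdftotext on $pdf: $!";

# Print each line of extracted text that matches the search term (used as a regex).
while ( <$cmd> ) {
    print if /$term/;
}
close( $cmd ) || warn "pdftotext did not exit cleanly for $pdf\n";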
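The script above converts on every search. The "convert once, then grep the converted files" idea from the top of this post might look roughly like the sketch below. The helper names (txt_path_for, convert_all) and the two-directory layout are my own assumptions for illustration, and it assumes pdftotext is on your PATH.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(fileparse);
use File::Glob qw(bsd_glob);
use File::Spec;

# Hypothetical helper: map /some/dir/report.pdf to <outdir>/report.txt.
sub txt_path_for {
    my ( $pdf, $outdir ) = @_;
    my ( $name ) = fileparse( $pdf, qr/\.pdf\z/i );
    return File::Spec->catfile( $outdir, "$name.txt" );
}

# Hypothetical helper: convert every PDF in $indir whose text copy
# is missing or older than the PDF itself.
sub convert_all {
    my ( $indir, $outdir ) = @_;
    for my $pdf ( bsd_glob( File::Spec->catfile( $indir, '*.pdf' ) ) ) {
        my $txt = txt_path_for( $pdf, $outdir );

        # -M is the file age in days; a smaller value means a newer file.
        next if -e $txt && -M $txt <= -M $pdf;

        # List-form system() keeps file names away from the shell.
        system( 'pdftotext', $pdf, $txt ) == 0
            or warn "pdftotext failed on $pdf\n";
    }
}

# Run only when invoked as a script, not when loaded by another file.
unless ( caller ) {
    die "usage: $0 <pdfdir> <textdir>\n" unless @ARGV == 2;
    convert_all( @ARGV );
}
```

Run it from cron or by hand, and after that a plain `grep -il searchterm textdir/*.txt` lists the matching documents without touching the PDFs again.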
Update: If I were going to do this for real and it needed to be available on the web, I would probably use Solr.
In reply to Re: PDF Indexing / Search by derby, in thread PDF Indexing / Search by Trihedralguy