Re: PDF Indexing / Search

I'd probably set up a conversion step (pdftotext) and then just grep the converted text files.

The way-too-simplistic approach:

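#!/usr/bin/perl

use strict;
use warnings;

die "usage: $0 <pdffile> <searchterm>\n"
  unless @ARGV == 2;

my ( $pdf, $term ) = @ARGV;

# List-form pipe open: the file name never passes through a shell.
open( my $cmd, '-|', '/usr/bin/pdftotext', $pdf, '-' )
  or die "cannot run pdftotext on $pdf: $!\n";

# The search term is treated as a regular expression, like grep.
while ( <$cmd> ) {
  print if /$term/;
}

close( $cmd )
  or warn "pdftotext exited with status $?\n";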
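The script above re-runs pdftotext on every search. A rough sketch of the "convert once, then grep the converted files" idea could look like the following. It assumes pdftotext is on the PATH and that a separate cache directory for the text files is acceptable; the names convert_pdf, search_text and the directory arguments are made up for illustration.

```perl
#!/usr/bin/perl

use strict;
use warnings;
use File::Basename qw(basename);
use File::Glob qw(bsd_glob);
use File::Spec;

# Assumption: pdftotext (poppler-utils or xpdf) is on the PATH.
my $PDFTOTEXT = 'pdftotext';

# Convert one PDF into a text file in $cachedir, skipping the work
# when the cached text is at least as new as the PDF.
# Returns the path of the text file.
sub convert_pdf {
    my ( $pdf, $cachedir ) = @_;
    my $txt = File::Spec->catfile( $cachedir, basename($pdf) . '.txt' );
    return $txt if -e $txt && -M $txt <= -M $pdf;
    system( $PDFTOTEXT, $pdf, $txt ) == 0
      or die "pdftotext failed on $pdf (exit status $?)\n";
    return $txt;
}

# Return "file:line: text" for every line in @files matching $pattern.
# The pattern is a regular expression, as with grep.
sub search_text {
    my ( $pattern, @files ) = @_;
    my $re = qr/$pattern/;
    my @hits;
    for my $file (@files) {
        open( my $fh, '<', $file ) or die "cannot open $file: $!\n";
        while ( my $line = <$fh> ) {
            chomp $line;
            push @hits, "$file:$.: $line" if $line =~ $re;
        }
        close($fh);    # also resets $. for the next file
    }
    return @hits;
}

# Script mode: <pdfdir> <cachedir> <searchterm>
if ( @ARGV == 3 ) {
    my ( $pdfdir, $cachedir, $term ) = @ARGV;
    my @txt = map { convert_pdf( $_, $cachedir ) } bsd_glob("$pdfdir/*.pdf");
    print "$_\n" for search_text( $term, @txt );
}
else {
    warn "usage: $0 <pdfdir> <cachedir> <searchterm>\n";
}
```

The first run pays the conversion cost for every PDF; later runs only reconvert files whose PDF is newer than the cached text, so repeated searches are just plain file reads. Once the text files exist, plain grep -r on the cache directory would do the same job as search_text.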

-derby

Update: If I were going to do this for real and it needed to be available on the web, I would probably use Solr.
