Cappadonna3030 has asked for the wisdom of the Perl Monks concerning the following question:

Hello:
I am attempting to build a reverse indexed library using MySQL. It's based on the code from the CGI Programming book by O'Reilly.

The script is listed below:

 

#!/usr/bin/perl -w
use strict;
use DBI;
use File::Find;
use Fcntl;
use Getopt::Long;
use Text::English;

use constant TYPE_DEFAULT => "article";

my (%opts, @files, $stop_words, $type);

#User input
GetOptions( \%opts, "dir=s", "cache=s", "stop=s", "ignore", "type=s", "numbers", "stem");

die usage() unless $opts{dir} && -d $opts{dir};

$opts{'type'} ||= TYPE_DEFAULT;

#Get file names and build an array of files.
find(sub{push @files, $File::Find::name}, $opts{dir});
$stop_words = load_stopwords($opts{stop}) if $opts{stop};
process_files(\@files, \%opts, $stop_words);

sub load_stopwords {
    my $file = shift;
    my $words = {};
    local *INFO, $_ or die "Can't open stop file: $file\n" unless -e $file;
    open INFO, $file or die "$!\n";
    while(<INFO>) {
        next if /^#/;
        $words->{lc $1} = 1 if /(\S+)/;
    }
    close INFO;
    return $words;
}

sub process_files {
    #input variables:
    my($files, $opts, $stop_words) = @_;
    local( *FILE, $_ );
    local $/ = "\n\n";
    my $type = $opts{type};
    my $dir = $opts{dir};
    my %index;

    #Establish database variables:
    my($dbh, $sth1, $sth2);
    local(*FILE);
    local $/ = "\n\n";
    my $file_id = 0; # initializing counter variable

    #Establish Database Connection
    $dbh = ("DBI:mysql:host=localhost; database=blah",
            "blah", "blah",
            {PrintError => 0, RaiseError=>1});

    for ( my $file_id = 0; $file_id < @$files; $file_id++ ) {
        my $file = $files[$file_id];
        my %seen_in_file;

        next unless -T $file;
        #print STDERR "Indexing $file\n";
        #$index->{"!FILE_NAME:$file_id"} = $file;

        #Step 1: Create Library of Files:
        $sth1 = $dbh-> prepare("insert into library values ($file, $dir, $type)");
        $sth1-> execute();

        open FILE, $file or die "Cannot open file: $file!\n";

        while ( <FILE> ) {
            tr/A-Z/a-z/ if $opts{ignore};
            s/<.+?>//gs; # Note this doesn't handle < or > in comments or js

            while ( /([a-z\d]{2,})\b/gi ) {
                my $word = $1;
                next if $stop_words->{lc $word};
                next if $word =~ /^\d+$/ && not $opts{number};

                ( $word ) = Text::English::stem( $word ) if $opts{stem};

                $index->{$word} = ( exists $index->{$word} ?
                    "$index->{$word}:" : "" ) . "$file_id"
                    unless $seen_in_file{$word}++;
            }

            #New Flava: Take Contents out of hash Table and into DB
            foreach my $words (keys(%index)) {
                $sth2 = $dbh-> prepare('insert into catalog values ($words, $index{$words})');
                $sth2 -> execute();
            }
        }
    }

sub usage {
    my $usage = <<End_of_Usage;

Usage: $0 -dir directory [options]

The options are:

  -ignore   Case-insensitive index
  -stop     Path to common words file
  -stem     Stem words
  -type     File Type-- either email or article

End_of_Usage
    return $usage;
}

The main problem is that I keep getting these errors:

 

Useless use of a constant in void context at libbuilder.pl line 77.
Useless use of a constant in void context at libbuilder.pl line 77.
Useless use of a constant in void context at libbuilder.pl line 77.
Global symbol "$index" requires explicit package name at libbuilder.pl line 109.
Global symbol "$index" requires explicit package name at libbuilder.pl line 109.
Global symbol "$index" requires explicit package name at libbuilder.pl line 110.
Missing right curly or square bracket at libbuilder.pl line 147, at end of line
syntax error at libbuilder.pl line 147, at EOF
Execution of libbuilder.pl aborted due to compilation errors.

What gives? Where is my error?

Re: Finding local vs Global Error
by Zaxo (Archbishop) on Sep 20, 2005 at 03:07 UTC

    You declare a lexical hash, %index in process_files(), but when you use it you treat it as a scalar hash reference, $index. That's a new undeclared variable to perl.
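To make the distinction concrete, here is a minimal sketch of the two spellings side by side. The name `$index_ref` and the sample values are made up for illustration; either form works, as long as the declaration and every use agree.

```perl
use strict;
use warnings;

# Option 1: a real hash. Elements are written $index{...}, with no arrow.
my %index;
$index{perl}  = "0";
$index{perl} .= ":3";

# Option 2: a scalar holding a hash reference. Elements need the arrow.
my $index_ref = {};
$index_ref->{perl}  = "0";
$index_ref->{perl} .= ":3";

print "hash:     $index{perl}\n";
print "hash ref: $index_ref->{perl}\n";
```

Under strict, writing `$index->{perl}` after declaring only `my %index` refers to an undeclared scalar `$index`, which is what the "Global symbol" message is complaining about.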

    Your editor can probably find the missing bracket. You'll be helped in that if you fix your indentation to something consistent.

    I don't see the constant in void context right away, but you probably will if you check line numbers accurately.

    After Compline,
    Zaxo

Re: Finding local vs Global Error
by Tanktalus (Canon) on Sep 20, 2005 at 03:13 UTC

    Based on those error messages, I'd say at least some of your errors are on or near lines 77, 109, and 110. Another error occurred somewhere there, too, but that missing curly/square bracket error is a bit loose on its guess.

    Lines 75-77 of what was posted:

    $dbh = ("DBI:mysql:host=localhost; database=blah", "blah", "blah", {PrintError => 0, RaiseError=>1});
    I think you're missing the "DBI->connect" part before the opening parenthesis. As for lines 109/110, on line 112, you use a variable $index which was not local'd or my'd anywhere prior. I'm not sure whether this is supposed to be a new variable or what scope you want it - probably right before the while loop for reading the file.

    Finally, your missing close-brace is that your process_files sub doesn't close properly - not as many }'s as {'s. I'm guessing that if you add a brace right before the line that says "sub usage", that would go away, too.

    Given the line numbers I'm getting here vs your error messages, I'd wager that the code you posted is not precisely the code you tested. That's something I'd like to highly discourage you from doing - it's much harder to debug a virtual problem than a concrete one. And it's even harder to debug a virtual problem that's masquerading as a concrete one via some misleading code.

Re: Finding local vs Global Error
by graff (Chancellor) on Sep 20, 2005 at 03:32 UTC
    The three occurrences of this warning:
    Useless use of a constant in void context at libbuilder.pl line 77.
    stem from using incorrect syntax for opening the DBI connection to your database. At line 77 of your script, this:
    $dbh = ("DBI:mysql:host=localhost; database=blah", "blah", "blah", {PrintError => 0, RaiseError=>1});
    should be something like this:
    $dbh = DBI->connect( "DBI:mysql:host=localhost;database=blah", $user, $passwd, {PrintError=>0,RaiseError=>1});
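As for why the warning shows up exactly three times: without a function name in front, the parenthesized list is just Perl's comma operator in scalar context, which evaluates and throws away everything but the last item. A small sketch of that, with no database involved (the variable name is reused only to mirror the original line):

```perl
use strict;
use warnings;

# No DBI->connect here, so this is a plain scalar assignment. The three
# string constants are evaluated and discarded (one compile-time warning
# each on STDERR), and only the last item, the hash reference, is assigned.
my $dbh = ("DBI:mysql:host=localhost;database=blah", "blah", "blah",
           { PrintError => 0, RaiseError => 1 });

print "dbh is a plain ", ref($dbh), " reference, not a database handle\n";
```

So the script would have carried on with a plain hash reference in `$dbh` and failed later at the first `prepare` call.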
    The other messages all stem from these lines, I think:
    $index->{$word} = ( exists $index->{$word} ? "$index->{$word}:" : "" ) . "$file_id" unless $seen_in_file{$word}++;
    and there's a lot of trouble there. First, it's good you are using strict, because you declared "index" as a hash ("my %index") but in the lines just cited, you're using a scalar variable called "$index", and treating it as a reference to a hash. Make it a hash, or make it a scalar that you use as a hash-ref -- don't do it both ways.

    Another problem with those lines is the syntax. I think you want something like this:

    if ( ! $seen_in_file{$word} ) {
        $index{$word} .= ":" if ( exists( $index{$word} ));
        $index{$word} .= $file_id;
        $seen_in_file{$word}++;
    }
    Don't try using the ternary operator together with a post-conditional in the same statement -- it's too complicated, and you obviously did it wrong. So just keep it simple and clear.
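Here is a self-contained sketch of that update logic run over a few in-memory strings instead of real files. The `@docs` contents are made up, and splitting on whitespace stands in for the script's regex tokenizer; the point is only to show how the colon-separated list of file ids builds up.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for file contents; the array position is the file id.
my @docs = ("perl monks perl", "monks wisdom", "perl wisdom wisdom");

my %index;
for my $file_id (0 .. $#docs) {
    my %seen_in_file;                       # reset for every file
    for my $word (split ' ', $docs[$file_id]) {
        next if $seen_in_file{$word}++;     # record each word once per file
        $index{$word} .= ":" if exists $index{$word};
        $index{$word} .= $file_id;
    }
}

print "$_ => $index{$_}\n" for sort keys %index;
```

Each word ends up mapped to the ids of the files it appears in, for example `perl => 0:2`, with repeated words inside one file counted only once.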

    update: actually, the message about "missing right curly bracket" is because you really are missing a close-curly just before "sub usage" is declared. As mentioned in an earlier reply, you'll benefit from maintaining proper indentation as you edit your code. There are editors that make this relatively easy, once you learn how to use them.

Re: Finding local vs Global Error
by Cappadonna3030 (Sexton) on Sep 20, 2005 at 23:52 UTC
    Thanks, guys. I'm getting fewer errors. Here's the new code:

    #!/usr/bin/perl -w
    use strict;
    use DBI;
    use File::Find;
    use Fcntl;
    use Getopt::Long;
    use Text::English;

    use constant TYPE_DEFAULT =>'article';

    my (%opts, @files, $stop_words, $type);

    #User input
    GetOptions( \%opts, "dir=s", "stop=s", "ignore", "type=s", "numbers", "stem");

    die usage() unless $opts{dir} && -d $opts{dir};

    $opts{'type'} ||= TYPE_DEFAULT;

    #Get file names and build an array of files.
    find(sub{push @files, $File::Find::name}, $opts{dir});
    $stop_words = load_stopwords($opts{stop}) if $opts{stop};
    process_files(\@files, \%opts, $stop_words);

    sub load_stopwords {
        my $file = shift;
        my $words = {};
        local *INFO, $_ or die "Can't open stop file: $file\n" unless -e $file;
        open INFO, $file or die "$!\n";
        while(<INFO>) {
            next if /^#/;
            $words->{lc $1} = 1 if /(\S+)/;
        }
        close INFO;
        return $words;
    }

    sub process_files {
        #input variables:
        my($files, $opts, $stop_words) = @_;
        local( *FILE, $_ );
        local $/ = "\n\n";
        my $type = $opts{type};
        my $dir = $opts{dir};
        my %index;

        #Establish database variables:
        my($dbh, $sth1, $sth2);
        local(*FILE);
        local $/ = "\n\n";
        my $file_id = 0; # initializing counter variable

        #Establish Database Connection
        $dbh = DBI->connect( "DBI:mysql:host=localhost;database=members",
                             "gorillatrades", "kennyber",
                             {PrintError=>0,RaiseError=>1});

        for ( my $file_id = 0; $file_id < @$files; $file_id++ ) {
            my $file = $files[$file_id];
            my %seen_in_file;

            next unless -T $file;
            #print STDERR "Indexing $file\n";
            #$index->{"!FILE_NAME:$file_id"} = $file;

            #Step 1: Create Library of Files:
            $sth1 = $dbh-> prepare("insert into library values ($file, $dir, $type)");
            $sth1-> execute();

            open FILE, $file or die "Cannot open file: $file!\n";

            while ( <FILE> ) {
                tr/A-Z/a-z/ if $opts{ignore};
                s/<.+?>//gs; # Note this doesn't handle < or > in comments or js

                while ( /([a-z\d]{2,})\b/gi ) {
                    my $word = $1;
                    next if $stop_words->{lc $word};
                    next if $word =~ /^\d+$/ && not $opts{number};

                    ( $word ) = Text::English::stem( $word ) if $opts{stem};

                    if ( ! $seen_in_file{$word} ) {
                        $index{$word} .= ":" if ( exists( $index{$word} ));
                        $index{$word} .= $file_id;
                        $seen_in_file{$word}++;
                    }
                }

                #New Flava: Take Contents out of hash Table and into DB
                foreach my $words (keys(%index)) {
                    $sth2 = $dbh-> prepare('insert into catalog values ($words, $index{$words})');
                    $sth2 -> execute();
                }
            }
        }
    }

    sub usage {
        my $usage = <<End_of_Usage;

    Usage: $0 -dir directory [options]

    The options are:

      -dir       directory where files exist
      -ignore    Case-insensitive index
      -stop      Path to stopwords file
      -type      Type of file, either email or article
      -numbers   Include numbers in index
      -stem      Stem words

    End_of_Usage
        return $usage;
    }

    Now, I'm getting these errors:

    perl libbuilder.pl -dir www/members/ -type= email
    Option type requires an argument
    DBD::mysql::st execute failed: You have an error in your SQL syntax near ' article)' at line 2 at libbuilder.pl line 91.
    Issuing rollback() for database handle being DESTROY'd without explicit disconnect().
    I know the second error requires a proper call to destroy the db handler. What gives w/ the constant error?
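On the first message in that output: the command line has a space in `-type= email`, so the shell hands the script `-type=` and `email` as two separate arguments, and the option arrives with an empty value. Below is a small sketch of the two spellings that do attach the value, using `GetOptionsFromArray` from Getopt::Long. The `parse_type` helper is hypothetical and only mirrors the script's "type=s" spec.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse an argument list with the same "type=s" spec
# the script uses, and return whatever ended up in the type option.
sub parse_type {
    my @args = @_;
    my %opts;
    GetOptionsFromArray(\@args, \%opts, "dir=s", "type=s");
    return $opts{type};
}

print parse_type("-type=email"), "\n";      # value attached with "="
print parse_type("-type", "email"), "\n";   # value given as the next argument
```

Either `-type=email` or `-type email` should get rid of that first line of output; the SQL error is a separate problem.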
      $sth2 = $dbh->prepare( 'insert into catalog values ($words, $index{$words})' );
      Perl won't interpolate $words because it's inside single quotes. If $words is a string variable, even if it were interpolated, it wouldn't be proper SQL because string values need single quotes around them (in the SQL, not in the Perl). Easiest and best fix is placeholders:
      $sth2 = $dbh->prepare('INSERT INTO catalog VALUES(?,?)');
      $sth2->execute($words,$index{$words});
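A minimal sketch of the quoting point, using plain strings so it runs without a database. The sample word and index value are made up for illustration.

```perl
use strict;
use warnings;

my $words = "perl";
my %index = (perl => "0:3");

# Single quotes: the variable names are sent to the database literally.
my $single = 'insert into catalog values ($words, $index{$words})';

# Double quotes: the values are interpolated, but they land in the SQL
# without quotes around them, so the statement is still not valid.
my $double = "insert into catalog values ($words, $index{$words})";

print "$single\n";
print "$double\n";
```

With `?` placeholders the statement text stays fixed and the driver handles the quoting of each value, so neither problem comes up. The same reasoning would seem to apply to any other statement in the script that builds its SQL by interpolating variables.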
        That worked, but I still get this error:
        Option type requires an argument
        DBD::mysql::st execute failed: You have an error in your SQL syntax near ' 0)' at line 2 at libbuilder.pl line 91.
        Issuing rollback() for database handle being DESTROY'd without explicit disconnect().
        BTW, here is the structure of my table in MySQL:
        Database changed
        +----------+---------------------------------+------+-----+---------+-------+
        | Field    | Type                            | Null | Key | Default | Extra |
        +----------+---------------------------------+------+-----+---------+-------+
        | filename | varchar(60)                     |      | PRI |         |       |
        | filedir  | varchar(80)                     |      |     |         |       |
        | filetype | enum('article','email','other') |      |     | article |       |
        +----------+---------------------------------+------+-----+---------+-------+
        3 rows in set (0.00 sec)
        Just wondering, since I'm using enum in MySQL, can I represent my default variable as a 0? - Cappa
Re: Finding local vs Global Error
by Cappadonna3030 (Sexton) on Sep 21, 2005 at 02:33 UTC

    Hello: now my code shows... nothing. It's as if it never hits the functions.