in reply to Search Engine troubles

It probably never finds a title because it's looking for the upper-case word TITLE. Add an 'i' to the regex where searing for it.
To solve your other problem, look into HTML::Parser, especially its htext example.
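
To spell out the 'i' suggestion: a regex (regular expression) is the pattern between the delimiters, and a trailing i makes the match ignore case. Here is a minimal sketch, assuming the whole title sits on one line. The sample lines and the @titles array are made up for illustration and are not from your script:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines; real pages will vary.
my @lines = (
    '<title>Lower-case tag</title>',
    '<TITLE>Upper-case tag</TITLE>',
    '<p>No title here</p>',
);

my @titles;
for my $line (@lines) {
    # The trailing i makes the match case-insensitive, so <title>,
    # <TITLE> and <Title> all match. The parentheses capture the text
    # between the tags into $1.
    if ($line =~ m{<title>(.*?)</title>}i) {
        push @titles, $1;
    }
}

print "$_\n" for @titles;
```

Note that matching the tag alone is not enough to get the title text; you need the capturing parentheses and $1 for that. A title split over several lines will still be missed, which is one more reason to let HTML::Parser do the work.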


Update:

Oh well. I might as well post the code.

Update2:

Updated the code to actually work
#!/usr/bin/perl -w
use HTML::Parser;

# The following code deals with the form data
if ($ENV{'REQUEST_METHOD'} eq 'POST') {
    read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
    @pairs = split(/&/, $buffer);
    foreach $pair (@pairs) {
        ($name, $value) = split(/=/, $pair);
        $value =~ tr/+/ /;
        $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
        $FORM{$name} = $value;
    }
}
####### instead of the above code use the CGI module.

# HTML::Parser stuff. Stolen from the HTML::Parser package's htext example
my $keyword;
my $hit;
my $title;
my %inside;

# Called for every start tag (+1) and end tag (-1); counts nesting depth per tag
sub htmltag {
    my($tag, $num) = @_;
    $inside{$tag} += $num;
}

# Called for every piece of text between tags
sub htmltext {
    return if $inside{script} || $inside{style};
    $title = $_[0] if ($inside{title});
    # \Q...\E treats the keyword as literal text, not as a regex
    $hit = 1 if ($_[0] =~ /\Q$keyword\E/);
}

my $parser = HTML::Parser->new(
    'api_version' => 3,
    'handlers' => [start => [\&htmltag, "tagname, '+1'"],
                   end   => [\&htmltag, "tagname, '-1'"],
                   text  => [\&htmltext, "dtext"],
                  ],
    'marked_sections' => 1);

$keyword = $FORM{keyword};
print "Content-type: text/html\n\n";
print "<h2> Here are the files we found</h2>\n\n";

chdir("/usr/local/etc/httpd");
opendir(DIR, ".");
while ($file = readdir(DIR)) {
    next if ($file !~ /\.htm/);    # dot escaped so it matches a literal "."
    $hit = 0;
    $title = "";
    %inside = ();
    open(FILE, $file) || die "couldn't open $file: $!";
    # parse() expects a string of HTML; parse_file() reads from a filehandle
    $parser->parse_file(\*FILE);
    close(FILE);
    if ($hit) {
        $title = $file unless ($title);
        print "<A HREF=\"/$file\">$title</A><BR>";
        $listed = 1;
    }
}
closedir(DIR);
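
To show what the htmltag and htmltext handlers in the script above are doing, here is a rough sketch that feeds them events by hand instead of through HTML::Parser. The @events list and the @seen array are made up for illustration; in the real script the parser generates these calls while it reads the file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %inside;    # tag name => how many of that tag are currently open
my @seen;      # text that was not inside <script> or <style>

# Same idea as htmltag in the script: +1 on a start tag, -1 on an end tag.
sub htmltag {
    my ($tag, $num) = @_;
    $inside{$tag} += $num;
}

# Same idea as htmltext: skip text inside script/style, keep the rest.
sub htmltext {
    my ($text) = @_;
    return if $inside{script} || $inside{style};
    push @seen, $text;
}

# Hand-written stand-in for the events a parser would report for:
#   <title>Home</title><script>var x;</script><p>Hello</p>
my @events = (
    [ start => 'title'  ], [ text => 'Home'   ], [ end => 'title'  ],
    [ start => 'script' ], [ text => 'var x;' ], [ end => 'script' ],
    [ start => 'p'      ], [ text => 'Hello'  ], [ end => 'p'      ],
);

for my $event (@events) {
    my ($type, $value) = @$event;
    if    ($type eq 'start') { htmltag($value, +1) }
    elsif ($type eq 'end')   { htmltag($value, -1) }
    else                     { htmltext($value)    }
}

print join(', ', @seen), "\n";
```

This prints "Home, Hello": the text inside the script element is dropped because $inside{script} is 1 when it arrives and is back to 0 by the time "Hello" shows up.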


T I M T O W T D I

Re: Re: Search Engine troubles
by divinus (Acolyte) on Aug 21, 2001 at 23:14 UTC
    Thanks, I will get to work with the code you gave me and see if I can get it to work. I tried the advice about the title tag, but the search results gave me completely blank lines instead of the file names. Then again, I don't know what a regex is, nor do I know what searing is. haha. But this is what I tried:
    if (/<TITLE>/i)
    Is that what you had in mind? Thanks for responding. Divinus
      searing is just me not being able to spell ;) Of course it should have said searching, but you understood me fine, I can see ;)

      T I M T O W T D I
        Hmmm... OK, I understand the htmltext function and a good deal of the htmltag function, but I am lost on the my $parser and foreach sections. Any help or explanation of these would be good. Thanks for the update as well. My primary concern is to get this thing working, but I really do want to learn this stuff so I can answer questions instead of asking them. haha. Also, I actually didn't realize you had made a misspelling. I spent about 5 minutes trying to figure out what that was. haha. And I still don't know what a regex is, correct spelling or not. Thanks. Divinus