in reply to Re: Getting Words out of HTML :)
in thread Getting Words out of HTML :)
&text($weight, $text) just splits the text passed to it ($text) into words and puts each one into the database, incrementing its count by $weight. That function is working (for the most part). Here is the parser setup:

    HTML::Parser->new(
        api_version => 3,
        handlers => [
            start => [\&tag, "self,tagname,attr"],
            end   => [\&tag_end, "self,tagname,attr"],
            text  => [\&text, "'$WEIGHT',dtext"],
        ],
        marked_sections => 1,
    )->parse($DATA) || die "Huh $!\n";

Then my three subs (text is the one described above):

    sub tag {
        my $self = shift;
        my $tagname = shift;
        my $attr = shift;
        my $stuff;
        if($tagname eq "meta") {
            if($attr{'name'} eq ("keywords" || "description")) {
                $stuff = $attr{'content'};
                &text($WEIGHT, $stuff);
            }
        }
        elsif($tagname eq "title") {
            $WEIGHT = "2";
        }
    }

    sub tag_end {
        my $self = shift;
        my $tagname = shift;
        my $attr = shift;
        if($tagname eq "title") {
            $WEIGHT = "1";
        }
    }
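Three things in the code as posted look likely to cause the "for the most part" behaviour. `$attr{'name'}` reads the hash `%attr`, not the hash reference `$attr` that the handler receives. `("keywords" || "description")` always evaluates to `"keywords"`. And `'$WEIGHT'` inside a double-quoted argspec is interpolated once, when the parser is built, so later changes to `$WEIGHT` never reach the text handler. A minimal sketch of one way around these follows. It assumes an in-memory `%count` hash as a stand-in for the database, a simplified `text` sub, and an invented sample document, none of which come from the original post.

```perl
use strict;
use warnings;
use HTML::Parser;

my $WEIGHT = 1;   # current weight; raised while inside <title>
my %count;        # assumption: stand-in for the database (word => weighted count)

# Hypothetical stand-in for the poster's text(): split into words, add the weight.
sub text {
    my ($weight, $text) = @_;
    return unless defined $text;
    $count{lc $_} += $weight for $text =~ /(\w+)/g;
}

sub tag {
    my ($self, $tagname, $attr) = @_;
    if ($tagname eq 'meta') {
        # $attr is a hash reference, so dereference it with ->
        my $name = lc($attr->{name} // '');
        # Compare against each value separately
        text($WEIGHT, $attr->{content})
            if $name eq 'keywords' || $name eq 'description';
    }
    elsif ($tagname eq 'title') {
        $WEIGHT = 2;
    }
}

sub tag_end {
    my ($self, $tagname) = @_;
    $WEIGHT = 1 if $tagname eq 'title';
}

my $p = HTML::Parser->new(
    api_version => 3,
    handlers    => [
        start => [\&tag,     'self,tagname,attr'],
        end   => [\&tag_end, 'self,tagname'],
        # The closure reads $WEIGHT each time it is called,
        # rather than freezing its value into the argspec string.
        text  => [sub { text($WEIGHT, shift) }, 'dtext'],
    ],
);

# Invented sample input, for illustration only.
my $DATA = '<html><head><title>Perl Monks</title>'
         . '<meta name="description" content="perl community">'
         . '</head><body>perl</body></html>';

$p->parse($DATA);
$p->eof;   # flush any buffered text

print "$_ => $count{$_}\n" for sort keys %count;
```

The closure is only one option. Keeping the weight in a variable that the text handler reads at call time is the point, and the same effect could be had by looking at `$WEIGHT` inside `text` itself.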
RE: RE: Re: Getting Words out of HTML :)
by reyjrar (Hermit) on Aug 30, 2000 at 19:16 UTC