in reply to meta parsing problems

This isn't foolproof, but it'll probably do what you want...
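foreach my $attrs ( $content =~ m#<meta\s+(.*?)>#sgi ) {
    # list-context matches avoid the unreliable "my $x = ... if ..." idiom;
    # each variable is undef when its attribute is missing
    my ($name) = $attrs =~ m#\bname\s*=\s*["'](.*?)["']#si;
    my ($cont) = $attrs =~ m#\bcontent\s*=\s*["'](.*?)["']#si;
}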

time was, I could move my arms like a bird and...

Re^2: meta parsing problems
by Anonymous Monk on Oct 20, 2007 at 11:33 UTC
    Even though the thread's a bit old... There are problems with this approach (for example, it misses unquoted attribute values and breaks on a ">" inside a value). You should really consider using HTML::TreeBuilder; it's as easy as
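    use HTML::TreeBuilder;
    my $tree = HTML::TreeBuilder->new_from_content($data);    # parses and finalizes the tree
    my %kWords;
    for my $tag ( $tree->look_down( _tag => "meta" ) ) {
        my $name = $tag->attr("name");
        next unless defined $name;    # skip metas without a name, e.g. http-equiv
        $kWords{$name} = $tag->attr("content");
    }
    $tree->delete;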
    The above code takes care of spaces, line breaks, and so on. It's also fast and widely used. Just my 5 cents. FJ