in reply to Pretty cool link extractor.
Maybe i am missing something, but i think that the regex you use to 'remove all HTML tags' isn't working the way you think it should. Here is how i would do it:

    use strict;
    use warnings;
    use Data::Dumper;

    my @link;
    my @data = <DATA>;

    for (@data) {
        # capture the href value and the link text (first link on each line only)
        my ($url, $label) = $_ =~ /href\s*=\s*"([^"]+)"\s*>([^<]+)/;
        next unless $url and $label;
        push @link, [$url, $label];
    }

    print Dumper \@link;

    __DATA__
    <a href="http://foo.com">bar</a>
    <a href="index.html">index</a>

But i would NEVER use that in any serious code (it has its limitations - only one link per line). I would use a module. Now, why people think that writing code to bypass using a module (that has already been tested and used by many, many people around the world) is a 'good thing' eludes me. Is it because you don't have permission? Then please read A Guide to Installing Modules - there is no excuse.
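For what it's worth, here is a minimal sketch of the module route. It assumes HTML::TokeParser (part of the HTML-Parser distribution on CPAN) is installed; that choice of module and the sample string are assumptions for illustration, not something named in this thread. The sample deliberately puts both links on one line, the case where the per-line regex only catches the first.

```perl
use strict;
use warnings;
use Data::Dumper;
use HTML::TokeParser;   # assumption: installed from CPAN (HTML-Parser distribution)

# hypothetical sample input: two links on a single line
my $html = '<a href="http://foo.com">bar</a> <a href="index.html">index</a>';

# passing a scalar reference makes the parser read from the string
my $parser = HTML::TokeParser->new(\$html)
    or die "cannot create parser";

my @link;
while (my $tag = $parser->get_tag('a')) {
    my $url = $tag->[1]{href};          # attribute hash of the <a> start tag
    next unless defined $url;           # skip anchors without an href
    my $label = $parser->get_trimmed_text('/a');   # text up to the closing </a>
    push @link, [$url, $label];
}

print Dumper \@link;
```

The parser does the tokenizing, so attribute quoting style, extra whitespace, and several links on one line stop being the regex's problem.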
jeffa
L-LL-L--L-LL-L--L-LL-L-- -R--R-RR-R--R-RR-R--R-RR B--B--B--B--B--B--B--B-- H---H---H---H---H---H--- (the triplet paradiddle with high-hat)
Re: (jeffa) Re: Pretty cool link extractor.
by gav^ (Curate) on Mar 26, 2002 at 04:23 UTC