chuckd has asked for the wisdom of the Perl Monks concerning the following question:

I have a string with a bunch of href tags strung together.
I tried to pull them out individually into an array like this:
my @tags = $content =~ m%(<a href="http://dynamodata.*/a>)%g;
but it just puts them all into one string in $tags[0].
Can anyone help?

Original content restored above by GrandFather

I figured it out!

Re: reg exp question
by ikegami (Patriarch) on Jul 28, 2009 at 23:04 UTC
    my $tags = join '', $content =~ m%(<a href="http://dynamodata.*/a>)%g;
Re: reg exp question
by ww (Archbishop) on Jul 28, 2009 at 23:22 UTC
    split

    Customarily you'd want to use one of the modules designed to parse HTML, but if your data is exactly as stated (are you sure?), split may be a workable approach.
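
A minimal sketch of the split idea, assuming the string really contains nothing but anchor tags run together. The sample data and the dynamodata.example host are made up for illustration. The lookahead (?=...) splits just before each opening tag without consuming it:

```perl
use strict;
use warnings;

# Hypothetical sample data: three links strung together with nothing between them.
my $content = join '',
    '<a href="http://dynamodata.example/1">one</a>',
    '<a href="http://dynamodata.example/2">two</a>',
    '<a href="http://dynamodata.example/3">three</a>';

# Split at every position that is immediately followed by '<a href='.
# The lookahead is zero-width, so the tag text itself is kept in each piece.
my @tags = split /(?=<a href=)/, $content;

print "$_\n" for @tags;
```

If anything other than the tags appears in the string, the extra text ends up attached to the preceding piece, so this only suits data as clean as described.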

Re: reg exp question
by quester (Vicar) on Jul 29, 2009 at 09:35 UTC
    Since no one seems to have mentioned it so far...

    The basic problem here is greedy matching. The ".*" is greedy, so "m%(<a href="http://dynamodata.*/a>)%g" will match exactly once, with ".*" matching everything between the first occurrence of "<a href="http://dynamodata" and the last occurrence of "/a>". If, as in this case, you want to stop at the first following occurrence of "/a>", you can use the non-greedy form ".*?". So,
    @tags = $content =~ m%(<a href="http://dynamodata.*?/a>)%g;
    should work (not tested, since I'm not at the right machine at the moment....)
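
To see the difference between the two forms, here is a small sketch that runs both patterns over the same string. The sample data and the dynamodata.example host are hypothetical; only the two regexes come from the thread:

```perl
use strict;
use warnings;

# Hypothetical sample data: three links strung together on one line.
my $content = join '',
    '<a href="http://dynamodata.example/1">one</a>',
    '<a href="http://dynamodata.example/2">two</a>',
    '<a href="http://dynamodata.example/3">three</a>';

# Greedy: ".*" runs to the LAST "/a>" in the string.
my @greedy = $content =~ m%(<a href="http://dynamodata.*/a>)%g;

# Non-greedy: ".*?" stops at the FIRST "/a>" after each opening tag.
my @lazy = $content =~ m%(<a href="http://dynamodata.*?/a>)%g;

printf "greedy:     %d match(es)\n", scalar @greedy;
printf "non-greedy: %d match(es)\n", scalar @lazy;
print "  $_\n" for @lazy;
```

Note that "." does not match a newline by default, so if a tag can span several lines the /s modifier would also be needed.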
Re: reg exp question
by hnd (Scribe) on Jul 29, 2009 at 11:46 UTC
    Why not use HTML::TokeParser? It's better to use that, as it extracts the HTML strings exactly as you want. Just search for HTML::TokeParser on Google, and the first (or maybe second) link should be to the PerlMonks TokeParser tutorial.
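
A rough sketch of what that could look like, assuming HTML::TokeParser (from the HTML-Parser distribution on CPAN) is installed. The sample data and the dynamodata.example host are made up, and collecting href/text pairs is just one way to use the tokens:

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample data: three links strung together.
my $content = join '',
    '<a href="http://dynamodata.example/1">one</a>',
    '<a href="http://dynamodata.example/2">two</a>',
    '<a href="http://dynamodata.example/3">three</a>';

# Passing a scalar reference makes the parser read the string itself.
my $parser = HTML::TokeParser->new(\$content)
    or die "Cannot create parser";

my @links;
while (my $tag = $parser->get_tag('a')) {
    # For a start tag, element 1 is a hash ref of attributes.
    my $href = $tag->[1]{href};
    next unless defined $href && $href =~ m{^http://dynamodata};

    # Collect the link text up to the closing </a>.
    my $text = $parser->get_trimmed_text('/a');
    push @links, { href => $href, text => $text };
}

printf "%s => %s\n", $_->{href}, $_->{text} for @links;
```

Unlike the regex, this keeps working if the attributes are reordered or the tags span several lines.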
    "just keep it simple, stupid" :)
    =====================================================
    i'am worst at what do best and for this gift i fell blessed...
    i found it hard it's hard to find well whatever
    NEVERMIND