in reply to Question why this Regex isn't matching

So I ended up doing
my @pieces4 = split /<|>/, $pieces[4];
my $title = $pieces4[3];
but I am still curious as to why my match wasn't working. Thanks! TMTOWTDI!

I like computer programming because it's like Legos for the mind.

Re^2: Question why this Regex isn't matching
by AnomalousMonk (Archbishop) on Sep 30, 2011 at 17:46 UTC
    $pieces[4] =~ m/>(^<+)</;

    This regex wants a '>' character followed immediately by ^ (the hat metacharacter), i.e. the start of the string! That can never happen: once a '>' has been consumed, the regex is no longer at the start of the string. (Update: the  /m regex modifier alone doesn't help either. With /m, ^ matches only at the very start of the string or immediately after a newline, so the match would have to be with something like  / > \n ^ /xm .) Note that ^ means "not" only as the first character inside square brackets; inside parentheses it is still an anchor. Did you perhaps mean something like  m/>([^<]+)</ ?
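To make the difference concrete, here is a minimal sketch comparing the two patterns. The sample string is hypothetical, since the real contents of $pieces[4] are not shown in the thread.

```perl
use strict;
use warnings;

# Hypothetical sample; the real contents of $pieces[4] are not shown in the thread.
my $sample = '<td><a href="x.html">Some Title</a></td>';

# Original pattern: '>' followed by ^ (start of string) followed by one or more '<'.
# ^ cannot match once a '>' has been consumed, so this never matches.
my $matched_original = $sample =~ m/>(^<+)</ ? 1 : 0;

# Corrected pattern: [^<]+ is a negated character class,
# "one or more characters that are not '<'", captured into $1.
my ($title) = $sample =~ m/>([^<]+)</;

print "original matched: $matched_original\n";
print "title: ", (defined $title ? $title : '(no match)'), "\n";
```

With this sample the first pattern fails and the second captures "Some Title". The first '>' (after "td") is skipped because it is followed directly by '<', which [^<]+ refuses to match.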

      YES! I thought that the rules were the same for () as []. Thanks for clearing that up. And yes, the latter is what I want because I want to group and capture that part of the match into $1.

        There are far better ways to achieve your goal than using regexen.

        Parsing HTML is notoriously fraught with difficulties, all the more so when that HTML does not comply with well-known standards (HTML 4.01 Strict and 4.01 Transitional, a.k.a. "loose", in particular). Rolling your own parser flies in the face of the usual caution against re-inventing wheels.

        To minimize your problems, take a look at the various modules built for the job. A search of CPAN (or of ActiveState's repository with ppm, if you're on Windows and using AS's Perl) will turn up a wealth of well-tested, stable options.

        HTML::Parser, HTML::TableParser, and HTML::Extract are just a few of the many that may suit your needs.
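As a rough illustration of the module route, here is a minimal sketch that uses HTML::Parser (which must be installed from CPAN; it is not a core module) to collect the text inside <a> elements. The sample markup and the variable names are assumptions for the example, not the OP's actual data.

```perl
use strict;
use warnings;
use HTML::Parser ();    # CPAN module, not part of core Perl

# Hypothetical sample markup; the OP's real input is not shown in the thread.
my $html = '<td><a href="x.html">Some Title</a></td>';

my $in_link = 0;        # depth counter: are we currently inside an <a> element?
my @link_text;          # text chunks seen while inside <a>

my $parser = HTML::Parser->new(
    api_version => 3,
    # Handlers receive only the arguments named in the argspec string.
    start_h => [ sub { $in_link++ if $_[0] eq 'a' }, 'tagname' ],
    end_h   => [ sub { $in_link-- if $_[0] eq 'a' && $in_link }, 'tagname' ],
    text_h  => [ sub { push @link_text, $_[0] if $in_link }, 'dtext' ],
);

$parser->parse($html);
$parser->eof;           # signal end of document so buffered text is flushed

print join('', @link_text), "\n";
```

Text can arrive in more than one chunk, which is why the handler pushes onto an array and the pieces are joined at the end. The 'dtext' argspec also decodes entities such as &amp;, something the regex approach would leave to you.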