
$pieces[4] =~ m/>(^<+)</;

This regex wants a '>' character followed by ^ (the caret metacharacter), which anchors to the start of the string! That cannot occur once a '>' has already been matched. Not even the /m regex modifier, which lets ^ match at embedded newlines, would help (Update: with the /m switch, ^ matches only at the very start of the string or immediately after a newline, so the match would have to be with something like / > \n ^ /xm). Did you perhaps mean something like m/>([^<]+)</, where [^<] is a negated character class matching any character other than '<'?
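To make the difference concrete, here is a minimal sketch comparing the two patterns. The sample string in $cell is made up for illustration; the real contents of $pieces[4] are not shown in the thread.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for $pieces[4].
my $cell = '<td class="x">42.5</td>';

# Original pattern: '>' followed by the start-of-string anchor, then one
# or more literal '<'. The anchor cannot be satisfied after a '>' has
# already been consumed, so this never matches.
my $broken_matched = $cell =~ m/>(^<+)</ ? 1 : 0;

# Suggested pattern: '>' followed by one or more characters that are not
# '<', captured into $1, followed by '<'.
my $captured;
if ( $cell =~ m/>([^<]+)</ ) {
    $captured = $1;
}

print "broken pattern matched: $broken_matched\n";
print "captured: ", ( defined $captured ? $captured : '(no match)' ), "\n";
```

With this sample input the first pattern fails and the second captures 42.5 into $1.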

Re^3: Question why this Regex isn't matching
by OfficeLinebacker (Chaplain) on Sep 30, 2011 at 17:53 UTC
    YES! I thought that the rules were the same for () as []. Thanks for clearing that up. And yes, the latter is what I want because I want to group and capture that part of the match into $1.

    I like computer programming because it's like Legos for the mind.

      There are far better ways to achieve your goal than using regexen.

      Parsing HTML is notoriously fraught with difficulties, all the more so when that HTML is not compliant with well-known standards (HTML 4.01 Strict and 4.01 Transitional, a.k.a. "loose", in particular). That means rolling your own flies in the face of the caution against re-inventing wheels.

      To minimize your problems, take a look at the various modules built for the job. A search of CPAN (or of ActiveState's repository with ppm, if you're on Windows and using AS's Perl) will present a wealth of well-tested, stable options.

      HTML::Parser, HTML::TableParser, and HTML::Extract are just a few of the many that may suit your needs.
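
As a rough illustration of the module-based approach, the sketch below uses HTML::Parser's version 3 handler API to collect the text of each table cell. It assumes HTML::Parser is installed from CPAN (it is not part of core Perl), and the input string and the @cells and $in_td variables are invented for the example.

```perl
use strict;
use warnings;
use HTML::Parser ();    # CPAN module, assumed to be installed

# Hypothetical input; the real document in the thread is not shown.
my $html = '<table><tr><td>alpha</td><td>beta &amp; gamma</td></tr></table>';

my @cells;              # text content of each <td>, in document order
my $in_td = 0;          # true while the parser is inside a <td> element

my $parser = HTML::Parser->new(
    api_version => 3,
    # On <td>, start a new, empty cell.
    start_h => [ sub {
        my ($tag) = @_;
        if ( $tag eq 'td' ) { $in_td = 1; push @cells, '' }
    }, 'tagname' ],
    # On </td>, stop collecting text.
    end_h => [ sub {
        my ($tag) = @_;
        $in_td = 0 if $tag eq 'td';
    }, 'tagname' ],
    # 'dtext' passes text with entities decoded. Text may arrive in
    # several chunks, so append rather than assign.
    text_h => [ sub {
        my ($text) = @_;
        $cells[-1] .= $text if $in_td;
    }, 'dtext' ],
);

$parser->parse($html);
$parser->eof;

print "$_\n" for @cells;
```

Unlike the regex, this keeps working when a cell contains nested tags or entities, because the parser tracks element boundaries instead of guessing at them from '>' and '<'.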