While I entirely agree with davido's exhortation to use a proper parser for HTML, I will answer your question because it is (in this one instance) fairly trivial. The capture group never captures anything because you have used the asterisk as the quantifier after it. That allows zero or more instances, and because the non-greedy .+? before it stops as soon as the rest of the pattern can succeed, zero instances is what you get: the match succeeds but $1 is left undefined.
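To see the difference side by side, here is a small sketch. It assumes your original pattern differed from the one below only in the quantifier, and describe_match is just a helper name made up for the demonstration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = '<div id="roguebin-response-35911" class="bin-response"></div>';

# Hypothetical helper: reports the text of the overall match and
# whether the first capture group was set.
sub describe_match {
    my ($re, $text) = @_;
    return 'no match' unless $text =~ $re;

    # @- and @+ hold the start and end offsets of the whole match
    my $whole = substr $text, $-[0], $+[0] - $-[0];
    my $group = defined $1 ? "'$1'" : 'undef';
    return "matched '$whole', \$1 is $group";
}

# Asterisk: .+? takes a single character and the group matches zero times
print describe_match(qr/<div.+?(<\/div)*/, $line), "\n";

# Plus: .+? has to keep expanding until the group can match at least once
print describe_match(qr/<div.+?(<\/div)+/, $line), "\n";
```

With the asterisk the overall match is only '<div ' (the tag name plus one character), which is why the regex reports success while $1 stays undefined.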

Here's your code with a few small tweaks and the key change of using the plus as the quantifier:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $line = '<div id="roguebin-response-35911" class="bin-response"></div>';

    if ($line =~ /<div.+?(<\/div)+/) {
        print "line matched\n";
        if (defined $1) {
            print "right after match, 1 is defined\n";
        }
    }

Similarly, you don't really need a quantifier at all here, because there is only one closing div in the string and an unquantified element in a regex matches exactly once.
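A minimal sketch of the version without a quantifier follows; closing_div is a hypothetical helper name, used only so the result is easy to inspect.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = '<div id="roguebin-response-35911" class="bin-response"></div>';

# Hypothetical helper: returns the captured closing-tag text, or undef
# when the pattern does not match at all.
sub closing_div {
    my ($text) = @_;
    return $text =~ /<div.+?(<\/div)/ ? $1 : undef;
}

my $captured = closing_div($line);
print defined $captured ? "captured '$captured'\n" : "no match\n";
```

Note the trade-off: with no quantifier the group is no longer optional, so a string with no closing div fails to match altogether instead of matching with $1 undefined.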

I've used print instead of printf because you are not doing any format conversion. I've removed some unnecessary brackets and have used single quotes to delimit the initial string so the internal double quotes no longer need escaping (and you aren't interpolating in this string either).

But seriously, use a parser.
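
For comparison, here is a rough sketch of the parser route. It assumes the CPAN module HTML::TreeBuilder is installed (it is not in core, and it is only one of several parsers you could choose), and first_div_id is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: HTML::TreeBuilder has been installed from CPAN
use HTML::TreeBuilder;

my $line = '<div id="roguebin-response-35911" class="bin-response"></div>';

# Hypothetical helper: returns the id attribute of the first div in the
# fragment, or undef if there is no div.
sub first_div_id {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);
    my $div  = $tree->look_down(_tag => 'div');
    my $id   = $div ? $div->attr('id') : undef;
    $tree->delete;    # release the parse tree explicitly
    return $id;
}

print "div id: ", first_div_id($line) // '(none)', "\n";
```

The parser deals with attribute order, quoting and nesting for you, which is exactly what the regex approach has to guess at.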


🦛


In reply to Re: problem with optional capture group by hippo
in thread problem with optional capture group by Special_K
