Re: regex catch pattern that doesn't contain a pattern

To shed some light, tip #4 from the Basic debugging checklist (Data::Dumper):

use warnings;
use strict;
use Data::Dumper;

"<div></div>" =~ /(?<start>.*?)((?!\< *\/[\w\d\-]+\>).)*/;
print Dumper(\%+);

__END__

$VAR1 = {
          'start' => ''
        };

Tip #9: YAPE::Regex::Explain

----------------------------------------------------------------------
  .*?                      any character except \n (0 or more times
                           (matching the least amount possible))
----------------------------------------------------------------------

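Your regex is telling it that nothing (the empty string) is a valid match for `(?<start>.*?)`: the lazy `.*?` stops at zero characters because the rest of the pattern can already match from there. Have you considered using an HTML parser module from CPAN?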
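To see where the characters actually go, here is a minimal sketch that prints the whole match next to the two captures. It uses only core Perl; the variable names (`$html`, `$whole`, `$start`, `$last`) are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

my $html = '<div></div>';

# Same pattern as in the Data::Dumper example above.
my $matched = $html =~ /(?<start>.*?)((?!\< *\/[\w\d\-]+\>).)*/;

# @- and @+ hold the start and end offsets of the whole match.
my $whole = substr $html, $-[0], $+[0] - $-[0];
my $start = $+{start};   # named capture, also available as $1
my $last  = $2;          # a repeated group keeps only its last iteration

printf "whole match: '%s'\n", $whole;   # '<div>'
printf "start:       '%s'\n", $start;   # ''
printf "\$2:          '%s'\n", $last;    # '>'
```

So the match as a whole does consume `<div>`, but all of it is eaten by the second group, one character per iteration, and `start` is left empty.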

Re^2: regex catch pattern that doesn't contain a pattern by AnomalousMonk (Archbishop) on Apr 29, 2015 at 18:03 UTC
I agree that the OPed regex will match and capture the first empty string it finds (i.e., the one at the beginning of the string), but is YAPE::Regex::Explain at all valid for constructs such as `(?<NAME>pattern)` introduced with Perl version 5.10?
Give a man a fish: `<%-(-(-(-<`
Re^3: regex catch pattern that doesn't contain a pattern by toolic (Bishop) on Apr 29, 2015 at 18:08 UTC
"but is YAPE::Regex::Explain at all valid for constructs such as `(?<NAME>pattern)` introduced with Perl version 5.10?"
Nope. According to the POD (LIMITATIONS): "There is no support for regular expression syntax added after Perl version 5.6, particularly any constructs added in 5.10."
But it is valid for `.*?`
Re^4: regex catch pattern that doesn't contain a pattern by AnomalousMonk (Archbishop) on Apr 29, 2015 at 18:19 UTC
But the `.*?` is wrapped in a `(?<start>.*?)`. Is something like that explained reliably in all cases?
Re^2: regex catch pattern that doesn't contain a pattern by Anonymous Monk on Apr 29, 2015 at 22:42 UTC
There is always wxPPIxregexplain.pl / ppixregexplain.pl. And rxrx:

(?<start>    # The start of a named capturing block (also $1)
.*?          # Match any character (except newline), zero-or-more times (as few as possible)
)            # The end of the named capturing block
(            # The start of a capturing block ($2)
(?!          # Match negative lookahead
\< *         # Match a literal '<' character, zero-or-more times (as many as possible)
\/           # Match a literal '/' character
[\w\d\-]+    # Match any of the listed characters, one-or-more times (as many as possible)
\>           # Match a literal '>' character
)            # The end of negative lookahead
.            # Match any character (except newline)
)*           # The end of $2 (matching zero-or-more times (as many as possible))

Neither is perfect, but they're almost perfect :P