OK, I'll try my best at explaining the "unrolling the loop" mechanism (I don't have MRE at hand, so feel free to correct me if I am blatantly wrong!):

I use this technique in 2 cases:

In the first case here is how you want to match:

Now if you want to match a string with a multi-character end delimiter here is how to do it:

A potential pitfall is that you want to make sure you don't consume the characters just after the first character of the end delimiter, or things like **/ (the first character of the end delimiter is there twice in a row, once as a regular character and once as the start of the end delimiter) would not be processed properly.

I guess a couple of examples might be appropriate.

First matching a double-quoted string, double quotes can be escaped using \":

#!/bin/perl -w use strict; while( <DATA>) { next if(/^\s*#/); # skip comments in DATA chomp; # split data into string to match and expected result(s) my( $string, @expected)= split /\s*=>\s*/; while( $string=~ m{" # the start deli +miter ([^\\"]* # anything but t +he end of the string or the escape char (?:\\. # the escape + char preceeding an escaped char (any char) [^\\"]* # anything b +ut the end of the string or the escape char )*) # repeat "}gx) # the end delimi +ter { my $match= $1; my $expected= shift @expected; unless( $match eq $expected) { print "unexpected result line $.: found /$match/, expectin +g /$expected/\n"; } } } __DATA__ # string to match => expected results(s) toto "a string" tata => a string toto "a string" => a string toto "a string" tata => a string toto "a \" string" tata => a \" string toto "\" string" tata => \" string toto "a\"" tata => a\" toto "\"" tata => \" toto "\"\"" tata => \"\" toto "string 1" "string 2" tata => string 1 => string 2 toto "string 1 => toto "string 1" "string => string 1 toto "tata\\" tutu => tata\\ toto "tata\\\"" tutu => tata\\\"

And now how to match C-like comments:

#!/bin/perl -w use strict; while( <DATA>) { chomp; next if(^\s*#); # skip comments # split the data into the string to match and the expected result( +s) my( $string, @expected)= split /\s*=>\s*/; while( $string=~ m{/\* # the delimite +r ([^*]* # anything but + the beginning of the delimiter (?:\*(?!>/) # the beginn +ing of the delimiter, not preceeding the rest of the delimiter # (?!>/) +means "not before /, do not use the next char) [^/]* # anything b +ut the beginning of the delimiter )*) # repeat \*/}gx) # the end of t +he delimiter { my $match= $1; my $expected= shift @expected || ''; unless( $match eq $expected) { print "unexpected result line $.: found /$match/, expectin +g /$expected/\n"; } } } __DATA__ # string => result(s) toto => toto /*foo*/ tata => foo /*foo*/ tata => foo toto /*foo*/ => foo toto /*foo*bar*/ => foo*bar toto /*foo**/ => foo* toto /**/ => /***/ => * /*/*/ => /

In reply to Re: Unrolling the loop technique by mirod
in thread Unrolling the loop technique by Anonymous Monk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.