In this case m#/\*(.*?)\*/#s would work, but it might be slow (I think; I have not benchmarked it).

I would strongly prefer m#/\*(.*?)\*/#s over the unrolled loop version if that is the whole regex (and I'd try to make that the whole regex precisely because I could then avoid using the unrolled regex).
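As a rough illustration of the "whole regex" case, here is a minimal sketch. The sample string and the added /g flag (to collect every comment in list context) are my own assumptions, not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical sample source with two comments, one spanning lines.
my $source = "int x; /* first */ int y; /* second\n   spans lines */ int z;";

# The simple lazy pattern as the *entire* regex. /s lets . match newlines;
# /g in list context returns the captured body of every comment.
my @comments = $source =~ m#/\*(.*?)\*/#sg;

print scalar(@comments), " comments found\n";
print "[$_]\n" for @comments;
```

Because nothing follows the closing \*/ in the pattern, the regex has no reason to backtrack past the first terminator, so the lazy quantifier stops where you would expect.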

The real problem with this simple technique comes when you try to use it as part of a larger regex. For example, let's say you want to extract "comment blocks", that is, a C-style comment that starts at the beginning of a line and ends at the end of a line. Using m#^/\*(.*?)\*/$#msg sure seems an easy way, and it even works for a lot of cases. However, consider this unlikely sample input:

/* This is correctly matched */
/* This: */ runcode(); /* gets included in the "comment" */
which would return this list:
( " This is correctly matched ", ' This: */ runcode(); /* gets included in the "comment" ' )
You see that .*? matches as little as possible but will prefer to match more if matching more will allow the entire regex to match (or to match earlier) when less causes the regex as a whole to fail (or to match later).
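The behaviour described above can be reproduced with a short sketch. It assumes the sample input is two lines (the second line holding two comments), and the unrolled pattern shown is one common way to write the loop for C comments, chosen by me for illustration rather than taken from this thread.

```perl
use strict;
use warnings;

# Assumed layout of the sample: two lines, the second with two comments.
my $input = <<'END';
/* This is correctly matched */
/* This: */ runcode(); /* gets included in the "comment" */
END

# Lazy version: when the first */ on line 2 is not followed by end of line,
# .*? keeps extending until a */ that *is* followed by $.
my @lazy = $input =~ m#^/\*(.*?)\*/$#msg;

# Unrolled version: the body is "non-stars, then any number of
# (stars followed by a non-star, non-slash, then non-stars)",
# so it can never contain "*/" and cannot run past the first terminator.
my @unrolled = $input =~ m#^/\*([^*]*(?:\*+[^*/][^*]*)*)\*+/$#mg;

print "lazy:     [$_]\n" for @lazy;
print "unrolled: [$_]\n" for @unrolled;
```

With the unrolled body, line 2 simply fails to match as a "comment block", which is the result you actually wanted.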

If I find myself wanting to use the loop unrolling technique, then I usually try to rework the problem by parsing in smaller chunks. Though, if these chunks start getting really small (like my parser starts having to deal with single characters in lots of cases), then I may use some of the simplest examples of unrolled regex loops.
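One way to read "parsing in smaller chunks" is a \G-anchored scanner that consumes one small piece at a time. This is only a sketch under my own assumptions: the sample text is hypothetical, and it ignores string literals and other places where /* would not start a comment.

```perl
use strict;
use warnings;

# Hypothetical sample text.
my $source = qq{/* block */\nruncode(); /* trailing */\n};

my @comments;
pos($source) = 0;

# Each branch is a small regex anchored at the current position with \G.
# /c keeps pos() in place when a branch fails, so the next one can try.
while (pos($source) < length $source) {
    if ($source =~ m#\G/\*(.*?)\*/#gcs) {
        push @comments, $1;          # body of one complete comment
    }
    elsif ($source =~ m#\G[^/]+#gc) {
        # a run of text containing no slash: skip it
    }
    elsif ($source =~ m#\G/#gc) {
        # a lone slash that does not begin a comment: skip it
    }
}

print "[$_]\n" for @comments;
```

Since the comment pattern is the whole regex in its branch, the simple .*? form is safe here; any extra conditions (such as "starts and ends a line") can be checked in ordinary code afterwards.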

        - tye (but my friends call me "Tye")

In reply to (tye)Re: Unrolling the loop technique by tye
in thread Unrolling the loop technique by Anonymous Monk
