The CPAN module Text::Balanced addresses this class of problem to some degree. But it looks like you may be running into the harder problem of parsing Perl itself. The PPI module can help there, though even with it there are cases where parsing Perl is not as straightforward as one would expect. Regexes are generally not the appropriate tool for code parsing or balanced-text parsing: you end up working far too hard on a regex solution that still falls short.
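To give a feel for what Text::Balanced offers, here is a minimal sketch using its extract_delimited function to peel a double-quoted string off the front of a line. The helper name split_string_and_rest and the sample input are made up for illustration, and this covers only one narrow case (a line that starts with a quoted string), not Perl source in general.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_delimited);

# Hypothetical helper: split a line into its leading double-quoted string
# and whatever follows it. In list context extract_delimited returns
# (extracted, remainder, skipped prefix) and leaves the input unmodified.
sub split_string_and_rest {
    my ($line) = @_;
    my ($string, $rest) = extract_delimited($line, '"');
    return ($string, $rest);
}

# Made-up input: the string holds an escaped quote and a '#' that is NOT
# a comment; a real comment follows the closing quote.
my $line = q{"say \"hi\" # not a comment" # real comment};

my ($string, $rest) = split_string_and_rest($line);
print "string: $string\n";   # string: "say \"hi\" # not a comment"
print "rest:  $rest\n";      # rest:   # real comment
```

The backslash-escaped quotes and the embedded '#' are exactly the sort of edge cases that a hand-rolled regex tends to get wrong, and here the module handles them without any extra work on your part.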

tchrist gave an excellent write-up on StackOverflow explaining why it is possible, but usually inadvisable, to use regexes as the primary engine for parsing non-trivial input. He was talking about HTML, but the reply applies here as well; see "Oh Yes You Can Use Regexes to Parse HTML!". It all boils down to this: the work required to get a robust regex solution for this sort of thing will usually exceed the work of using a proper parsing tool. Learning these other tools may seem like a lot of effort, but it is usually less than what it takes to handle all of the edge cases properly using only regexes.


Dave


In reply to Re: Multiline string and one line comments by davido
in thread Multiline string and one line comments by AskandLearn
