I should say that I am not aware of any compiler that accepts nested comments, so this is probably not a big problem.
Still, I played around, and this seems to do the trick (just replace the while loop in my code above
with this one):
while ( $file =~ /\Q$start\E(.*?)\Q$end\E/sg )
{
    $a = $1;
    $match = $&;
    # look for more start tags in what we matched
    while ( $a =~ /\Q$start\E/sg )
    {
        # balance each extra start tag by consuming up to the next end tag
        $file =~ /.*?\Q$end\E/sg;
        $match .= $&;
    }
    print $match, "\n";
}
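One caveat: the loop counts extra start tags only in the first match, so a comment that opens another nested comment after the first end tag (for example `/* a /* b */ c /* d */ e */`) would be cut short. A depth counter is a more general way to balance the tags. Here is a minimal sketch of that idea, assuming the whole file is already in a string; `extract_nested` is a hypothetical name and is not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): returns every
# top-level comment in $text, with any nested comments kept inside it.
sub extract_nested {
    my ( $text, $start, $end ) = @_;
    my @found;
    my $depth = 0;
    my $begin;    # offset where the current top-level comment opens

    # Visit every start or end tag, left to right.
    while ( $text =~ /(\Q$start\E)|\Q$end\E/g ) {
        if ( defined $1 ) {
            # Start tag: remember where the outermost comment begins.
            $begin = $-[0] if $depth == 0;
            $depth++;
        }
        elsif ( $depth > 0 ) {
            # End tag: emit the comment once the outermost one closes.
            $depth--;
            push @found, substr( $text, $begin, $+[0] - $begin )
                if $depth == 0;
        }
        # An end tag seen at depth 0 is unbalanced and is skipped.
    }
    return @found;
}

my $sample = "x /* a /* b */ c /* d */ e */ y /* z */";
print "$_\n" for extract_nested( $sample, '/*', '*/' );
```

This trades the regex-only approach for explicit bookkeeping, and it silently drops a comment that is never closed, which may or may not be what you want.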
With the replacement while loop, your test file gave the output you wanted.
For further testing I used this test.txt:
blah blah
/* comment 1 */
blah blah
/* comment 2 */
blah blah
/* outer
/*
mid
/*
center
*/
mid
*/
outer
*/
And here are my results:
prompt$ regex.pl '/*' '*/'
/* comment 1 */
/* comment 2 */
/* outer
/*
mid
/*
center
*/
mid
*/
outer
*/
So enjoy this fanciful result.
I hope this helps.