Although counting curlies might succeed quite often, it will fail as soon as a string or character constant, or a comment, somewhere contains an unbalanced curly:
public void readExternal () {
... '}' /* this } is NOT the end as well */
}
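To make the failure concrete, here is a minimal sketch of the naive counter. The sample input and the variable names are made up for illustration, and the input is assumed to start right after the opening curly of the method body:

```perl
use strict;
use warnings;

# Hypothetical input: the text following the opening curly of a method body.
my $java = <<'JAVA';
 char c = '}'; /* this } is NOT the end as well */
 return;
}
int other;
JAVA

# Naive approach: count every curly, no matter where it appears.
my $depth = 1;    # one curly is already open
my $body  = '';
for my $ch ( split //, $java ) {
    $depth++ if $ch eq '{';
    $depth-- if $ch eq '}';
    $body .= $ch;
    last unless $depth;    # stops at the first curly that brings us to zero
}
print "naive body: [$body]\n";
```

The loop already stops at the curly inside the character constant, so the extracted "body" is cut off in the middle of the first statement.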
So it seems you can't get around a basic tokenizer that can differentiate code curlies from curlies in string constants and comments. At least Java syntax is not as hard to parse as Perl: the only places where a curly does not open or close a block are inside "" and '', after // and between /* and */. There's no qq, q, qr, qw or s///, m//, tr/// ...
A quick solution (demonstrating only basic functionality):
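use strict;
use warnings;
use Regexp::Common;
# slurp the input; it is assumed to start right after the opening curly
local $_ = join '',<>;
my $code = '';
my $curlies = 1; #one curly open
# consume everything that is neither a curly nor the start of a quote or comment
while( m# \G ( [^{}'"/]* ) #xg ){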
$code .= $1;
my $p = pos;
# '' + ""
if( m# \G ( $RE{quoted} ) #xg ){
$code .= $1;
next;
}
pos = $p;
# /* */ (Java comments do not nest, so stop at the first */)
if( m# \G ( /\* .*? \*/ ) #sxg ){
$code .= $1;
next;
}
pos = $p;
#//
if( m# \G ( // [^\n]* ) #xg ){
$code .= $1;
next;
}
pos = $p;
# {
if( m# \G \{ #xg ){
$code .= '{';
++ $curlies;
next;
}
pos = $p;
# }
if( m# \G \} #xg ){
$code .= '}';
-- $curlies;
last unless $curlies;
next;
}
pos = $p;
m# \G ( . ) #sxg or last;
$code .= $1;
}
print "CODE\n$code\n";
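The same idea can also be written as a single alternation without Regexp::Common. This is only a sketch under the same assumptions as above (the input starts right after the opening curly); the sub name extract_block and the sample input are made up, and it has not been run against real-world Java:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text up to and including the curly that
# closes the block assumed to be already open at the start of $src.
# Returns undef (in scalar context) if no closing curly is found.
sub extract_block {
    my ($src) = @_;
    my $depth = 1;    # one curly open
    my $code  = '';
    while (
        $src =~ m~ \G (
              " (?: [^"\\\n] | \\. )* "     # string constant
            | ' (?: [^'\\\n] | \\. )* '     # character constant
            | /\* .*? \*/                   # block comment (does not nest)
            | // [^\n]*                     # line comment
            | [{}]                          # a curly that really is code
            | [^{}'"/]+                     # plain code
            | .                             # a lone slash or an unterminated quote
        ) ~xsgc
      )
    {
        my $tok = $1;
        $code .= $tok;
        if    ( $tok eq '{' ) { $depth++ }
        elsif ( $tok eq '}' ) { return $code unless --$depth }
    }
    return;
}

my $java = <<'JAVA';
 char c = '}'; /* this } is NOT the end as well */
 if (x) { s = "}"; } // } neither
}
int other;
JAVA

my $block = extract_block($java);
print "CODE\n$block\n" if defined $block;
```

Every alternative consumes at least one character, so the \G loop always advances; the catch-all at the end keeps an unterminated quote from stalling it.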
--
http://fruiture.de