Larry Wall once said: "Only perl can parse Perl", which seems to be true.
I don't think there is a regex solution to this problem, and any solution needs a strict definition of what a statement is. (In
if (foo) { bar; }, some say "bar" is the only statement, which is what I believe, but others say the entire block, or the entire if-construct including the block, is the statement. Some call every expression a statement, but I don't like thinking of
$foo = $bar + 3 as 5 statements.)
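To illustrate why a simple pattern falls short, here is a minimal sketch that counts semicolons as if each one ended a statement. The snippet in $source and the variable names are made up for this example, and semicolon counting is only one naive approach, picked because it fails so visibly.

```perl
use strict;
use warnings;

# Hypothetical input: most people would count two statements here,
# but one semicolon sits inside a string literal.
my $source = q{my $s = "a;b"; print $s;};

# Naive approach: treat every semicolon as a statement terminator.
# The empty-list assignment forces list context so we get the match count.
my $naive_count = () = $source =~ /;/g;

print "naive count: $naive_count\n";
```

The count comes out one too high because of the quoted semicolon, and that is before heredocs, regex literals, or alternative quoting delimiters enter the picture.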
However, a great program called
Perltidy carefully tokenizes Perl source and reformats it into tidy output. Have a look at its code and you will see why Perl is so hard to parse.
2;0 juerd@ouranos:~$ perl -e'undef christmas'
Segmentation fault
2;139 juerd@ouranos:~$