Hello,
I'm working on a syntax conversion script that translates code from an uncommon OOP language to C++.
Right now the filter is working okay, but I have a couple of issues that I'm hoping someone here can help me with:
1) I am not able to successfully ignore comments
2) I am not able to successfully ignore multiline macros
Here's my general algorithm:
1) slurp in the source code file - into a single string
2) convert syntax
3) write out converted file
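As a rough illustration of those three steps, here is a minimal Perl sketch. The sub names (slurp_file, convert_syntax, write_file) and the file names in the usage comment are placeholders made up for this post, and convert_syntax only holds two class rules as a stand-in for the full set of filters.

```perl
use strict;
use warnings;

# Step 1: read the whole file into one string.
sub slurp_file {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    local $/;                 # undefined record separator: read everything at once
    my $text = <$in>;
    close $in;
    return $text;
}

# Step 2: apply the syntax filters (only two illustrative rules here).
sub convert_syntax {
    my ($text) = @_;
    $text =~ s/\bclass(\s+)(\w+)\s*;/class$1$2 {/g;   # "class foo;" -> "class foo {"
    $text =~ s/\bendclass\b/};/g;                      # "endclass"   -> "};"
    return $text;
}

# Step 3: write the converted text out.
sub write_file {
    my ($path, $text) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} $text;
    close $out or die "Cannot close $path: $!";
    return;
}

# Usage (placeholder file names):
# write_file('converted.cpp', convert_syntax(slurp_file('input.src')));
```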
Originally - I was pulling the file into an array - and then converting the file line-by-line. This worked pretty well - but had issues with coding styles, where one user would write something like this:
class myclass {
and another might write
class
myclass {
So - to get around that I decided to slurp the entire file into a string. This lets me search for the language's legal patterns without making any assumptions about newlines, which are allowed pretty much anywhere.
BUT! With my line-by-line style I could simply skip (using next) any lines that started with //, were between /* .. */, or contained a \ (presumed to be a multiline macro).
Now that everything is one long string I'm having trouble figuring out how to do this.
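For comparison, the old line-based skipping amounted to something like the sketch below. It is a reconstruction for illustration, not the exact original code: the sub name filterable_lines is made up, and it assumes block comments start at the beginning of a line. Note that the last line of a multiline macro has no trailing \ and so slips through, which is one more weakness of the line-based approach.

```perl
use strict;
use warnings;

# Returns only the lines that would be handed to the syntax filters.
sub filterable_lines {
    my @lines = @_;
    my @keep;
    my $in_block_comment = 0;
    for my $line (@lines) {
        if ($in_block_comment) {
            # Still inside /* ... */; leave the comment once */ is seen.
            $in_block_comment = 0 if $line =~ m{\*/};
            next;
        }
        next if $line =~ m{^\s*//};          # line comment
        if ($line =~ m{^\s*/\*}) {           # block comment opens on this line
            $in_block_comment = 1 unless $line =~ m{\*/};
            next;
        }
        next if $line =~ /\\/;               # presumed multiline macro line
        push @keep, $line;
    }
    return @keep;
}
```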
Some specific examples:
Example 1:
A class in my language looks like this:
class foo;
blah blah;
endclass
Which I convert to something like this:
class foo {
blah blah;
};
No problem there.
s/\bclass(\s+)(\w+)\s*;/class$1$2 {/g;
s/\bendclass\b/};/g;
Example 2:
A macro in my language looks like this:
`define mymacro (blah blah) \
blah \
blah blah \
blah
I need to convert it to:
#define mymacro (blah blah) \
blah \
blah blah \
blah
Problem: sometimes the macro contains code that triggers other filters.
Example:
`define myclassmacro (blah) \
class myclass``blah ... \
blah \
endclass
So I guess my simple questions are:
1) How do I write a regular expression that can ignore a line based on another regular expression?
2) I want to define a regexp for a multiline macro as:
starts with `define and ends at the first non-escaped newline. I tried:
my $multiline_preprocessor_macro = qr/^(.*?)(?!\\)\n/sm;
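One thing I notice about that attempt: (?!\\) is a lookahead, so it tests the character at the newline itself rather than the one before it; a lookbehind, (?<!\\), would be the usual way to say "not preceded by a backslash". A direction that might work for both questions is to split the string into protected pieces (macros, comments) and everything else, and run the filters only on the rest. The following is only a sketch under assumptions: all names ($macro, $block_comment, $line_comment, $protected, convert_unprotected) are made up, it does not handle // or /* inside string literals, and the only change made to a macro is `define to #define.

```perl
use strict;
use warnings;

# `define up to the first newline not preceded by a backslash.
# "\\." consumes a backslash plus the next character (newline included via /s).
my $macro         = qr/^[ \t]*`define\b(?:\\.|[^\\\n])*/ms;
my $block_comment = qr{/\*.*?\*/}s;
my $line_comment  = qr{//[^\n]*};
my $protected     = qr/$macro|$block_comment|$line_comment/;

sub convert_unprotected {
    my ($text) = @_;
    # With a capturing group, split also returns the separators, so
    # odd-numbered elements are protected pieces and even ones are code.
    my @parts = split /($protected)/, $text, -1;
    for my $i (0 .. $#parts) {
        if ($i % 2) {
            # Protected piece: only rewrite a leading `define.
            $parts[$i] =~ s/\A([ \t]*)`define\b/${1}#define/;
        }
        else {
            # Ordinary code: apply the normal filters.
            $parts[$i] =~ s/\bclass(\s+)(\w+)\s*;/class$1$2 {/g;
            $parts[$i] =~ s/\bendclass\b/};/g;
        }
    }
    return join '', @parts;
}
```

The regexes inside $protected use only non-capturing groups, since any extra capture group would add more elements to the list returned by split and throw off the odd/even bookkeeping.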
Thanks!
"chon"