While trying to emulate the C preprocessor I had major trouble handling C-style /* ... */ comments. Two issues cause particular grief: comments can span lines and, at least for some compilers, comments can be nested (and are in the code I need to handle).
An additional gotcha is that text that looks like a comment inside a string must be retained.
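To make both problems concrete, here is a minimal sketch (not part of the original code) of what a plain non-greedy substitution does. The sample input is hypothetical and exists only for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample: one nested comment and one comment-like string.
my $src = <<'SRC';
/* outer /* inner */ tail */
#define COMMENT "/* not a comment */"
SRC

# Naive approach: delete the shortest /* ... */ span, allowing it to cross lines.
(my $naive = $src) =~ s{/\*.*?\*/}{}gs;

print $naive;
```

The substitution stops at the first `*/`, so the nested comment leaves ` tail */` behind, and the contents of the string literal are deleted, leaving `""`.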
The code below parses an input string and generates an output string comprising the original text sans C-style comments. Note that it leaves C++ single-line comments in place, but they are easily dealt with in a second pass.
use strict;
use warnings;
use Parse::RecDescent;

my $decommendedText = '';

sub concat ($) {$decommendedText .= $_[0]; 1;}

my $decomment = <<'GRAMMAR';

file    : block(s)

block   : string {::concat ($item{string}); 1}
        | m{((?!/\*|"|').)+}s {::concat ($item[-1]); 1}
        | comment {::concat ($item{comment}); 1;}

string  : /"([^"]|\\")*"/
            {$return = $item[-1] . ($text =~ /^\n/ ? "\n" : ''); 1;}
        | /'([^']|\\')*'/
            {$return = $item[-1] . ($text =~ /^\n/ ? "\n" : ''); 1;}

comment : '/*' commentBlock '*/'
            {$return = $text =~ /^\n/ ? "\n" : ''; 1;}

commentBlock :
          m{((?! \*/ | /\* ).)*}sx comment m{((?! \*/ | /\* ).)*}sx
            {$return = "\n"; 1;}
        | m{((?! \*/ | /\* ).)+}sx {$return = ''; 1;}

GRAMMAR

my $parse = new Parse::RecDescent ($decomment);
my $input = <<'DATA';
#include "StdAfx.h" // Tail comment
#include "Utility\perftime.h"
#pragma hdrstop
/* Comment before MACRO */
/* Comment
/* and nested comment */
lines */
#define MACRO 10\
    + 3 // Multi line macro with comment
#define __DEBUG /* comment */ 1
#define STRING 'This is a string' /* comment */
#define COMMENT "/* comment in \"a\" string */"
// c++ comment line
/* Comment at start
for a number
of lines
*/
/* multi-line comment
/* nested */
block
*/
// cpp block
char PerfTimer::Buf[64];
DATA

$parse->file($input) or die "Parse failed\n";
print $decommendedText;
Prints:
#include "StdAfx.h"// Tail comment
#include "Utility\perftime.h"
#pragma hdrstop
#define MACRO 10\
    + 3 // Multi line macro with comment
#define __DEBUG 1
#define STRING 'This is a string'
#define COMMENT "/* comment in \"a\" string */"
// c++ comment line
// cpp block
char PerfTimer::Buf[64];
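The second pass mentioned above, removing the remaining C++ single-line comments, could look something like the sketch below. It is an assumption about how that pass might be written, not the code actually used: `strip_cpp_comments` is a hypothetical name, and the sample text is made up. The idea is to match string literals first so that a `//` inside a string is copied through untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: remove // comments, leaving quoted literals alone.
sub strip_cpp_comments {
    my ($text) = @_;

    $text =~ s{
        (   "(?:\\.|[^"\\])*"    # double-quoted string: captured and kept
        |   '(?:\\.|[^'\\])*'    # single-quoted literal: captured and kept
        )
        |   //[^\n]*             # comment up to (not including) the newline
    }{ defined $1 ? $1 : '' }gex;

    return $text;
}

# Made-up sample input for illustration.
my $sample = <<'SRC';
#include "StdAfx.h"// Tail comment
const char *url = "http://example.com"; // trailing comment
char slash = '/';
SRC

print strip_cpp_comments($sample);
```

Two limitations of this sketch: it does not treat a backslash at the end of a `//` comment as a line continuation, and whitespace that preceded a removed comment is left in place.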