Comments detected in strings

by astroboy (Chaplain)
on Oct 20, 2014 at 21:09 UTC

astroboy has asked for the wisdom of the Perl Monks concerning the following question:

Hi all - I'm trying to identify comments in code (so that ultimately I'll strip them out). I've tried using Regexp::Common::comment, but it appears to detect comments in strings when they should really be ignored. Am I using it wrong, or is it a limitation of the module?
use strict;
use Regexp::Common qw /comment/;

my @tests = (
    {
        language    => 'PL/SQL',
        description => 'PL/SQL Comment in String',
        code        => q{ declare j varchar2(2); begin j := '--'; end; }
    },
    {
        language    => 'PL/SQL',
        description => 'PL/SQL no comment',
        code        => q{ declare j varchar2(2); begin j := 'xx'; end; }
    },
    {
        language    => 'SQL',
        description => 'SQL Comment in String',
        code        => q{ select '--' from dual; }
    },
    {
        language    => 'SQL',
        description => 'SQL no comment',
        code        => q{ select 'xx' from dual; }
    },
    {
        language    => 'Perl',
        description => 'Perl Comment in String',
        code        => q{ my $j = '#'; }
    },
    {
        language    => 'Perl',
        description => 'Perl no comment',
        code        => q{ my $j = 'xx'; }
    }
);

foreach my $test (@tests) {
    print $test->{description}."\n";
    if ($RE{comment}{$test->{language}}->matches($test->{code})) {
        print "\tcontains comment\n";
    }
    else {
        print "\tno comment\n";
    }
}
which outputs:
PL/SQL Comment in String
	contains comment
PL/SQL no comment
	no comment
SQL Comment in String
	contains comment
SQL no comment
	no comment
Perl Comment in String
	contains comment
Perl no comment
	no comment

Replies are listed 'Best First'.
Re: Comments detected in strings (|)
by tye (Sage) on Oct 20, 2014 at 21:54 UTC

    To skip comment-like sequences in strings, you need to parse both strings and comments (and anything else that might contain something that looks like a string or looks like a comment). Most likely with something like:

    /$comment|$string/g
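
    As a rough sketch of that alternation idea, the snippet below strips SQL-style comments while leaving string literals alone. The patterns $string and $comment and the helper strip_sql_comments are hand-written assumptions for illustration, not the patterns Regexp::Common provides; they only cover single-quoted strings with '' escapes, -- line comments, and /* */ block comments.

```perl
use strict;
use warnings;

# Illustrative patterns only (assumptions, not taken from Regexp::Common).
my $string  = qr/'(?:[^']|'')*'/;          # single-quoted string, '' escapes a quote
my $comment = qr{--[^\n]*|/\*.*?\*/}s;     # -- line comment or /* block comment */

# Hypothetical helper: remove comments, keep strings untouched.
sub strip_sql_comments {
    my ($code) = @_;
    # At each position a string is tried first. If one matched, $1 is
    # defined and is put back unchanged; otherwise a comment matched and
    # is replaced with nothing.
    $code =~ s/($string)|$comment/defined($1) ? $1 : ''/ge;
    return $code;
}

print strip_sql_comments("select '--' from dual; -- trailing note\n");
```

    Because the string alternative consumes the whole literal, the '--' inside the quotes is never seen as the start of a comment.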

    I tend to end up parsing with code like:

    while( ! /\G$/gc ) {
        if( /\G$comment/gc ) {
            ...
        } elsif( /\G$string/gc ) {
            ...
        } elsif( /\G$neither/gc ) {
            ...
        }
    }
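
    One way the loop above might be filled in, as a minimal sketch for the PL/SQL and SQL cases from the question. The three patterns and the find_comments helper are assumptions made up for this example (they are not part of Regexp::Common), and real PL/SQL has more quoting forms than are handled here.

```perl
use strict;
use warnings;

# Illustrative patterns only (assumptions, not taken from Regexp::Common).
my $string  = qr/'(?:[^']|'')*'/;        # single-quoted string, '' escapes a quote
my $comment = qr{--[^\n]*|/\*.*?\*/}s;   # -- line comment or /* block comment */
my $neither = qr{[^'\-/]+|.}s;           # run of ordinary chars, or one stray ' - /

# Hypothetical helper: return the comments found outside of strings.
sub find_comments {
    my ($code) = @_;
    my @found;
    for ($code) {                        # alias $_ so the bare matches apply to $code
        while ( !/\G\z/gc ) {            # /c keeps pos() when a match fails
            if    ( /\G($comment)/gc ) { push @found, $1 }
            elsif ( /\G$string/gc )    { }   # string literal: skip it whole
            elsif ( /\G$neither/gc )   { }   # anything else: skip it
            else  { die "cannot tokenize at offset " . pos($_) }
        }
    }
    return @found;
}

for my $code ( q{select '--' from dual;}, "select 1 -- note\nfrom dual;" ) {
    my @comments = find_comments($code);
    printf "%s\n", @comments ? 'contains comment' : 'no comment';
}
```

    Each branch consumes at least one character, so the loop always advances, and the order of the branches decides what a quote or a dash means at the current position.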

    - tye        

      Thanks for your reply. So I guess that Regexp::Common::comment doesn't have this out of the box? The reason I looked at using it was that I was under the impression that its purpose was to stop us reinventing the wheel, and that it also handled edge cases, i.e. it just did things right. If all it's doing is recognising comment characters, does it serve any purpose? I'm not trying to be critical, I'm just trying to understand what it was hoping to achieve.

        Um, that's weird, did you see what the docs say?

        Looks to me like it does the job it promises to do ... it's not a language parser like PPI; it gives you parts you can use to build a language parser.
