Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I am new to Perl. I am writing some code to remove the comments from Ruby code. I have tried doing it with regular expressions and have handled some of the cases, but I am unable to handle quotes (single and double) in the code. I want to handle the following cases:
    #I want to remove this line.
    #"#This line too"
    puts "Hello, Ruby!" #Remove this comment alone.
    ",,,,,....'''"#Consider this content too.
    puts "############I don't want to remove this content bcoz it is in quotes#" #But remove this content.
    puts "This is for testing ##sdfsfsf" ###Remove this content.
    puts 'This is for testing ##testing' #I want to remove this content alone'
I have tried the following expressions:
    $_ =~ s/^#.*$//;
    $_ =~ s/([^"|'](#)+.*[^"|'])\s*#.*$/$1/;
    $_ =~ s/[^"|']#.*[^"|']$//;
Desired output:
    puts "Hello, Ruby!"
    ",,,,,....'''"
    puts "############I don't want to remove this content bcoz it is in quotes#"
    puts "This is for testing ##sdfsfsf"
    puts 'This is for testing ##testing'
Could anyone please give me the correct regular expression to fulfill my requirement? Without quotes I handled the cases easily, but I want to handle the quotes too. Thanks in advance...

Replies are listed 'Best First'.
Re: Regular expression..
by Anonymous Monk on Oct 29, 2010 at 07:55 UTC
Re: Regular expression..
by JavaFan (Canon) on Oct 29, 2010 at 09:10 UTC
    In general, to correctly delete comments from a language, you need to be able to fully or partially parse it. I seriously doubt you'll be able to remove comments from Ruby with a handful of simple regexes. You certainly will not be able to do this correctly by looking at it line by line.
      Yes, I parse the file line by line only, but I am unable to write the regular expression correctly. Can you give me the regular expression to solve my requirement? Assume that I parse the file line by line and want to remove the comments from the file.
        "Yes, I parse the file line by line only, but I am unable to write the regular expression correctly."
        Which part of "You certainly will not be able to do this correctly by looking at it line by line" do you fail to understand?
        "Can you give me the regular expression to solve my requirement?"
        Which part of "I seriously doubt you'll be able to remove comments from Ruby with a handful of simple regexes" do you fail to understand?
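To make the line-by-line objection concrete, here is a minimal sketch, assuming a Ruby double-quoted string that spans three lines (modelled on the multi-line sample posted further down the thread). It applies the first substitution from the question to each line in isolation. The array names are only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed input: one Ruby string literal spread over three lines.
my @ruby = (
    qq{print "this is a multiline quote.\n},
    qq{#right?\n},
    qq{";\n},
);

# The first substitution from the question, applied one line at a time.
# Each line is judged without knowing that a string is still open.
my @stripped = map { (my $line = $_) =~ s/^#.*$//; $line } @ruby;

print @stripped;
```

The second line comes back empty even though "#right?" is string content, not a comment. A per-line pass can only get this right if it carries the "inside a string" state from one line to the next, which is already a small parser.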
Re: Regular expression..
by raybies (Chaplain) on Oct 29, 2010 at 16:58 UTC
    Here's my crack at it (adapted from something I saw regarding CPP...). I don't know Ruby, but here's my best guess at what you were asking for: a comment runs from # to the end of the line. This also accounts for quotes that span multiple lines.
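    #!/usr/bin/perl
    $/ = undef;   # slurp mode: read the whole input in one go
    $_ = <>;
    # Keep whatever $1 captured (the quoted strings); replace # comments with nothing.
    s/([^#"']*(['][^']*['])|(["][^"]*["]))|([#][^\n]*)/defined $1 ? $1 : ""/gse;
    print;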
    Here's the input I tried.
    # comment.
    "this is a quote" # end of line comment.
    print "this is a multiline quite.
    #right?
    ";
    No comments at all. This is just general stuff. #this is a 'test too." '
    and the output was:
    "this is a quote"
    print "this is a multiline quote.
    # right?
    ";
    No comments at all. This is just general stuff.
    Whaddya think? (I hope I typed the code in right. I do this on a separate machine)...
      Isn't there something like qq() in Ruby? (Google suggests %W(), %() and %q{} and some more...) It would complicate the situation a fair bit.

        I was thinking that if Ruby had here-documents, like Perl does, that'd screw things up too... and from a quick wiki search, it appears Ruby does support here-docs.

        --Ray

        Update: of course, a regex with a # in it would also screw it up, which I imagine Ruby supports too. So yeah... not a perfect solution.
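For what it's worth, the same whole-file idea can be tightened a little. The following is only a sketch under stated assumptions, not a real Ruby parser: it knows '...' and "..." literals (with backslash escapes) and nothing else, so here-docs, %q{}/%W() literals, regex literals, =begin/=end blocks and quotes nested inside #{...} interpolation will all still fool it. The name strip_hash_comments is made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: remove '#' comments from Ruby source held in one string.
# Quoted literals are matched first and put back untouched, so a '#' inside
# them is never seen by the comment alternatives.
sub strip_hash_comments {
    my ($src) = @_;
    $src =~ s{
        ( " (?: [^"\\] | \\. )* "       # double-quoted literal, kept as-is
        | ' (?: [^'\\] | \\. )* '       # single-quoted literal, kept as-is
        )
      | ^ [ \t]* \# [^\n]* \n?          # comment-only line, dropped with its newline
      | [ \t]* \# [^\n]*                # trailing comment and the blanks before it
    }{ defined $1 ? $1 : '' }gmsex;
    return $src;
}

# The sample input from the original question.
my $sample = <<'RUBY';
#I want to remove this line.
#"#This line too"
puts "Hello, Ruby!" #Remove this comment alone.
",,,,,....'''"#Consider this content too.
puts "############I don't want to remove this content bcoz it is in quotes#" #But remove this content.
puts "This is for testing ##sdfsfsf" ###Remove this content.
puts 'This is for testing ##testing' #I want to remove this content alone'
RUBY

my $clean = strip_hash_comments($sample);
print $clean;
```

On the question's sample this prints the five lines listed as the desired output. The ordering of the alternatives is the design choice that matters: because a string literal is consumed in one piece from its opening quote, the comment branches are never tried at a position inside it, even when the literal spans several lines.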

Re: Regular expression..
by LanX (Saint) on Oct 30, 2010 at 00:54 UTC
    Grin! xD

    ... this reminds me of a guy who used perltidy for beautifying php-code.

    His biggest problem was the inline comments, and he used a regex to translate // to # and back... =)

    Cheers Rolf