Rahul205 has asked for the wisdom of the Perl Monks concerning the following question:

How can I remove the comment delimiters from comment lines using regular expressions, so that only the text is printed? Example of a comment line:
<!-- Hello world-->
Desired output: Hello world. Can anyone please help me with this?

Replies are listed 'Best First'.
Re: Perl pattern matching
by merlyn (Sage) on Apr 13, 2009 at 15:53 UTC
Re: Perl pattern matching
by CountZero (Bishop) on Apr 13, 2009 at 17:35 UTC
    Do not use regexes for this job. You will get bitten by the edge cases, such as multi-line comments or comments that themselves contain markup.

    One possible solution uses the XML::XPath module and XPath expressions to traverse the XML file.

    • / selects the document root
    • descendant-or-self:: selects all nodes in or below the root node (which was set using /)
    • The node test comment() is true for any comment node.
    /descendant-or-self::comment() therefore selects all comment nodes in or under the root node, i.e. all comment nodes in the whole file.
    use strict;
    use warnings;
    use XML::XPath;

    my $xml = q|<?xml version="1.0" ?>
    <xml><!-- A single line comment -->
    <class_list>
    <student>
    <name>Robert</name>
    <!-- A comment deeply inside the file -->
    <grade>A+</grade>
    </student>
    <!-- Here starts a multi line comment
    <student>
    <name>Lenard</name>
    <grade>A-</grade>
    </student>
    -->
    </class_list>
    </xml>
    |;

    my $xp = XML::XPath->new(xml => $xml);

    my $nodeset = $xp->find('/descendant-or-self::comment()');

    foreach my $node ($nodeset->get_nodelist) {
        print "FOUND\n", $node->getValue, "\n";
    }
    Output:
    FOUND
    A single line comment
    FOUND
    A comment deeply inside the file
    FOUND
    Here starts a multi line comment
    <student>
    <name>Lenard</name>
    <grade>A-</grade>
    </student>
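
Applied to the comment from the question, the same approach might look like the sketch below. It assumes the comment sits inside a well-formed XML document (a bare comment with no root element will not parse), so the `<doc>` wrapper is made up for illustration. The helper name `comment_texts` is hypothetical as well, and the whitespace trimming is an addition to get exactly "Hello world".

```perl
use strict;
use warnings;
use XML::XPath;

# Hypothetical helper: returns the text of every comment in $xml,
# with leading and trailing whitespace trimmed.
sub comment_texts {
    my ($xml) = @_;
    my $xp = XML::XPath->new(xml => $xml);
    my @texts;
    foreach my $node ($xp->find('/descendant-or-self::comment()')->get_nodelist) {
        my $text = $node->getValue;    # same call as in the example above
        $text =~ s/^\s+|\s+$//g;       # trim surrounding whitespace
        push @texts, $text;
    }
    return @texts;
}

# Assumed input: the comment from the question, wrapped in a root element
my $xml = '<doc><!-- Hello world--></doc>';
print "$_\n" for comment_texts($xml);
```

The parser hands back the comment text without the `<!--` and `-->` delimiters, so no regex has to recognise them; the only substitution left is the whitespace trim.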

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James