rahulgsp83 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am trying to match the following lines in an XML file:
<item name="pdf_link"> <value>http://www.yahoo.com</value> </item>
with this regex:
if($line=~/(.*)(<item name=\"pdf_link\">\s*<value>)(.*)(<\/value>)(.*)/i)
Control flow reaches that line, but the regex does not match. I need the value between the <value> tags. Ideally, print $3 should give me that value, but it prints a blank. Looking forward to some guidance from the experts. Regards, Rahul

Replies are listed 'Best First'.
Re: regex matching issue (XML parse)
by toolic (Bishop) on Mar 09, 2010 at 14:39 UTC
    Using regular expressions to parse XML is more trouble than it's worth (see XML parsing vs Regular expressions). Consider using a parser such as XML::Twig:
    use strict;
    use warnings;
    use XML::Twig;

    my $xmlStr = <<XML;
    <item name="pdf_link">
      <value>http://www.yahoo.com</value>
    </item>
    XML

    my $twig = XML::Twig->new( twig_handlers => { item => \&item } );
    $twig->parse($xmlStr);

    sub item {
        my ($twig, $item) = @_;
        if ($item->att('name') eq 'pdf_link') {
            print $item->first_child('value')->text(), "\n";
        }
    }

    __END__
    http://www.yahoo.com
Re: regex matching issue
by LanX (Saint) on Mar 09, 2010 at 14:36 UTC
    Do I understand correctly that the if is true but $3 doesn't get printed?

    Without knowing the whole file I would recommend changing (.*) to (.+?) meaning "match at least one character but non-greedy"!
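To make the greedy versus non-greedy difference concrete, here is a minimal sketch. The input string is a hypothetical one (two <value> elements on a single line), not the original poster's data, and the variable names are illustrative only.

```perl
use strict;
use warnings;

# Hypothetical input: two <value> elements on ONE line, chosen only to
# make the difference visible. This is not the data from the question.
my $line = '<value>first</value><value>second</value>';

# Greedy: (.*) takes as much as it can, so it runs up to the LAST </value>.
my ($greedy) = $line =~ m{<value>(.*)</value>};

# Non-greedy: (.+?) needs at least one character and stops at the FIRST </value>.
my ($lazy) = $line =~ m{<value>(.+?)</value>};

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

With a single <value> element per line, as in the question, both forms capture the same text, so this change alone may not explain an empty $3.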

    In general you should consider using one of the XML modules on CPAN...

    Cheers Rolf

Re: regex matching issue
by BioLion (Curate) on Mar 09, 2010 at 14:51 UTC

    Works for me:

    my $line = <<'END';
    <item name="pdf_link">
     <value>http://www.yahoo.com</value>
    </item>
    END

    if ($line =~ m/(.*)(\<item name\=\"pdf_link\"\>\s*\<value\>)(.*)(\<\/value\>)(.*)/i) {
        print "3: $3\n";
    }
    else {
        print "No match\n";
    }

    __END__
    3: http://www.yahoo.com

    But as people have already pointed out, an XML parser is probably a more reliable bet!

    Just a something something...
Re: regex matching issue
by Ratazong (Monsignor) on Mar 09, 2010 at 14:41 UTC

    Just a guess: you call your variable $line, but you have multi-line input. Have you double-checked that the whole text is in that variable?
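A small sketch of why this matters, assuming the XML from the question is spread over three lines (the exact layout is an assumption). Tested one line at a time, no single line contains both the <item ...> tag and the <value> tag, so the pattern cannot match. Tested against the whole text, \s* is able to cross the newline between the two tags.

```perl
use strict;
use warnings;

# Sample data from the question, laid out over three lines (assumed layout).
my $xml = qq{<item name="pdf_link">\n <value>http://www.yahoo.com</value>\n</item>\n};

# Simplified form of the pattern from the question, with one capture group.
my $re = qr{<item name="pdf_link">\s*<value>(.*)</value>}i;

# Line by line: count how many individual lines match the pattern.
my $hits_per_line = grep { $_ =~ $re } split /^/m, $xml;

# Whole text in one variable: \s* matches the newline and the indentation.
my ($url) = $xml =~ $re;

print "lines that match on their own: $hits_per_line\n";
print "match against the whole text:  $url\n";
```

If the file is read with a while (<$fh>) loop, each iteration sees only one line, which corresponds to the first case above.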

    HTH, Rata
      Thanks for all the replies. You are right, I have multi-line input, and I checked that all the data is in that variable. Is there any special quantifier we need to take care of for multi-line input? Regards, Rahul
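On the multi-line question: it is a modifier rather than a quantifier that matters. By default "." does not match a newline, and the /s modifier changes that (/m is different: it only changes where ^ and $ match). The sketch below uses a hypothetical layout in which the URL sits on its own line between the <value> tags. That layout is an assumption made to show the effect, not something taken from the original file.

```perl
use strict;
use warnings;

# Hypothetical layout: the value is on its own line between the tags.
my $xml = qq{<item name="pdf_link">\n<value>\nhttp://www.yahoo.com\n</value>\n</item>\n};

# Without /s, "." never matches "\n", so (.*?) cannot reach </value>.
my $plain_matches = $xml =~ m{<value>(.*?)</value>} ? 1 : 0;

# With /s, "." also matches "\n", so the capture can span lines.
my ($raw) = $xml =~ m{<value>(.*?)</value>}s;

# Strip the surrounding whitespace that came along with the capture.
(my $url = $raw) =~ s/^\s+|\s+$//g;

print "matches without /s: $plain_matches\n";
print "captured with /s:   $url\n";
```

In the regex from the question, the gap between the <item ...> and <value> tags is already covered by \s*, which does match newlines, so /s is only needed when the captured text itself spans more than one line.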
Re: regex matching issue
by FunkyMonk (Bishop) on Mar 09, 2010 at 14:53 UTC
    Works for me!

    use Data::Dump 'pp';

    my $line = q{<item name="pdf_link">
     <value>http://www.yahoo.com</value>
    </item>};

    pp $line=~/(.*)(<item name=\"pdf_link\">\s*<value>)(.*)(<\/value>)(.*)/i;

    __END__
    (
      "",
      "<item name=\"pdf_link\">\n <value>",
      "http://www.yahoo.com",
      "</value>",
      "",
    )