huiling has asked for the wisdom of the Perl Monks concerning the following question:

#!/usr/bin/perl -w
use XML::DOM;
use strict;

my ($file, @rubbish) = @ARGV;
if ($#ARGV > 0) {
    die;
}
else {
    my $parser = new XML::DOM::Parser;
    my $doc    = $parser->parsefile($file);
    my $root   = $doc->getDocumentElement();
    scanner($root);

    sub scanner {
        my ($rt) = @_;
        my @list;
        foreach my $track ($rt->getElementsByTagName('track')) {
            foreach my $el ($track->getChildNodes()) {
                foreach my $attr ($el->getChildNodes()) {
                    if ($attr->getNodeType == ELEMENT_NODE) {
                        if ($attr->getAttribute('name') eq "type") {
                            foreach my $type ($attr->getChildNodes()) {
                                push(@list, $type->getData());
                            }
                        }
                    }
                }
            }
        }
        print "@list";
    }
}
I get
description/assertion command etc,
as output. How do I get rid of the lines? (i.e. I want "description/assertion command", etc.) My file looks something like this:
<track name="linguist.segments" type="primary">
- <el index="0" start="0.44" end="0.56">
    <attribute name="semantics">abstract</attribute>
    <attribute name="type">description/assertion</attribute>
  </el>
- <el index="1" start="0.56" end="0.76">
    <attribute name="semantics">abstract</attribute>
    <attribute name="type">command</attribute>
  </el>
</track>
Thanks so much!

Hi, thanks to all who replied. I opened the source of my file and realised it's different from what the XML editor shows:
<track name="linguist.segments" type="primary">
- <el index="0" start="0.44" end="0.56">
    <attribute name="semantics"> abstract </attribute>
    <attribute name="type"> description/assertion </attribute>
  </el>
- <el index="1" start="0.56" end="0.76">
    <attribute name="semantics"> abstract </attribute>
    <attribute name="type"> command </attribute>
  </el>
</track>
I used $temp =~ s/\s//g; in the end, and it works now. I feel like such an idiot now. =)
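One caveat about s/\s//g, shown as a minimal sketch: it deletes every whitespace character, including any inside the value, whereas anchoring the pattern only trims the ends. The sample string below is hypothetical (it is not from the posted file) and was picked because it contains an internal space.

```perl
use strict;
use warnings;

# Hypothetical value: padded with whitespace and containing an internal space.
my $raw = "\n   yes/no question \n";

# Delete every whitespace character, as in the fix above.
(my $stripped = $raw) =~ s/\s//g;

# Remove only leading and trailing whitespace.
(my $trimmed = $raw) =~ s/^\s+|\s+$//g;

print "[$stripped]\n";   # [yes/noquestion]
print "[$trimmed]\n";    # [yes/no question]
```

For single-word values such as "command" the two give the same result, so the difference only matters once a value contains a space.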

Replies are listed 'Best First'.
Re: why spaces! XML::DOM
by pjotrik (Friar) on Jul 10, 2008 at 09:48 UTC

    Is this really the code that doesn't work? It does exactly what you want on my computer (it prints "description/assertion command").

    The problem might arise from the fact that getChildNodes returns not only element nodes but also the text nodes that hold the surrounding whitespace. But as I said, for the code and file you pasted, it does work.
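The point about getChildNodes can be checked with a small sketch. It assumes XML::DOM is installed and uses a hypothetical inline sample rather than the poster's file; the line breaks inside the elements are deliberate. It records the kind of each child of the el element and trims the text of the element children.

```perl
use strict;
use warnings;
use XML::DOM;

# Hypothetical inline sample, parsed from a string instead of a file.
my $xml = <<'XML';
<track name="linguist.segments" type="primary">
  <el index="0" start="0.44" end="0.56">
    <attribute name="type">
      description/assertion
    </attribute>
  </el>
</track>
XML

my $doc  = XML::DOM::Parser->new->parse($xml);
my ($el) = $doc->getElementsByTagName('el');

my (@kinds, @values);
for my $node ($el->getChildNodes) {
    if ($node->getNodeType == ELEMENT_NODE) {
        push @kinds, 'element';
        # The element's text child still carries the surrounding whitespace.
        my $text = $node->getFirstChild->getData;
        $text =~ s/^\s+|\s+$//g;
        push @values, $text;
    }
    elsif ($node->getNodeType == TEXT_NODE) {
        # Whitespace between tags shows up as a text node of its own.
        push @kinds, 'text';
    }
}

print "@kinds\n";
print "@values\n";
$doc->dispose;
```

The whitespace-only text nodes between tags are skipped by the ELEMENT_NODE check, but whitespace inside an element stays part of its data until it is trimmed.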

Re: why spaces! XML::DOM
by Lawliet (Curate) on Jul 10, 2008 at 09:57 UTC

    It seems as if each element in the array @list ends with two newline characters. You can remove them with a loop:

    foreach my $element (@list) {
        $element = substr $element, 0, -4;
    }
    # You will still need to add the space though.
    # This solution is not that plausible.

    Also, I do not think the else {} part of your code is needed. If the condition is met, the program will die, ergo no need for an else.
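A minimal sketch of the same guard without the else branch. The helper name file_from_args and the usage message are hypothetical, and the check is slightly stricter than the original: it requires exactly one argument, whereas the posted code only rejects more than one.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the single expected filename, or dies.
sub file_from_args {
    my @args = @_;
    die "usage: $0 file.xml\n" if @args != 1;
    return $args[0];    # only reached when the guard did not fire
}

my $file = file_from_args('sample.xml');
print "file: $file\n";
```

In the real script the call would be file_from_args(@ARGV), and the rest of the program can follow at the top level with no extra nesting.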

    Update: Just realized this could also be done with a substitute function:

    $element =~ s/\n\n/ /;
    <(^.^-<) <(-^.^<) <(-^.^-)> (>^.^-)> (>-^.^)>

      Oh come on, that's not very pretty... s/^\s+// and s/\s+$// would do nicely.

      But I still don't think that's the problem here.

        Yeah, I read your response after I posted mine. (And shouldn't it be s/\s+$/ / ? Notice the space.)

        <(^.^-<) <(-^.^<) <(-^.^-)> (>^.^-)> (>-^.^)>
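On the trailing-space question, a sketch assuming the values are printed with "@list" as in the original script: interpolating an array in a double-quoted string joins its elements with $", which is a single space by default, so trimming both ends completely is enough. The padded sample values are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical values, padded the way the whitespace-laden file would yield them.
my @list = ("\n description/assertion \n", "\n command \n");

# Trim each element in place; $item aliases the array element.
for my $item (@list) {
    $item =~ s/^\s+//;
    $item =~ s/\s+$//;
}

# Interpolation joins the elements with $" (a single space by default).
my $line = "@list";
print "$line\n";    # description/assertion command
```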