in reply to Another problem with XML parser
Do you have multiple elements with a substring of "Header"? You could add anchors to the element match: $element =~ /^Header$/i.
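To see what the anchors change, here is a small sketch. The element names in the list are made up for illustration and are not taken from your XML.

```perl
use strict;
use warnings;

# Hypothetical element names, only for illustration.
my @elements = qw(Header SubHeader HeaderInfo header);

# Unanchored: matches any name that merely contains "header".
my @loose = grep { /Header/i } @elements;

# Anchored: matches only the name "Header" itself (in any case).
my @strict = grep { /^Header$/i } @elements;

print "unanchored: @loose\n";
print "anchored:   @strict\n";
```

The unanchored pattern accepts all four names, while the anchored one accepts only "Header" and "header".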
Can you have a "Number" element that resides outside of a "Header" element? That could be why you see double numbers. Try adding a flag so that the "Char" routine only checks for "Number" while inside a "Header".
Can you have nested "Header" elements?
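If nesting is possible, a plain 0/1 flag gets cleared by the first inner end tag, while you are still inside the outer "Header". A depth counter is one way around that. This is only a sketch: the handlers are called by hand rather than by XML::Parser, and $header_depth and in_header are names I made up.

```perl
use strict;
use warnings;

my $header_depth = 0;    # hypothetical counter replacing a 0/1 flag

sub handle_start {
    my ($pkg, $element, %attr) = @_;
    $header_depth++ if $element =~ /^Header$/i;
}

sub handle_end {
    my ($pkg, $element) = @_;
    # Never let the counter drop below zero.
    $header_depth-- if $element =~ /^Header$/i && $header_depth > 0;
}

sub in_header { return $header_depth > 0 }

# Simulate <Header><Header></Header></Header> by hand.
handle_start(undef, 'Header');
handle_start(undef, 'Header');
handle_end(undef, 'Header');
print "still inside the outer Header\n" if in_header();
handle_end(undef, 'Header');
print "outside all Headers\n" unless in_header();
```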
You are opening your output file in one subroutine, and then writing to it and closing it in another. What is the purpose of splitting this up? I would keep the open/write/close together.
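One way to keep them together is to have the handlers only collect values, then write everything out in a single place once parsing is done. A minimal sketch, assuming that is acceptable for your data; write_numbers, the file name numbers_demo.txt and the sample values are all made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: open, write and close in one place.
sub write_numbers {
    my ($file, $separator, @numbers) = @_;
    open(my $out, '>', $file) or die "Cannot open $file: $!";
    print {$out} "$_$separator\n" for @numbers;
    close($out) or die "Cannot close $file: $!";
    return scalar @numbers;
}

# Made-up values standing in for what the handlers would collect.
my @numbers = (101, 102, 103);
my $count = write_numbers("numbers_demo.txt", ",", @numbers);
print "wrote $count lines\n";
```

This also means the file is opened once per run rather than once per element.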
And, purely for entertainment purposes, here is my version of your code, with most of the above ideas, reformatted a bit while I was trying to understand it. It compiles but is untested.
#!/usr/bin/perl -w
use strict;
use warnings;
use diagnostics;
use XML::Parser;

# Globals shared by the handlers. They are declared and initialized
# before the parse loop so they are already set when the handlers run.
my $current_element;             # shared with start, char
my $Number;                      # shared with start, end, char
my $inHeader   = 0;              # shared with start, end, char
my $separator  = ",";
my $outputfile = "numbers.txt";

my $plrepository = ".";
my @files = <$plrepository/*.xml>;

foreach my $xmlfile (@files) {
    # something is omitted
    my $p2 = XML::Parser->new(Handlers => {
        Start => \&handle_start,
        End   => \&handle_end,
        Char  => \&handle_char,
    });
    $p2->parsefile($xmlfile);
}

# Start tag: remember the element name; on a Header, pick up its
# Number attribute and mark that we are inside a Header.
sub handle_start {
    my ($pkg, $element, %attr) = @_;
    $current_element = $element;
    if ( $element =~ /^Header$/i ) {
        $Number   = $attr{Number};
        $inHeader = 1;
    }
}

# End tag: on the end of a Header, write out the collected number.
sub handle_end {
    my ($pkg, $element) = @_;
    if ( $element =~ /^Header$/i ) {
        # Are we overwriting the same file for every Header?
        open (OUT, ">", $outputfile) or die "Cannot open $outputfile: $!";
        print OUT $Number, "$separator\n";
        print "\tNumber " . $Number . "\n";
        close (OUT);
        $inHeader = 0;
    }
}

# Character data: only buffer non-blank text of a Number element
# that sits inside a Header.
sub handle_char {
    my ($pkg, $text) = @_;
    if ( $inHeader && $current_element =~ /^Number$/i && $text !~ /^\s*$/ ) {
        $Number .= $text;    # buffer text
    }
}