in reply to XML file won't parse properly

I have a file that I need to parse using XML::Parser. Only when I get halfway through the file, the parsing stops with an error...I find that there are weird characters embedded in the text.

First, you may want to include a link to your file, so we know what you're talking about.

Second, you may want to define "weird characters", so we know what you're talking about.

Third, you may want to include your code, so (you guessed it) we know what you're talking about....

And finally, you may want to read Mirod's review of XML::Parser; I personally found it very helpful.
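
On the second point, one way to pin down "weird characters" is to list every byte in the file that isn't plain printable ASCII, along with where it sits. The following is only a rough sketch of that idea, not anything from XML::Parser itself: the helper name find_odd_bytes and the script name in the usage comment are made up for illustration, and it assumes the file is small enough to read into memory in one go.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Parser or the original post):
# returns one [line, column, hex] triple for every byte that is neither
# printable ASCII nor a tab or carriage return. Lines are split on "\n".
sub find_odd_bytes {
    my ($text) = @_;
    my @hits;
    my $line_no = 0;
    for my $line (split /\n/, $text, -1) {
        $line_no++;
        while ($line =~ /([^\x09\x0D\x20-\x7E])/g) {
            # pos() is the offset just past the match, i.e. the 1-based column
            push @hits, [ $line_no, pos($line), sprintf('0x%02X', ord $1) ];
        }
    }
    return @hits;
}

# When given a file name: perl find_odd_bytes.pl file.xml
if (@ARGV) {
    my $file = shift @ARGV;
    open(my $fh, '<:raw', $file) or die "Cannot open $file: $!";
    my $text = do { local $/; <$fh> };    # slurp the file as raw bytes
    close $fh;
    printf "line %d, column %d: byte %s\n", @$_ for find_odd_bytes($text);
}
```

Posting a few lines of that output alongside the parser's error message would tell us which bytes are involved and roughly where the parse gives up.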



Wait! This isn't a Parachute, this is a Backpack!

Re: Re: XML file won't parse properly
by brpsss (Sexton) on Apr 12, 2001 at 21:38 UTC

    gregor42, thanks. I guess it seemed obvious to me, but probably not to anyone else.

    First of all, the code is just a modification of the XML::Parser example code.

    #!/usr/bin/perl -w
    use strict;
    use XML::Parser;

    # initialize hash that will hold header info
    my $parser = new XML::Parser(ErrorContext => 4,
                                 Handlers => {Start => \&handle_start,
                                              End   => \&handle_end,
                                              Char  => \&handle_char});
    my $counter = 0;
    my @tagdesc;
    my %tags;

    # parse the file whose name we specified as a command-line parameter
    $parser->parsefile(shift);

    open(OUTPUT, ">tag.desc") or die "No open";
    foreach my $keyval (keys %tags) {
        print OUTPUT $keyval, "\n";
    }
    close OUTPUT;

    sub handle_start {
        my $p = shift;
        my $el = shift;
        my %attribs = @_;
        if ($el eq 'product_data') { $counter++; }
        if ($counter) { push(@tagdesc, $el); }
    }

    sub handle_char {
        my ($p, $data) = @_;
        # print $data,"\n" if $counter;
    }

    sub handle_end {
        my $p = shift;
        my $el = shift;
        my %atrribs = @_;
        my $not_written = 0;
        if ($el eq 'product_data') { $counter--; $not_written = 1; }
        if ($not_written) {
            my $str = join(':', @tagdesc);
            @tagdesc = ();
            if (exists $tags{$str}) {
                my $cnt = $tags{$str};
                $cnt++;
                $tags{$str} = $cnt;
            } else {
                $tags{$str} = 1;
            }
            $str = undef;
            $not_written = 0;
        }
    }

    Unfortunately, the link isn't publicly available, and I hope I did an OK job of defining what the weird characters are in the previous post...

    Finally, thanks for the XML::Parser review. I did learn from it, but unfortunately not enough to solve this problem :o( As another question, where can I find XML::UM or anything to do with Unicode and Perl?
    Thanks for taking the time to help a complete newbie. I do appreciate it.