in reply to I don't understand why I'm getting an "Use of uninitialized value" error
Here's a simple way, which I've tested on a directory containing a couple of XML files that probably have enough in common with the ones you have. Reading the first 250 lines of the XML::Parser manual was sufficient to know how to write this:
(updated to fix typo in target_tags list)

    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Parser;

    my ( $tagname, $tagtext );    # "globals" used in callback subs
    my @target_tags  = qw(ID TimeStamp IP_Address Title Complainant Contact Address Email);
    my $target_regex = join( '|', @target_tags );

    my $parser = XML::Parser->new(
        Handlers => {
            Start => \&get_tagname,
            Char  => \&get_tagtext,
            End   => \&print_tagdata,
        }
    );

    for my $xmlfile ( <*.xml> ) {
        $parser->parsefile( $xmlfile );
    }

    # Start handler: $_[1] is the element name; reset the text buffer
    sub get_tagname { $tagname = $_[1]; $tagtext = ''; }

    # Char handler: may fire several times per element, so append
    sub get_tagtext { $tagtext .= $_[1]; }

    # End handler: print only the tags we care about
    sub print_tagdata {
        # anchored, so "ID" does not also match a tag name such as "UserID"
        if ( $_[1] =~ /^(?:$target_regex)$/ ) {
            print "$_[1] = $tagtext\n";
        }
    }
One notable difference between this and the OP code is that this will print tag labels and their contents in the order in which they occur in the XML files. If that's okay, then there's nothing more to worry about.
(But if you need to control tag order and it varies from one XML file to the next, you just need to add a global hash for storing tag values, then print the hash contents in the desired order after parsing each file. Update: and don't forget to assign "()" to the hash, i.e. empty it, before parsing each file.)
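For what it's worth, here is a rough sketch of that hash-based variant. I haven't run it against the OP's files. The names %tagdata, %is_target and ordered_lines() are just ones I made up for illustration, and taking the file names from @ARGV rather than globbing is my own assumption. As in the script above, it only makes sense for tags that hold plain text rather than nested elements.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# Tags to report, listed in the order they should be printed.
my @target_tags = qw(ID TimeStamp IP_Address Title Complainant Contact Address Email);
my %is_target   = map { $_ => 1 } @target_tags;

my ( $tagtext, %tagdata );    # "globals" shared by the callback subs

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub { $tagtext = '' },        # new element: reset text buffer
        Char  => sub { $tagtext .= $_[1] },    # may fire several times per element
        End   => sub {                         # store the value instead of printing it
            $tagdata{ $_[1] } = $tagtext if $is_target{ $_[1] };
        },
    }
);

# Hypothetical helper: parse one file and return "tag = value" lines
# in @target_tags order, skipping tags the file did not contain.
sub ordered_lines {
    my ($xmlfile) = @_;
    %tagdata = ();                             # empty the hash before each file
    $parser->parsefile($xmlfile);
    return map { "$_ = $tagdata{$_}" }
           grep { exists $tagdata{$_} } @target_tags;
}

# Assumed usage: perl thisscript.pl *.xml
for my $xmlfile (@ARGV) {
    print "$_\n" for ordered_lines($xmlfile);
}
```

If a tag occurs more than once in a file, this keeps only the last value; storing an array ref per tag would be the obvious change if you need all of them.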
XML::Parser is the surprisingly simple foundation on which many "higher-level" parsing modules are built. I'm actually surprised at how many CPAN modules have been created that are layers around XML::Parser, considering how easy and efficient this module is.
For relatively simple tasks like yours, the logic involved in using XML::Parser is pretty trivial, and when you use it, you really save a lot of effort, and end up with code that is simpler, more coherent, more robust, and easier to maintain.