regan has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to get all object oriented with my latest 'little' Perl script. To do this, I am parsing some XML, looking for different object types. When I find the object name, I instantiate the object, and in the constructor I look for the various tags in each object and store them in the $self hash. So far, so good. Here is a sample object:
<ObjectType>
  <AppObject>hello</AppObject>
  <AppObjectField>gender</AppObjectField>
  <valueTargetPair value="MALE" targetPo="Incoming 1" />
  <valueTargetPair value="FEMALE" targetPo="Incoming 2" />
</ObjectType>
I do things like: ...
}elsif ($_ =~ /<AppObjectField>(.*)<\/AppObjectField>/) {
    $self->{appObjFld} = $1;
}
... no problem. My problem is the more complicated <valueTargetPair> tag. I parse the values for the value and target labels into $1 and $2. I want to put these two things into a hash with two entries, having value and targetPo as the keys. Now, because I have several valueTargetPair tags, I need to put this hash into an array! I do all of that like this:
if ($inline =~ /<valueTargetPair (.*)\/>/) {
    if ($inline =~ /value=\"(.*?)\" targetPo=\"(.*?)\"/) {
        my %theHash = {value=>$1, targetPo=>$2};
        push (@arry, \%theHash);
    }
}
then, to keep it in my $self object, I do something like:
$self->{valuePair} = \@arry;
To summarize, I parse some XML (not really important that it is XML), and put some of the results into a hash. I store the hashes in an array, which I then need to store in another hash! This leaves me with a hash holding an array holding hashes. I think that the above is all okish. My problem is that later on, I run a method to print out the contents. Pulling the simple values (like $self->{appObjFld}) out of the $self hash is easy. Getting that valuePair stuff out is murder. How can I pull out the original %theHash, so that I can print each valueTargetPair?


...regan

Replies are listed 'Best First'.
Re: references, hashes, and arrays.. oh MY!!
by dragonchild (Archbishop) on Aug 05, 2003 at 20:39 UTC
    This may be a stupid question, but if you're parsing XML, why aren't you using XML::Parser? That provides you with a rather nice OO interface ...

    ------
    We are the carpenters and bricklayers of the Information Age.

    The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

Re: HoAoH
by tadman (Prior) on Aug 05, 2003 at 20:43 UTC
    Careful with your dot-star usage. This might run out of control and grab everything between the first open tag and the last close tag, regardless of all the opens and closes in the middle. Using .*? is a bit safer, but still, you should be using one of the many, many XML parsers, such as XML::DOM, XML::SAX and the like.
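
    To make the dot-star warning concrete, here is a minimal sketch of greedy versus non-greedy matching on a line that holds two elements. The helper names capture_greedy and capture_lazy are made up for this illustration and do not appear anywhere in the original script.

```perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only.

# Greedy: .* runs on to the LAST closing tag that still allows a match.
sub capture_greedy {
    my ($text) = @_;
    my ($content) = $text =~ m{<AppObject>(.*)</AppObject>};
    return $content;
}

# Non-greedy: .*? stops at the FIRST closing tag.
sub capture_lazy {
    my ($text) = @_;
    my ($content) = $text =~ m{<AppObject>(.*?)</AppObject>};
    return $content;
}

# Two sibling elements on one line.
my $line = '<AppObject>hello</AppObject> <AppObject>world</AppObject>';

print "greedy: ", capture_greedy($line), "\n";
print "lazy:   ", capture_lazy($line), "\n";
```

    The greedy version swallows the first closing tag and the second opening tag along with the text; the non-greedy one returns only hello.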

    As for your hash problem, just reference the data in your array of hashes directly:
    foreach my $entry (@{$self->{valuePair}}) {
        print "value=", $entry->{value}, $/;
        print "targetPo=", $entry->{targetPo}, $/;
    }
    You can even index without iterating, such as $self->{valuePair}[0]{value} and so forth.
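
    For that loop to work, each array element has to be a plain hash reference. Note that the line my %theHash = {value=>$1, targetPo=>$2}; in the question assigns a hash reference (curly braces) to a hash, which is not the same as assigning a list of pairs (parentheses). A minimal sketch of one way to build and read the structure follows; parse_pairs is a hypothetical helper, and the \s+ in the pattern is my own loosening of the original regex.

```perl
use strict;
use warnings;

# parse_pairs is a hypothetical helper, not code from the thread.
# It returns a reference to an array of hash references.
sub parse_pairs {
    my (@lines) = @_;
    my @pairs;
    for my $line (@lines) {
        if ($line =~ /<valueTargetPair\s+value="(.*?)"\s+targetPo="(.*?)"/) {
            # Curly braces here build an anonymous hash and yield a reference to it.
            push @pairs, { value => $1, targetPo => $2 };
        }
    }
    return \@pairs;
}

my $self = {};
$self->{valuePair} = parse_pairs(
    '<valueTargetPair value="MALE" targetPo="Incoming 1" />',
    '<valueTargetPair value="FEMALE" targetPo="Incoming 2" />',
);

# Dereference the array, then read each hash reference with the arrow.
for my $entry (@{ $self->{valuePair} }) {
    print "value=$entry->{value} targetPo=$entry->{targetPo}\n";
}

# Or index straight in; the arrow is optional between adjacent subscripts.
print $self->{valuePair}[1]{targetPo}, "\n";
```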
      Thanks for the advice about using parsers. I'll look into them once I've got this problem fixed. The xml is output from another program I wrote a while ago, so I understand what should be there. As for my current problem, when I run code, the Perl interpreter points to the first print line you had, and says: Not a HASH reference ... I went here with the debugger, and saw this
      x $self
      0  POSwitch=HASH(0x1bc4c8)
         'Id' => 'tp_incoming.vsd.10.Switch'
         'Object' => 'Mailbox.gender'
         'poName' => 'POSwitch'
         'valuePair' => ARRAY(0x1a7b94)
            0  SCALAR(0xc79d8)
               -> HASH(0xc7984)
                  'targetPo' => 'ERROR000'
                  'value' => 'xdefault'
            1  SCALAR(0xc7a5c)
               -> HASH(0xc79e4)
                  'targetPo' => 'tp_incoming.vsd.10.Menu'
                  'value' => 'Mailbox.GENDER_MALE'
            2  SCALAR(0xc7a8c)
               -> HASH(0xc7a68)
                  'targetPo' => 'tp_incoming.vsd.10.Menu.9'
                  'value' => 'Mailbox.GENDER_FEMALE'
      and
      DB<4> x $entry
      0  SCALAR(0xc79d8)
         -> HASH(0xc7984)
            'targetPo' => 'ERROR000'
            'value' => 'xdefault'
      It's almost, but not quite there! ...regan
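
      The SCALAR(...) -> HASH(...) lines in that dump suggest that each array element is a reference to a scalar which in turn holds a hash reference, so there is one layer too many and $entry->{value} fails with "Not a HASH reference". The sketch below reproduces that shape and shows the extra dereference. It is an assumption about how the array was filled (the code that built it is not shown in the thread), and make_double_ref and entry_value are hypothetical names.

```perl
use strict;
use warnings;

# Assumption: the elements were created by taking a reference to a scalar
# that already held a hash reference.
sub make_double_ref {
    my ($value, $target) = @_;
    my $hashref = { value => $value, targetPo => $target };
    return \$hashref;    # a reference to a reference
}

# Hypothetical helper: peel off the extra layer if it is there.
sub entry_value {
    my ($entry) = @_;
    # Current perls report a reference to a reference as 'REF'.
    $entry = $$entry if ref($entry) eq 'REF';
    return $entry->{value};
}

my $entry = make_double_ref('xdefault', 'ERROR000');

# $entry->{value} would die here; one more dereference gets to the hash.
print "value=",    ${$entry}->{value},    "\n";
print "targetPo=", ${$entry}->{targetPo}, "\n";
print "via helper: ", entry_value($entry), "\n";
```

      Dereferencing twice works around the symptom; pushing the hash reference itself onto the array when the data is parsed removes the extra layer for good.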
        Trying to parse XML yourself isn't only non-productive, it's boring and tedious. Creating objects on the fly, however, is fun and exciting. Why bother with the boring and tedious when you can jump to the fun stuff?

        For example, let's run your sample XML through XML::Simple. I'll use the built-in DATA filehandle, so if you run this be sure you include it:

        use XML::Simple;
        use Data::Dumper;

        my $xml = XMLin(\*DATA);
        print Dumper $xml;

        __DATA__
        <ObjectType>
          <AppObject>hello</AppObject>
          <AppObjectField>gender</AppObjectField>
          <valueTargetPair value="MALE" targetPo="Incoming 1" />
          <valueTargetPair value="FEMALE" targetPo="Incoming 2" />
        </ObjectType>
        When run on a computer that has XML::Simple installed, you should see something like:
        $VAR1 = {
                  'AppObject' => 'hello',
                  'valueTargetPair' => [
                                         {
                                           'value' => 'MALE',
                                           'targetPo' => 'Incoming 1'
                                         },
                                         {
                                           'value' => 'FEMALE',
                                           'targetPo' => 'Incoming 2'
                                         }
                                       ],
                  'AppObjectField' => 'gender'
                };
        Now ... let's turn it into an object:
        my $xml = XMLin(\*DATA);
        bless $xml, $xml->{AppObject};
        warn unless ref $xml eq 'hello';
        Done. ;) But don't be fooled. $xml is still a reference to an anonymous hash; it just also happens to "BE A" 'hello' object. Now, let's run tadman's code on this (with a couple of modifications), but this time I won't bother blessing it, since there is no reason to for this simple example:
        my $xml = XMLin(\*DATA);
        foreach my $entry (@{$xml->{valueTargetPair}}) {
            print "value=", $entry->{value}, $/;
            print "targetPo=", $entry->{targetPo}, $/;
        }
        This prints:
        value=MALE
        targetPo=Incoming 1
        value=FEMALE
        targetPo=Incoming 2
        Hope this convinces you to stick with parsers for parsing XML. The work has not only already been done, it's been tried and tested. ;)
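
        On the earlier point that a blessed structure is still an ordinary hash reference underneath, here is a small sketch that uses only the core Scalar::Util module. The make_object helper and its hard-coded data are stand-ins for the structure XMLin returned above, not part of XML::Simple.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);

# Hypothetical stand-in for the parsed structure shown above.
sub make_object {
    my $data = {
        AppObject       => 'hello',
        valueTargetPair => [ { value => 'MALE', targetPo => 'Incoming 1' } ],
    };
    # bless tags the hash with a package name and returns the same reference.
    return bless $data, $data->{AppObject};
}

my $obj = make_object();

print "class:   ", blessed($obj), "\n";    # package the hash was blessed into
print "reftype: ", reftype($obj), "\n";    # underlying container type
print "first:   ", $obj->{valueTargetPair}[0]{value}, "\n";
```

        ref($obj) reports the class name once the hash is blessed, while reftype keeps reporting the container type, so all the usual hash and array dereferencing still applies.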

        jeffa

        L-LL-L--L-LL-L--L-LL-L--
        -R--R-RR-R--R-RR-R--R-RR
        B--B--B--B--B--B--B--B--
        H---H---H---H---H---H---
        (the triplet paradiddle with high-hat)
        
Re: hashes, arrays, and references... oh my
by graff (Chancellor) on Aug 06, 2003 at 01:46 UTC
    So you're reading xml data that is created by some other process that you wrote, and you know what to expect. XML parsers are good, and I commend their use, but in your case, I wouldn't fault the use of regexes to parse it.

    To get back to your original question, which is a data structure problem, a regex solution sort of like the following ought to work (and of course, filling the data structure via an official XML parser module should work in a similar way). I put this into a simple-minded harness to test it out -- you'll need to complicate it a bit, I'm sure.

    #!/usr/bin/perl
    use strict;
    use Data::Dumper;

    my %hash;
    my $self = \%hash;

    while (<DATA>) {
        if ( m{<(\w+)>([^<]+)</\1>} ) {        # a content element
            $$self{$1} = $2;                   # store the content keyed by tag name
        }
        elsif ( m{<(\w+) ([^>]+)/>} ) {        # an "empty" element
            my ($elem,$attr) = ($1,$2);        # get tag name and attributes
            my %attrhash;
            while ( $attr =~ s/(\w+)="([^"]+)"// ) {
                $attrhash{$1} = $2;
            }
            push @{$$self{$elem}}, \%attrhash; # expect multiple instances
        }
    }
    print Dumper( $self );

    # here's one way to access the structure's contents:
    for my $tag ( sort keys %$self ) {
        if ( ref( $$self{$tag} ) eq 'ARRAY' ) {
            for ( @{$$self{$tag}} ) {
                # $_ is now an array element containing a hash ref
                for my $attr ( sort keys %{$_} ) {
                    print "$tag : $attr = $$_{$attr}\n";
                }
            }
        }
        else {
            # the tag had a plain string as content
            print "$tag = $$self{$tag}\n";
        }
    }

    __DATA__
    <ObjectType>
      <AppObject>hello</AppObject>
      <AppObjectField>gender</AppObjectField>
      <valueTargetPair value="MALE" targetPo="Incoming 1" />
      <valueTargetPair value="FEMALE" targetPo="Incoming 2" />
    </ObjectType>

    You'll want to study up on the "perlref" (and "perlreftut") man pages, if you haven't done that yet.
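
    As a quick companion to those man pages, the sketch below lines up the equivalent spellings of a lookup through a reference, including the $$self{...} form used in the code above. The sample() helper and its data are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical data shaped like the structure discussed in this thread.
sub sample {
    return {
        appObjFld => 'gender',
        valuePair => [
            { value => 'MALE',   targetPo => 'Incoming 1' },
            { value => 'FEMALE', targetPo => 'Incoming 2' },
        ],
    };
}

my $self = sample();

# Three spellings of the same hash lookup through a reference.
my @same = ( $$self{appObjFld}, ${$self}{appObjFld}, $self->{appObjFld} );

# Whole-container dereferences take the sigil of the container.
my @pairs = @{ $self->{valuePair} };       # the array behind the reference
my %first = %{ $self->{valuePair}[0] };    # a copy of the first inner hash

print join(', ', @same), "\n";
print scalar(@pairs), " pairs; first targetPo is $first{targetPo}\n";
```

    The arrow form tends to be the easiest to read once the nesting goes more than one level deep.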

    update: jeffa's solution is of course the correct one -- having studied that (as I should have done sooner), I conclude that my own proposed code is unnecessarily complicated and obscure. Do be sure to notice that jeffa solved the problem you were having with references.

Re: hashes, arrays, and references... oh my
by eric256 (Parson) on Aug 05, 2003 at 21:23 UTC

    Why push the hashes onto a list? You could just push the values and then refer to the list as a hash (right?) or you could just build a hashref and point $self->{valuePair} to it. If you want to have multiple values per key then just make it a hashref with lists as the values.

    my $theHash;
    if ($inline =~ /<valueTargetPair (.*)\/>/) {
        if ($inline =~ /value=\"(.*?)\" targetPo=\"(.*?)\"/) {
            push @{$theHash->{$1}},$2;
        }
    }
    $self->{valuePair} = $theHash;
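
    Reading such a hash of lists back out might look like the sketch below. group_pairs is a hypothetical helper that wraps the same push-onto-a-hash-entry idea, and the third input line is invented to show a value with more than one target.

```perl
use strict;
use warnings;

# Hypothetical helper: collect targetPo values under each value key.
sub group_pairs {
    my (@lines) = @_;
    my %by_value;
    for my $line (@lines) {
        if ($line =~ /value="(.*?)"\s+targetPo="(.*?)"/) {
            push @{ $by_value{$1} }, $2;    # autovivifies the inner array
        }
    }
    return \%by_value;
}

my $self = {};
$self->{valuePair} = group_pairs(
    '<valueTargetPair value="MALE" targetPo="Incoming 1" />',
    '<valueTargetPair value="FEMALE" targetPo="Incoming 2" />',
    '<valueTargetPair value="MALE" targetPo="Incoming 3" />',
);

# Outer level is a hash reference, inner level is an array reference.
for my $value (sort keys %{ $self->{valuePair} }) {
    print "$value => ", join(', ', @{ $self->{valuePair}{$value} }), "\n";
}
```

    The trade-off is that the original order of the tags across different values is lost, so this layout fits best when lookups by value matter more than document order.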
    ___________
    Eric Hodges