wufnik has asked for the wisdom of the Perl Monks concerning the following question:

hola perl dudes & dudettes;

obwarning:

this question might interest those who have a mild interest in XML::Dumper or serialization more... than those who don't.

the problem:

in an attempt to serialize certain pesky struct-alikes as fashionably as possible, i turned to module XML::Dumper, which does the job very slinkily indeed. however, i have been puzzled by the behaviour of 2 core functions in this module: xml_compare, and xml_identity, both of which take chunks of xml and, well, inform about "identity in content" of xml and "identity in instantiation".

here is an example of their use:
    use XML::Dumper;

    my $struct = {
        "ferret" => ["sredni", "vashtar"],
        "hen"    => "anabaptist",
    };
    my $xdumper = new XML::Dumper;

    # generate xml equiv to struct
    my $ssxmlified = $xdumper->pl2xml($struct);

    # write it to file as well
    $xdumper->pl2xml($struct, "oot.txt");

    # reconstitute our struct from the file
    my $structreconstituted = $xdumper->xml2pl("oot.txt");

    # generate xml===reconstituted struct
    $rexmlified = $xdumper->pl2xml($structreconstituted);

    my $sscequal = $xdumper->xml_compare($rexmlified, $ssxmlified);
    my $ssiequal = $xdumper->xml_identity($rexmlified, $ssxmlified);

    printf "xml_compare: %s\n", ($sscequal? "true" : "false");
    printf "xml_identity:%s\n", ($ssiequal? "true" : "false");

although the xml generated is entirely the same in content (*not* memory locations), neither xml_compare nor xml_identity returns true. the latter i understand; the former frustrates me, as i was hoping to use it to check whether 2 data structures were equal in content.
does anyone know of any way, slick or otherwise, to achieve this goal?
thanks for any response.

Re: XML::Dumper: Nifty or Naughty?
by Jaap (Curate) on May 09, 2003 at 10:46 UTC
    This is the source of the xml_compare sub:
    sub xml_compare {
    # ============================================================

    =item * xml_compare( $xml1, $xml2 ) - Compares xml for content

    Compares two dumped Perl data structures (that is, compares the xml) for
    identity in content. Use this function rather than perl's built-in string
    comparison, especially when dealing with perl data that is memory-location
    dependent (which pretty much means all references). This function will
    return true for any two perl data that are either deep clones of each
    other, or identical. This method is exported by default.

    =cut

    # ------------------------------------------------------------
        my $xml1 = shift;
        my $xml2 = shift;

        $xml1 =~ s/(<[^>]*)\smemory_address="\dx[A-Za-z0-9]+"([^<]*>)/$1$2/g;
        $xml2 =~ s/(<[^>]*)\smemory_address="\dx[A-Za-z0-9]+"([^<]*>)/$1$2/g;

        $xml1 =~ s/(<[^>]*)\sdefined=\"false\"([^<]>)/$1$2/g; # For backwards
        $xml2 =~ s/(<[^>]*)\sdefined=\"false\"([^<]>)/$1$2/g; # compatibility

        $xml1 =~ s/<\?xml .*>//; # Ignore XML declaration
        $xml2 =~ s/<\?xml .*>//;

        $xml1 =~ s/<\!DOCTYPE perldata \[.*\]>//s; # Remove DTD
        $xml2 =~ s/<\!DOCTYPE perldata \[.*\]>//s;

        $xml1 =~ s/^\n//gm; # Remove empty newlines
        $xml2 =~ s/^\n//gm;

        return not( $xml1 cmp $xml2 );
    }

    What do you see if you compare $rexmlified and $ssxmlified yourself? They should look like simple textual XML, shouldn't they? You could try running them through the substitutions above and see where they differ.
      Jaap, thanks for your response.
      Here is what my debugger thinks of $ssxmlified. sorry about the name.
      $ssxmlified = "<perldata>\n <hashref memory_address=\"0x1b932f8\">\n <item key=\"ferret\">\n <arrayref memory_address=\"0x1b9f174\">\n <item key=\"0\">sredni</item>\n <item key=\"1\">vashtar</item>\n </arrayref>\n </item>\n <item key=\"hen\">anabaptist</item>\n </hashref>\n</perldata>\n";

      here is $rexmlified
      $rexmlified = "<perldata>\n <hashref memory_address=\"0x21e8434\">\n <item key=\"ferret\">\n <arrayref memory_address=\"0x329ac98\">\n <item key=\"0\">sredni</item>\n <item key=\"1\">vashtar</item>\n </arrayref>\n </item>\n <item key=\"hen\">anabaptist</item>\n </hashref>\n</perldata>\n";

      which look pretty much identical to me, except those memory addresses. i thought xml_compare ignored these. will delve into the sub itself. any further thoughts very welcome.
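
The check suggested above can be tried without XML::Dumper at all. The sketch below applies only the memory_address substitution from the quoted xml_compare source to the two debugger strings. The helper names strip_addresses and sample_xml are hypothetical, made up for this illustration, and the strings are rebuilt by hand from the debugger output rather than produced by pl2xml.

```perl
use strict;
use warnings;

# Hypothetical helper: applies only the memory_address substitution
# taken from the xml_compare source quoted earlier in the thread.
sub strip_addresses {
    my ($xml) = @_;
    $xml =~ s/(<[^>]*)\smemory_address="\dx[A-Za-z0-9]+"([^<]*>)/$1$2/g;
    return $xml;
}

# Hypothetical helper: rebuilds the dumped XML shown in the debugger
# output, varying only the two memory addresses.
sub sample_xml {
    my ( $hash_addr, $array_addr ) = @_;
    return join "\n",
        '<perldata>',
        qq{ <hashref memory_address="$hash_addr">},
        ' <item key="ferret">',
        qq{ <arrayref memory_address="$array_addr">},
        ' <item key="0">sredni</item>',
        ' <item key="1">vashtar</item>',
        ' </arrayref>',
        ' </item>',
        ' <item key="hen">anabaptist</item>',
        ' </hashref>',
        '</perldata>',
        '';
}

my $ssxmlified = sample_xml( '0x1b932f8', '0x1b9f174' );
my $rexmlified = sample_xml( '0x21e8434', '0x329ac98' );

printf "raw strings equal:      %s\n",
    ( $ssxmlified eq $rexmlified ? "true" : "false" );
printf "stripped strings equal: %s\n",
    ( strip_addresses($ssxmlified) eq strip_addresses($rexmlified) ? "true" : "false" );
```

If the stripped strings match, the substitutions themselves are doing their job, and the cause of the false result has to lie in how the arguments reach xml_compare.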
        XML::Dumper is indeed nifty, though to use xml_compare properly, you should not use the OO form

         $xdumper->xml_compare($this,$that);

        but just plain ol'

         xml_compare($this, $that);

        the functional way.

        no qualifiers necessary. some of xml::dumper's functions are available in both functional and oo forms, but sadly not xml_compare, and i presume xml_identity. perhaps the perldoc should be augmented a teensy bit on this? whatever, this doesn't cast any shadow on the module itself, which gets a big thumbs up from me anyway.
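
A likely reason the OO form fails, going by the source quoted earlier: the sub takes its arguments with two plain shifts and never sets aside an invocant, so in a method call the object itself lands in $xml1. The sketch below shows that effect with Demo::Dumper, a hypothetical stand-in package written for this illustration; it is not XML::Dumper and only mirrors the argument handling.

```perl
use strict;
use warnings;

package Demo::Dumper;    # hypothetical stand-in, not XML::Dumper itself

sub new { return bless {}, shift }

# Mirrors the quoted argument handling: two plain shifts, no $self.
sub compare {
    my $xml1 = shift;
    my $xml2 = shift;
    return not( $xml1 cmp $xml2 );
}

package main;

my $obj = Demo::Dumper->new;
my ( $this, $that ) = ( "<perldata/>", "<perldata/>" );

# Functional call: $xml1 and $xml2 are the two strings.
my $functional = Demo::Dumper::compare( $this, $that );

# Method call: $xml1 is the object and $xml2 is $this; $that is never read.
my $method = $obj->compare( $this, $that );

printf "functional: %s\n", ( $functional ? "true" : "false" );
printf "method:     %s\n", ( $method     ? "true" : "false" );
```

In the method call the stringified object is compared against the first XML string, so the result is false no matter how similar the two XML chunks are.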

        hmmm, do i get experience points for answering my own question, or are they deducted?
Re: XML::Dumper: Nifty or Naughty?
by bobn (Chaplain) on May 09, 2003 at 16:49 UTC
    From:
    perldoc -q 'How do I test whether two arrays or hashes are equal?'

        use FreezeThaw qw(cmpStr cmpStrHard);

        %a = %b = ( "this" => "that", "extra" => [ "more", "stuff" ] );
        $a{EXTRA} = \%b;
        $b{EXTRA} = \%a;

        printf "a and b contain %s hashes\n",
            cmpStr(\%a, \%b) == 0 ? "the same" : "different";

        printf "a and b contain %s hashes\n",
            cmpStrHard(\%a, \%b) == 0 ? "the same" : "different";

    The first reports that both those the hashes contain the same data, while the second reports that they do not. Which you prefer is left as an exercise to the reader.

    Not exactly what was asked for: this compares the other end of the transform between XML and Perl structures, that is, the Perl data itself rather than the dumped XML.

    Bob Niederman, http://bob-n.com
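
For comparing at the Perl end without installing anything, one possible approach is to dump both structures to text with the core Data::Dumper module and compare the strings. This is a minimal sketch, not something proposed in the thread: canonical_dump and same_content are hypothetical helper names, and the approach assumes plain nested hashes and arrays. Values that differ only in being stored as a number or as a string may dump differently, so treat a "different" result with some care.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: text form of a structure with hash keys sorted,
# so that key order cannot affect the comparison.
sub canonical_dump {
    my ($data) = @_;
    local $Data::Dumper::Sortkeys = 1;    # stable hash key order
    local $Data::Dumper::Indent   = 1;
    local $Data::Dumper::Terse    = 1;    # no leading "$VAR1 = "
    return Dumper($data);
}

# Hypothetical helper: true when both structures dump to the same text.
sub same_content {
    my ( $left, $right ) = @_;
    return canonical_dump($left) eq canonical_dump($right);
}

my $struct = { ferret => [ "sredni", "vashtar" ], hen => "anabaptist" };
my $copy   = { hen => "anabaptist", ferret => [ "sredni", "vashtar" ] };
my $other  = { ferret => [ "sredni" ], hen => "anabaptist" };

printf "struct vs copy:  %s\n", ( same_content( $struct, $copy )  ? "same" : "different" );
printf "struct vs other: %s\n", ( same_content( $struct, $other ) ? "same" : "different" );
```

The dump contains no memory addresses, so two separately built structures with equal content produce the same text.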