Recently I have been routinely committing that most grave crime of reinventing wheels. The wheels are Data::Dumper and Data::Dump.

I can hear the hiss of shock. Dumper!? Why Dumper? It's been a standard module forever. It was written by G. Sarathy. It's tried and true. (Ditto for G. Aas's Dump module.)

Well. The first answer is that I don't like their output. Either of them. They do a depth-first traversal of their data structures, which, while fast and efficient, renders cyclic and self-referential data structures utterly unrecognizable, and usually completely incomprehensible. What I want (and have written) is a dumper that does a breadth-first traversal, so that if an object is referenced at multiple places in a data structure it appears in the dump at the highest level at which it is mentioned, not wherever it is first encountered in a depth-first traversal.
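To make the difference concrete, here is a minimal sketch of a breadth-first walk that records the shallowest depth at which each reference is reachable. It is only an illustration of the traversal order, not the code inside Data::BFDump; the sub name `shallowest_depths` and the sample data are assumptions made up for this example.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# Hypothetical helper: returns { refaddr => depth }, the shallowest depth
# at which each reference inside $root can be reached.
sub shallowest_depths {
    my ($root) = @_;
    my %depth;
    my @queue = ( [ $root, 0 ] );          # FIFO queue gives breadth-first order
    while ( my $item = shift @queue ) {
        my ( $ref, $d ) = @$item;
        next unless ref $ref;              # plain values have no children
        my $addr = refaddr($ref);
        next if exists $depth{$addr};      # already met at this depth or shallower
        $depth{$addr} = $d;
        my $type = reftype($ref);
        my @kids
            = $type eq 'ARRAY'                        ? @$ref
            : $type eq 'HASH'                         ? values %$ref
            : ( $type eq 'SCALAR' || $type eq 'REF' ) ? ($$ref)
            :                                           ();
        push @queue, map { [ $_, $d + 1 ] } @kids;
    }
    return \%depth;
}

# Sample data (made up): the same hash is reachable at depth 1 and depth 3.
my $shared = { name => 'shared' };
my $data   = { deep => [ [$shared] ], top => $shared };
my $depths = shallowest_depths($data);
print "shared first reachable at depth $depths->{ refaddr($shared) }\n";
```

The shared hash is recorded at depth 1, through the `top` key. A depth-first walk that happened to visit `deep` first would meet it at depth 3 instead, which is where a depth-first dumper would print it.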

But it turns out that there are even better reasons. Both are buggy.

Data::Dumper, while a tried and true workhorse, has a number of problems. Under MS Windows it has a memory leak and cannot handle large data structures. But even worse, it has a very subtle bug in how it outputs references to scalars. This can be seen in the following very simple example:

    use Data::Dumper;
    my ($x,$y);
    $y='Foo';
    $x=\$y;
    print Dumper($x);
    __END__
    #Outputs
    $VAR1=\'Foo';

Do you see the error? It's not obvious, but it's serious. The bug is that we can do this: $$x='Bar'; but if we take the output of Dumper and try the same thing, $$VAR1='Bar'; we get a "Modification of a read-only value attempted" error.

So the output of Data::Dumper cannot be relied on to correctly recreate its input.
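The failed round trip can be checked mechanically. The following is a minimal sketch: `write_through_round_trip` is a hypothetical helper name, not part of Data::Dumper, and it sets `$Data::Dumper::Terse` so that the dump is a bare expression that can be handed straight to `eval`.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: dump a scalar reference, eval the dump back into a
# copy, then try to assign through the copy. Returns '' on success, or the
# error message if the assignment dies.
sub write_through_round_trip {
    my ($ref) = @_;
    local $Data::Dumper::Terse = 1;    # emit the expression without "$VAR1 = "
    my $copy = eval Dumper($ref);
    die "round trip failed: $@" if $@;
    return eval { $$copy = 'Bar'; 1 } ? '' : $@;
}

my $y = 'Foo';
my $x = \$y;

$$x = 'Baz';                           # fine: $y is an ordinary variable
print "original is writable, \$y is now $y\n";

my $err = write_through_round_trip($x);
print "round-tripped copy: ", ( $err || "writable\n" );
```

The original reference points at a variable, while the dumped text points at a string literal, and Perl treats literals as read-only. That is why writing through the copy fails even though the two dumps look the same.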

And now for Data::Dump. Data::Dump suffers from the same problem with references to scalars as Dumper (hardly surprising, as Data::Dump was originally by Sarathy). But it has even more serious problems:

    use Data::Dump qw(dump);
    my ($x,$y);
    $x=\$y;
    $y=\$x;
    print dump([$x,$y]);

will cause Data::Dump to go into an infinite loop, which obviously means that we should be cautious, to say the least, when using Data::Dump for anything non-trivial.

Let this be a warning to those of you who use Data::Dumper or Data::Dump for persistence purposes (with Storable or MLDBM, for example).

So in the process of reinventing the wheel I discovered that the wheel isn't as good as I thought it was in the first place. I wonder how many other modules this applies to? CGI perhaps? Maybe we shouldn't be so harsh on those who think that a little reinventing of the wheel isn't a bad thing. You never know, we might end up with better wheels in the long run...

BTW, here's how my dumper (Data::BFDump) and Data::Dumper would handle that last example (it's a test case that I call "Scalar Cross"). I know which one I would rather try to figure out, and it shows the difference between the results of a depth-first traversal and a breadth-first traversal.

    #Dumper Output with Purity on
    $VAR1 = [
      \\do{my $o},
      do{my $o}
    ];
    ${${$VAR1->[0]}} = $VAR1->[0];
    $VAR1->[1] = ${$VAR1->[0]};

    #Standard BFDump output
    do {
      my $RT_ARRAY = [
        \do { my $t },
        \do { my $t }
      ];
      ${$RT_ARRAY->[0]} = $RT_ARRAY->[1];
      ${$RT_ARRAY->[1]} = $RT_ARRAY->[0];
      $RT_ARRAY
    }

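As a quick sanity check, the breadth-first form can be run and tested for the cross itself. This is a small sketch that wraps the BFDump-style expression from above in a sub; the name `scalar_cross` is made up for illustration.

```perl
use strict;
use warnings;

# The BFDump-style expression from the post, wrapped in a (hypothetical) sub.
sub scalar_cross {
    my $RT_ARRAY = [ \do { my $t }, \do { my $t } ];
    ${ $RT_ARRAY->[0] } = $RT_ARRAY->[1];
    ${ $RT_ARRAY->[1] } = $RT_ARRAY->[0];
    return $RT_ARRAY;
}

my $cross = scalar_cross();

# Comparing references with == compares the addresses they point to.
print "element 0 leads to element 1\n" if ${ $cross->[0] } == $cross->[1];
print "element 1 leads to element 0\n" if ${ $cross->[1] } == $cross->[0];
```

Each element, when dereferenced, yields a reference equal to the other element. That is the same shape as the original `$x=\$y; $y=\$x;` pair stored in an array.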
Watch for the initial release of Data::BFDump on your local CPAN mirror over the next few days.

:-)

Yves / DeMerphq
---
Writing a good benchmark isn't as easy as it might look.


In reply to Reinventing the wheel: Dumper Difficulties by demerphq
