After a long, long time fussing, fiddling, tweaking, more than a few total rewrites, and lots of learning (in large part here), I've finally released my Data::Dumper replacement. Data::Dump::Streamer v1.0 hit CPAN last night, and v1.01, with a bunch of doc fixes and enhancements, was uploaded a few minutes ago. (Please refrain from installing v1.0, as its documentation is incorrect in a few ways that will probably just annoy you :-)
There has always been a lot of talk in the monastery about reinventing wheels: when to, when not to, why not to, and so on. Since Data::Dumper is a pretty serious wheel, I feel I should offer the following justification for reinventing it.
- Streamer dumps structures out in a breadth-first fashion. This means that nested data is as shallow as possible, which usually makes self-referential data structures much easier to understand. Dumper dumps out in a depth-first fashion, which produces as deep a structure as possible. While this is fine for computers, for a human reader it's very difficult to follow.
- Streamer is designed to emit the structure in pieces to a stream, such as a filehandle or any other object that provides a print() method. Capturing to lists or scalars with this design simply requires a special printing object, which is provided anyway. This means that for large dumps to a file there is no internal memory overhead associated with storing the stringified form. Dumper produces the entire dump in memory, and as such has inherent limitations on how large an object it can dump. On my Win32 box, dumping a binary tree of 4096 nodes with Dumper() exhausts all RAM, maxing out virtual memory at 2GB. Beyond that, the nature of the Dumper code means that far more memory is required overall when dumping. Streamer actually needs to keep track of less of the data structure, although what it does keep track of is in more detail, so the difference doesn't become noticeable until you are dealing with very large structures.
- Streamer handles pathological and insane data structures correctly (some details). It's that simple. There are a handful of unusual edge cases that virtually no dumper or reversible serializer handles correctly, apart from Storable and now Data::Dump::Streamer. These include bizarre self-referential structures, aliasing, readonly-ness, blessed regexes and the like.
- Streamer allows far more control over the serialization of objects and classes, and more ways to control the keys and ordering of hashes, than Dumper.
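To make the streaming point above concrete, here is a minimal toy sketch in plain Perl. It is not Streamer's algorithm or API: the `emit` function and the `ListSink` class are hypothetical names invented for this illustration. It only shows the general idea that a dumper can hand each fragment to anything with a print() method, so the complete text never has to be built in memory unless the caller chooses a sink that collects it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Handle;    # lets plain filehandles accept ->print on older perls

# Hypothetical sink class: collects every printed fragment in an array.
package ListSink;
sub new   { my ($class) = @_; return bless { parts => [] }, $class }
sub print { my ($self, @frags) = @_; push @{ $self->{parts} }, @frags; return 1 }
sub parts { my ($self) = @_; return @{ $self->{parts} } }

package main;

# Toy dumper, for illustration only. It walks nested arrays and hashes and
# prints each fragment as soon as it is known. It does not detect cycles,
# does not escape quotes, and stringifies any other kind of reference.
sub emit {
    my ($out, $data, $depth) = @_;
    my $pad = '  ' x $depth;
    if (ref $data eq 'ARRAY') {
        $out->print("[\n");
        for my $item (@$data) {
            $out->print("$pad  ");
            emit($out, $item, $depth + 1);
            $out->print(",\n");
        }
        $out->print("$pad]");
    }
    elsif (ref $data eq 'HASH') {
        $out->print("{\n");
        for my $key (sort keys %$data) {
            $out->print("$pad  '$key' => ");
            emit($out, $data->{$key}, $depth + 1);
            $out->print(",\n");
        }
        $out->print("$pad}");
    }
    else {
        $out->print(defined $data ? "'$data'" : 'undef');
    }
    return;
}

my $tree = { name => 'root', kids => [ 1, 2 ] };

# Stream straight to a filehandle: nothing is accumulated by the dumper.
emit(\*STDOUT, $tree, 0);
print "\n";

# Capture to a scalar instead, simply by swapping in a collecting sink.
my $sink = ListSink->new;
emit($sink, $tree, 0);
my $text = join '', $sink->parts;
```

The only thing `emit` asks of its first argument is a print() method, which is why the same walker serves both the filehandle and the capturing case.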
Some people might ask why I didn't focus on making Data::Dumper better and resolving these problems within it. My answer is a mixture of issues with the API of Dumper, the basic mechanism of how it dumps, and the fact that any changes need to be made in both C and Perl. Also, I don't consider Data::Dumper in any way obsoleted by my module. For small to medium-sized simple structures it's hundreds of times faster and perfectly accurate. But if you are dealing with hairy structures, or if you need to read the dump of a complex structure, then I think you should try Data::Dump::Streamer out.
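For anyone who hasn't run into the readability issue, here is a small demonstration using only the core Data::Dumper module. The structure and variable names are made up for the example. A hash that is referenced from two places gets written out in full wherever the depth-first walk reaches it first, and every later occurrence becomes a path expression pointing back into the nesting.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# Made-up example data: the same hash is reachable at the top level
# (key 'z') and three levels down (a -> b -> c).
my $leaf = { name => 'leaf' };
my $data = { a => { b => { c => $leaf } }, z => $leaf };

# Sort the keys so that 'a' is always visited before 'z'.
$Data::Dumper::Sortkeys = 1;

# Dumper builds and returns the whole dump as one string.
my $text = Dumper($data);
print $text;
```

With sorted keys, 'a' is walked first, so the leaf is spelled out at the deepest position and 'z' is emitted as a reference back to `$VAR1->{'a'}{'b'}{'c'}`. On three levels that is easy enough to follow; on a large, heavily cross-linked structure it is much harder.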
Now I have to take the earlier not-so-great version off CPAN. :-)
Many thanks to broquaint for the help on the module over its lifetime, and to all of the others who helped test or suggested stuff. Cheers to the Monastery for being a great place to learn. :-)
---
demerphq
First they ignore you, then they laugh at you, then they fight you, then you win.
-- Gandhi