This is part of my continuing quest to grok references in Perl.

I've written a function that pretty-prints a data structure recursively, akin to the behavior of the debugger's x command and the package GraphViz::Data::Grapher. By "data structure", I mean a reference that may contain references to references of hashes of references to arrays of typeglobs (for example).

Here's the script, which contains a test data structure:

#!/usr/bin/perl # lets you examine data recursively, following references. use warnings; use strict; # control behavior: # never go deeper than MAXDEPTH (saves us from circular shenanigans) use constant MAXDEPTH => 16; # cut off long hash keys use constant MAXHASHKEYLEN => 16; # if deferencing through multiple references ($$$$node), collapse the # references? use constant MULTICOLLAPSE => 1; # if deferencing through multiple references ($$$$node), indent # accordingly? use constant MULTIINDENT => 0; # show the indices on array elts? use constant ARRAYIDX => 1; # control the output format: # for references to references: \$var, \\\%hash, etc. use constant DEFERENCE => '->'; # for truncating long or nasty hash keys use constant TRUNCATE => '|..'; # for printing hashes use constant HASHOPEN => '{'; use constant HASHCLOSE => '}'; use constant HASHKEY => '=>'; use constant HASHSEP => ','; # for printing arrays use constant ARRAYOPEN => '['; use constant ARRAYCLOSE => ']'; use constant ARRAYSEP => ','; # for printing array indices use constant ARIDXOPEN => '['; use constant ARIDXCLOSE => ']'; # for printing strings use constant STROPEN => '"'; use constant STRCLOSE => '"'; # for printing undefined scalars use constant UNDEF => 'undef'; # for extra horizontal spacing (recommended ' ') use constant EXTRASPACE => ' '; # for vertical spacing (recommended "\n") use constant NEWLINE => "\n"; # indentation string (set to null if NEWLINE != "\n") use constant INDENT => ' 'x4; my $depth = 0; sub indent () { print INDENT x $depth; } sub newline () { print NEWLINE; } # is it a number? Put it in its own sub to keep warnings on elsewhere sub numberp ($) { no warnings; my $s = shift; return ($s == 0 && $s ne '0'); } # quote non-numbers, hack around "" sub formatscalar ($) { my $str = shift; if (! defined($str)) { # return UNDEF; } elsif (numberp($str)) { # not a number return STROPEN."$str".STRCLOSE; } else { # a number return $str; } } # truncate degenerate hash keys sub formatkey ($) { my $key = shift; if ( # truncate at newline ($key =~ s/\n.*//m) or # truncate at nonprinting chars ($key =~ s/[^[:print:]].*//m) or # truncate if it's too long ($key =~ s/(.{@{[MAXHASHKEYLEN - length TRUNCATE]}}).+/$1/) or # this can only serve to confuse ($key =~ s/@{[HASHKEY]}.*//) ) { # then we truncated. $key .= TRUNCATE; } elsif ( length $key == 0 ) { $key = UNDEF; } return $key; } # recursive, so prototype first sub printnode ($); sub printnode ($) { my $node = shift; my $ref = ref $node; if ($depth > MAXDEPTH) { warn 'Depth exceeded '.MAXDEPTH.', possible circular reference +'; return; } if ($ref eq 'HASH') { print HASHOPEN; newline; for (sort keys %{ $node }) { ++$depth; indent; print formatkey($_),EXTRASPACE.HASHKEY.EXTRASPACE +; printnode(${ $node }{$_}); --$depth; print HASHSEP; newline; } indent; print HASHCLOSE; } elsif ($ref eq 'ARRAY') { print ARRAYOPEN; newline; my $index = 0; for (@{ $node }) { ++$depth; indent; print "[@{[$index++]}]".EXTRASPACE if ARRAYIDX; printnode($_); --$depth; print ARRAYSEP; newline; } indent; print ARRAYCLOSE; } elsif ( $ref eq '' ) { print formatscalar($node); } elsif ( $ref eq 'CODE' or $ref eq 'LVALUE' ) { print $node; } elsif ( $ref eq 'GLOB' ) { print $$node; } elsif ( $ref eq 'REF' or $ref eq 'SCALAR') { if (MULTICOLLAPSE) { my $refdepth=1; while ($ref eq 'REF') { $node = $$node; $ref = ref $node; $refdepth++; } if ($ref eq 'SCALAR') { $node = $$node; } else { $refdepth--; } $depth += $refdepth if MULTIINDENT; print DEFERENCE.(($refdepth>1)?"^$refdepth":'').EXTRASPACE +; newline if MULTIINDENT; indent if MULTIINDENT; printnode($node); $depth -= $refdepth if MULTIINDENT; } else { print DEFERENCE.EXTRASPACE; $depth++ if MULTIINDENT; printnode($$node); $depth-- if MULTIINDENT; } } else { die "ref returned unknown value $ref" } } sub examinedata($) { my $node = shift; printnode($node); newline; } my $variable; my $variable2 = 140; my $hashref; $hashref = { key => $hashref }; sub code () {} examinedata( { hash => $hashref, test => { test => 1, }, test2 => { test3 => 4, test4 => [(1..10)], test5 => \&code, }, test6 => [ \\\\$variable2, \\\\$variable, \\\$variable2, \\\$variable, \\$variable2, \\$variable, \$variable2, \$variable, $variable2, $variable, \\&code, \&code, \\\\\\\[1..2], \\\[1..2], \\[1..2], \[1..2], [1..2], ], test7 => [ "test", \\*GLOB, \*GLOB, *GLOB, ], test8 => \\@{ [ "test", \\*GLOB, \*GLOB, *GLOB, ] }, "This is a long key." => undef, "This\n is a stupid key." => undef, "\n This one's stupider still!" => undef, "ASCII 020\020 = Data Link Escape. WTF?" => undef, "LEET=>HAX" => undef, $variable => \$variable, 100 => \$variable, } );
It works quite well on (relatively) simple data structures such as {array => [1..10], hash => { key => \\\( "string" )}} and so forth. However, I've got some questions:
  1. What about my test structure is emitting the warning "Use of uninitialized value in anonymous hash ({}) at ./examinedata.pl line 205." ??
  2. The documentation for ref (for which my code is essentially a wrapper) mentions a possible return value of LVALUE. When does this happen?
  3. The docs also mention different behavior "If the referenced object has been blessed into a package". (I don't really have a question about this, other than that I don't understand it. Grokking object-oriented Perl is my next quest.)
  4. How can I create a circular reference? If I do, is there a sane way to represent it?
  5. Is there a sexier way to handle code references? In particular, can I get it to print the name of the subroutine (if there is one)?
  6. How can I descend into globs?
  7. Is there a way I can use the debugger to stop at a certain point and execute my code (if, say I prefer to use it over the debugger's x command)?
Thanks!

/usr/bin/perl '-nemap$.%$_||redo,2..$.++;print$.--'

In reply to Data structure examiner by dbw

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.