TStanley has asked for the wisdom of the Perl Monks concerning the following question:

O.K., I FINALLY got all of the data I wanted into an array of arrays of arrays(etc). Now, I need to parse the thing. Since the first couple of arrays inside are merely references to the next, I decided to use a recursive sub to put everything into the hash. Here is the code:
#!/opt/perl5/bin/perl -w
use strict;
use Parse::RecDescent;
use Data::Dumper;
use vars qw($grammar);

BEGIN{
    $::RD_AUTOACTION = q{ [@item[1..$#item]] };
}

$grammar = q(
    file: section(s)
    section: header pair(s?)
    header: /\[(\w+)\]/ { $1 }
    pair: /(\w+)\s?=\s?(\w+)(\s+[\#\;][\w\s]+)?\n/
    {
        if(!defined $3){
            [$1,$2];
        }else{
            [$1,$2,$3];
        }
    }
);

my $parser= Parse::RecDescent->new($grammar);
my $text;
{
    $/=undef;
    $text=<DATA>;
}
my $tree=$parser->file($text);
my ($key,$x,%Config);
my $config=deparse($tree);
print Dumper($config);

sub deparse{
    my $aoa=shift;
    for $x (0..$#{$aoa}){
        if(ref($aoa->[$x])){
            $key=deparse($aoa->[$x]);
        }else{
            if(scalar(@$aoa) == 2){
                chomp $aoa->[0];
                $y=$aoa->[0];
                $Config{$y}=undef;
                return $key;
            }elsif(scalar(@$aoa) == 3){
                if(defined $aoa->[2]){
                    $aoa->[2]=~s/^\s+[\;\#]//;
                    chomp $aoa->[2];
                    $Config{$key}{$aoa->[0]}=[$aoa->[1],$aoa->[2]];
                    return;
                }else{
                    $Config{$key}{$aoa->[0]}=$aoa->[1];
                    return;
                }
            }
        }
    }
    my $return=\%Config;
    return $return;
}
__DATA__
[Section1]
key1=value1
key2=value2 #Comment2
key3=value3
[Section2]
key4=value4
key5=value5
key6=value6 ;Comment 6
It prints out the section headers, but each one of those is undef. Where exactly am I going wrong in the sub?

TStanley
--------

Replies are listed 'Best First'.
Re: How do I set up the recursion?
by sauoq (Abbot) on May 29, 2003 at 20:07 UTC

    Well, you've got several problems with that. I almost don't know where to start... what stood out to me immediately was that you are making your deparse function return different types of things at different times. Here it is supposed to return a key, there it is supposed to return a reference to your file-scoped %Config variable, and over here and here it is supposed to just return; (which is the same as returning undef or, in list context, an empty list.)
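To make the point about a bare return concrete, here is a minimal sketch. The sub name bare_return is hypothetical and exists only for illustration; it stands in for the two branches of the original deparse that end in a plain return.

```perl
use strict;
use warnings;

# Hypothetical helper: a bare return, like two branches of the original sub.
sub bare_return { return; }

my $scalar = bare_return();   # scalar context: undef
my @list   = bare_return();   # list context: empty list

print defined $scalar ? "defined\n" : "undef\n";
print scalar(@list), " element(s)\n";

# The trap: the caller assigns whatever comes back, so a bare return
# silently overwrites a previously good value with undef.
my $key = 'Section1';
$key = bare_return();
print defined $key ? "key kept\n" : "key lost\n";
```

In the original code, $key = deparse(...) is exactly this kind of assignment, so any branch that returns nothing wipes out the section name the caller was relying on.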

    The core issue is that this probably isn't the right place to try to use recursion. Recursion is great where the data structure itself is recursive. Yours isn't, because different levels within the structure contain data that is used for different purposes. Your elements aren't homogeneous.
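For contrast, this is roughly what a structure that does suit recursion looks like. It is only a sketch: the flatten sub is a hypothetical example, not something from the thread. Every level is the same kind of thing (an array ref or a leaf), and every call returns the same kind of thing (a flat list).

```perl
use strict;
use warnings;

# Hypothetical helper: flatten arbitrarily nested array refs.
# Each call handles one node and always returns a flat list.
sub flatten {
    my ($node) = @_;
    return $node unless ref $node eq 'ARRAY';   # leaf: return it as-is
    return map { flatten($_) } @$node;          # branch: recurse into children
}

my @flat = flatten([ 1, [ 2, [ 3, 4 ] ], [ [ 5 ] ] ]);
print "@flat\n";
```

The config tree is different: the outer levels are containers, the next holds a header plus a list, and the innermost holds key, value and comment, so one sub cannot treat all levels alike without special cases.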

    Just use an iterative approach to parsing it. It is much more straightforward. Here's my quick try at interpreting your intentions:

    sub deparse {
        my $tree = shift;
        my $deparsed = {};
        for my $aref (@$tree) {
            for my $section (@$aref) {
                my $hash = $deparsed->{$section->[0]} = { }; # $hash is a shortcut
                for my $aref (@{$section->[1]}) {
                    $hash->{$aref->[0]} = [ $aref->[1] ];
                    if (my $comment = $aref->[2]) {
                        $comment =~ s/^\s+[;#]//;
                        push @{$hash->{$aref->[0]}}, $comment;
                    }
                }
            }
        }
        return $deparsed;
    }
    Since some of your array refs seem to be useless containers, you might be able to get away with fewer levels of nested fors. If it were my project, I'd almost certainly try to make it a bit shallower...
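To try the iterative version without installing Parse::RecDescent, one can feed it a hand-built tree. The shape of $tree below is an assumption, inferred from the grammar and the autoaction in the question rather than dumped from a real parser run. The sub is the one from this reply, with the inner loop variable renamed to $pair for readability.

```perl
use strict;
use warnings;
use Data::Dumper;

# Assumed shape of the parser output (hand-built, not generated):
# [ [ [ header, [ [key, value, optional comment], ... ] ], ... ] ]
my $tree = [
    [
        [ 'Section1', [ [ 'key1', 'value1' ],
                        [ 'key2', 'value2', ' #Comment2' ] ] ],
        [ 'Section2', [ [ 'key6', 'value6', ' ;Comment 6' ] ] ],
    ],
];

sub deparse {
    my $tree = shift;
    my $deparsed = {};
    for my $aref (@$tree) {                  # useless outer container
        for my $section (@$aref) {           # [ header, [ pairs ] ]
            my $hash = $deparsed->{ $section->[0] } = {};
            for my $pair (@{ $section->[1] }) {
                $hash->{ $pair->[0] } = [ $pair->[1] ];
                if (my $comment = $pair->[2]) {
                    $comment =~ s/^\s+[;#]//;    # strip the comment marker
                    push @{ $hash->{ $pair->[0] } }, $comment;
                }
            }
        }
    }
    return $deparsed;
}

my $config = deparse($tree);
$Data::Dumper::Sortkeys = 1;
print Dumper($config);
```

If the real tree turns out to have a different nesting depth, only the number of for loops needs to change; the body that fills the hash stays the same.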

    -sauoq
    "My two cents aren't worth a dime.";
    
Re: How do I set up the recursion?
by pzbagel (Chaplain) on May 29, 2003 at 20:09 UTC

    Your logic is faulty in the section which checks for 2 array elements. The problem is that you have two sections, so that level of the tree happens to have 2 array elements (try print Dumper($tree)). So what happens is that your code enters the section name into the hash and sets it as the key with a value of undef:

    if(scalar(@$aoa) == 2){    #<---------
        chomp $aoa->[0];
        $y=$aoa->[0];          # <-----grabs "Section1"
        $Config{$y}=undef;     # <-------Assigns undef to it.
        return $key;

    Hence your output. How to fix it? You will need to change your logic around. On top of that, your recursion should really pass around and return references. I've rewritten your code somewhat to traverse the tree correctly (I made an assumption about what resulting data structure you wanted):

      (I made an assumption on what the resulting datastructure you wanted was)

      He pretty clearly wanted to keep the comments... I think he was going for something like this:

      { 'Section1' => { key1 => [ 'value1', 'Optional comment'] } }

      -sauoq
      "My two cents aren't worth a dime.";
      

      Well, that is easy enough, just change one line in the code I posted:

      $C{$y}=$aoa->[1];
      # to
      $C{$y}=[@$aoa[1..$#$aoa]];

      __OUTPUT__
      $VAR1 = {
          'Section1' => {
              'key2' => [
                  'value2',
                  ' #Comment2'
              ],
              'key1' => [
                  'value1'
              ],
              'key3' => [
                  'value3'
              ]
          },
          'Section2' => {
              'key5' => [
                  'value5'
              ],
              'key6' => [
                  'value6',
                  ' ;Comment 6'
              ],
              'key4' => [
                  'value4'
              ]
          }
      };

      Thanks

      My original idea was to have the value as a scalar, only if it didn't have a comment with it. But upon reflection, it would be much easier on me to put it into an array by itself, for easier access.
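A small sketch of why the always-an-array layout is easier to consume. The %config data here is hand-written sample data in that shape, not parser output. Because every value is an array ref, the reader never has to ask whether it holds a scalar or a reference.

```perl
use strict;
use warnings;

# Hand-written sample in the "always an array ref" shape.
my %config = (
    Section1 => {
        key1 => [ 'value1' ],
        key2 => [ 'value2', 'Comment2' ],
    },
);

my @lines;
for my $name (sort keys %{ $config{Section1} }) {
    # Same dereference for every key; $comment is simply undef when absent.
    my ($value, $comment) = @{ $config{Section1}{$name} };
    push @lines, sprintf "%s = %s%s", $name, $value,
        defined $comment ? " ($comment)" : '';
}
print "$_\n" for @lines;
```

With the mixed layout, each access would need a ref() check first to decide whether to dereference.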

      TStanley
      --------
Re: How do I set up the recursion?
by BrowserUk (Patriarch) on May 30, 2003 at 19:53 UTC

    Seems to me that the biggest problem here is that you've written a P::RD grammar, which parses the Config file and produces a data structure to represent it. You then need to hand-roll your own parser to parse the output from P::RD in order to get it into the format that you really want.

    I realise that this isn't your fault. I spent some time reading the P::RD docs trying to work out how to get it to produce the format you want directly and save the second level of parsing. Whilst it may be possible, I gave up trying after many false starts and no apparent way to control the output format of P::RD.

    This probably isn't what you want, but here's my take on the original problem of parsing this type of file. It uses regexes and produces the required output in a single pass. No guarantees that it's complete or bulletproof, but I found this easier (by far) than working out how to intervene in the P::RD process.

    #! perl -slw
    use strict;
    use re 'eval';
    use Data::Dumper;
    use vars qw[%Config]; # A global hash to contain the data

    my $re_pair = qr{ # key=value [[#;]comment text\n]
        (?{ our ($key, $value, $comment)=() })    # local vars
        (\w+) (?{ $key = $^N })                   # Capture & save the keyname
        =
        (\w+) (?{ $value = $^N })                 # Capture & save the value
        (?:
            \s+                                   # obligatory whitespace
            [#;]                                  # oblig. comment card - either or
            ([^\n]+) (?{ $comment = $^N })        # Capture & save the comment
            \n
        )?                                        # zero or one
        \s*
        # If we got here, we have key/name comment?, so save them
        (?{ $Config{$section}{$key} = [ $value, $comment ] })
    }x;

    my $re_section = qr{ # consist of [label] pairs*
        (?{ our ($section)=undef })               # Init the section label
        [[] ([^]]+) []]                           # Capture the label
        # Save it. Create a key to hold the pairs
        (?{ $section = $^N; $Config{$section} = undef })
        # zero or more pairs surrounded by optional white space.
        \s* $re_pair* \s*
    }x;

    my $re_file = qr{ # A file consists of 1 or more sections
        $re_section+
    }x;

    my $data = do{ local $/; <DATA> } . $/;    # slurp the file
    $data =~ m[$re_file];                      # parse it
    print Dumper \%Config;                     # Use it

    __DATA__
    [Section1]
    key1=value1
    key2=value2 #comment
    key3=value3
    [Section2]
    key4=value4
    key5=value5 ;Comment2
    [Section3]

    Output

    $VAR1 = {
        'Section1' => {
            'key2' => [
                'value2',
                'comment'
            ],
            'key1' => [
                'value1',
                undef
            ],
            'key3' => [
                'value3',
                undef
            ]
        },
        'Section3' => undef,
        'Section2' => {
            'key5' => [
                'value5',
                'Comment2'
            ],
            'key4' => [
                'value4',
                undef
            ]
        }
    };

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller


Re: How do I set up the recursion?
by Fletch (Bishop) on May 29, 2003 at 18:58 UTC
    Someone set up us the recursion! We get sub.
    What!

    Well, someone was bound to do it eventually . . . .