wfsp has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to parse
;p;one
;greybox_start;
;h2;two
;greybox_end;
;p;three
into
$VAR1 = [ ['div',{'id' => 'article'}, ['p','one'], ['div',{'class' => 'greybox'}, ['h2','two'], ], ['p','three'] ] ];
The code produces
$VAR1 = [ ['div',{'id' => 'article'}, ['p','one'], ['div',{'class' => 'greybox'}], ['h2','two'], ['p','three'] ] ];
(whitespace trimmed)

Clearly, my attempt at keeping track of $depth isn't working.

Any ideas on how this might be put right (or take a different approach altogether)?

#!/usr/local/bin/perl
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Indent = 1;

my @lines = <DATA>;
chomp @lines;

my @list = ([q{div}, {id => q{article}}]);
my (@current_list);
my $depth = 0;

for my $line (@lines){
  my ($tag, $txt) = $line =~ /^;([^;]+);(.*)/;
  if ($tag eq q{greybox_start}){
    push @{$list[$depth]}, @current_list;
    push @{$list[$depth]}, [q{div}, {class => q{greybox}}];
    $depth++;
    @current_list = ();
    next;
  }
  elsif ($tag eq q{greybox_end}){
    $depth--;
    push @{$list[$depth]}, @current_list;
    @current_list = ();
    next;
  }
  push @current_list, [$tag, $txt];
}
push @{$list[$depth]}, @current_list;

print Dumper(\@list);

__DATA__
;p;one
;greybox_start;
;h2;two
;greybox_end;
;p;three

Replies are listed 'Best First'.
Re: Generating a list of lists (for HTML::Element) from trivial markup
by jethro (Monsignor) on Jul 27, 2008 at 13:18 UTC
    This problem is predestined for a recursive function, even though you don't seem to expect a depth of more than one.
    #!/usr/bin/perl -w
    use strict;
    use warnings;
    use Data::Dumper;
    $Data::Dumper::Indent = 1;

    my @lines = <DATA>;
    chomp @lines;

    my @list = [q{div}, {id => q{article}}, recorddiv()];

    sub recorddiv {
      my @data;
      while( my $line= shift @lines ) {
        my ($tag, $txt) = $line =~ /^;([^;]+);(.*)/;
        if ($tag eq q{greybox_start}){
          # recurse: everything up to the matching end becomes a child of this div
          push @data, [q{div}, {class => q{greybox}}, recorddiv() ];
          next;
        }
        elsif ($tag eq q{greybox_end}){
          return @data;
        }
        push @data, [$tag, $txt];
      }
      return @data;
    }

    print Dumper(\@list);

    __DATA__
    ;p;one
    ;greybox_start;
    ;h2;two
    ;greybox_end;
    ;p;three
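    To illustrate the remark about depth, here is a minimal sketch of the same recursive idea fed with a greybox nested inside another greybox. The sub name parse_block, the lexical array reference it takes in place of the global @lines, and the doubly nested sample input are assumptions made for this example, not part of the original thread.

```perl
use strict;
use warnings;

# Hypothetical helper: consumes lines from @$lines and returns the child
# elements found up to the matching greybox_end (or the end of the input).
sub parse_block {
    my ($lines) = @_;
    my @children;
    while (@$lines) {
        my $line = shift @$lines;
        my ($tag, $txt) = $line =~ /^;([^;]+);(.*)/
            or next;    # skip lines that do not look like markup
        if ($tag eq 'greybox_start') {
            # Recurse: the nested call collects this div's children.
            push @children, [ 'div', { class => 'greybox' }, parse_block($lines) ];
        }
        elsif ($tag eq 'greybox_end') {
            return @children;
        }
        else {
            push @children, [ $tag, $txt ];
        }
    }
    return @children;
}

# Assumed sample input with two levels of greybox nesting.
my @lines = (
    ';p;one',
    ';greybox_start;',
    ';h2;two',
    ';greybox_start;',
    ';p;inner',
    ';greybox_end;',
    ';greybox_end;',
    ';p;three',
);

my @list = ( [ 'div', { id => 'article' }, parse_block(\@lines) ] );

# The innermost paragraph sits two greybox divs deep.
print $list[0][3][3][2][1], "\n";
```

    Each recursive call owns exactly one level of nesting, so no depth counter is needed; the call stack keeps track of it.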
      Very nicely done!

      Mike
Re: Generating a list of lists (for HTML::Element) from trivial markup
by RMGir (Prior) on Jul 27, 2008 at 13:05 UTC
    I found your approach with a depth variable a bit hard to follow.

    I find it's easier to think in terms of "what list am I adding into currently?", and "what's the stack of lists I was adding to?". That 2nd question handles the possibility that start/end blocks are nested.

    So here's what I came up with (but see jethro's answer, which is much cleaner):

    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;
    $Data::Dumper::Indent = 1;

    my @lines = <DATA>;
    chomp @lines;

    my @list = ([q{div}, {id => q{article}}]);
    my $currentList_r = \@{$list[0]};   # the list we are adding into right now
    my $currentDiv="";
    my @listStack;                      # the lists we were adding to before

    LINE:
    for my $line (@lines){
      chomp $line;
      if($line=~/^;([^;]+)_start;$/) {
        $currentDiv=$1;
        my $divList_r=['div',{class=>$1}];
        push @listStack, $currentList_r;
        $currentList_r=$divList_r;
        next LINE;
      }
      if($line=~/^;([^;]+)_end;$/) {
        die "Tag mismatch between start ($currentDiv) and end ($1)"
          unless $1 eq $currentDiv;
        $currentDiv="";
        die "end with no matching start!" unless scalar @listStack;
        push @{$listStack[-1]}, $currentList_r;
        $currentList_r = pop @listStack;
        next LINE;
      }
      my ($tag, $txt) = $line =~ /^;([^;]+);(.*)/;
      push @{$currentList_r}, [$tag, $txt];
    }

    print Dumper(\@list);

    __DATA__
    ;p;one
    ;greybox_start;
    ;h2;two
    ;greybox_end;
    ;p;three

    Mike
Re: Generating a list of lists (for HTML::Element) from trivial markup
by alexm (Chaplain) on Jul 27, 2008 at 23:25 UTC

    You're always pushing new elements on the same array in different positions, not different depths:

    push @{$list[$depth]}, @current_list;

    For instance, to make h2 a child of greybox you would need:

    push @{$list[0][3]}, @current_list;

    Here depth means that two array references are involved, one index for each level of nesting. That's why the recursive solution provided by jethro works better: recursion and depth are closely related concepts.
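
    The difference between a position in the top-level array and a deeper level of nesting can be seen in a small sketch. It builds the structure by hand rather than parsing the markup, and the variable $greybox is an assumption introduced only for this illustration.

```perl
use strict;
use warnings;

# Top-level list holding a single element: the article div.
my @list = ( [ 'div', { id => 'article' } ] );

# Children of the article div are pushed through $list[0].
push @{ $list[0] }, [ 'p', 'one' ];
push @{ $list[0] }, [ 'div', { class => 'greybox' } ];

# $list[1] would be a second top-level element, a sibling of the article div.
# The greybox div lives inside the article div, at $list[0][3].
my $greybox = $list[0][3];    # hypothetical name for the nested reference

# Pushing through that reference nests h2 inside greybox.
push @{$greybox}, [ 'h2', 'two' ];

print scalar(@list), "\n";        # still one top-level element
print $list[0][3][2][0], "\n";    # tag of the nested element
```

    Holding on to a reference such as $greybox is what a stack of lists or a recursive call does implicitly, which is why neither needs a numeric depth.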