Lady_Aleena has asked for the wisdom of the Perl Monks concerning the following question:
Hello everyone. Before you start reading, have a Cookie and take a deep breath. Ready?
I am trying to make my data more universal by removing as many HTML tags from it as I can and munging it later. For the most part I have succeeded. However, I still have a few hurdles to clear, and the following is one of them.
I would like to take this following data ...
    # *  is an unordered list
    # #  is an ordered list
    # #3 a number indicates the value

    __DATA__
    * list 1 unordered item 1
    * list 1 unordered item 2
    *# list 1 unordered item 2 ordered item 1
    *# list 1 unordered item 2 ordered item 2
    *# list 1 unordered item 2 ordered item 3
    * list 1 unordered item 3
    ** list 1 unordered item unordered item 1
    ** list 1 unordered item unordered item 2
    ** list 1 unordered item unordered item 3
    **# list 1 unordered item unordered item 3 ordered item 1
    **# list 1 unordered item unordered item 3 ordered item 2
    **# list 1 unordered item unordered item 3 ordered item 3
    # list 2 ordered item 1
    #3 list 2 ordered item 2
    # list 2 ordered item 3
    #* list 2 ordered item 3 unordered item 1
    #* list 2 ordered item 3 unordered item 2
    #* list 2 ordered item 3 unordered item 3
... and feed it into this subroutine, which I got help for here, hopefully between lines 6 and 17 ...
    sub story {
      my ($source, $doc_magic, $line_magic) = @_;

      my $inc = 0;
      my @sections;
      my @toc;
      while (my $line = <$source>) {
        chomp($line);
        next if !$line;

        if ($line =~ /^2/) {
          my ($number,$text) = split(/ /,$line,2);
          push @toc, anchor(textify($text), { href => '#'.idify($text) });
          $inc++;
        }
        push @{$sections[$inc]}, $line;
      }

      my $tab = 3;
      $inc = 0;
      for my $section (@sections) {
        if ($section) {
          section($tab, sub {
            $tab++;
            for my $line (@{$section}) {
              my $line = convert_string($line, $line_magic);
              line($tab, $line), next if $line =~ /^</;
              line($tab, "<$line>"), next if $line =~ /^[bh]r$/;
              $doc_magic->{$1}->(), next if $line =~ /^&\s+(.*)/;
              blockquote($tab, $1), next if $line =~ /^bq\s(.*)/;
              item($tab + 1, $1), next if $line =~ /^\*\s(.*)/;
              item($tab + 1, $2, { value => $1 }), next if $line =~ /^\*(\d+)\s(.*)/;
              item($tab + 1, "<strong>$1</strong> $2"), next if $line =~ /^\*s\s(.+\:)\s(.*)$/;
              heading($tab, $1, $2, { id => idify($2) }), next if $line =~ /^([1-6])\s+(.*)/;
              paragraph($tab, $line, { class => 'author' }), next if $line =~ /^by /;
              paragraph($tab, $line);
            }
            $tab--;
          });
        }
        if ($inc == 0 && @toc > 3) {
          section($tab, sub {
            my $class = @toc > 25 ? @toc > 50 ? 'four' : 'three' : 'two';
            my $style = @toc > 50 ? 'font-size:smaller' : undef;
            list($tab + 1, 'u', \@toc, { class => $class, style => $style });
          }, { class => 'contents'} );
        }
        $inc++;
      }
      # paragraph($tab,"written by $root_user", { class => 'author' });
    }
... which will hopefully feed the following data structures through lines 22 through 42 above. Take another deep breath, by the way ...
    my $list1 = [
      'u',
      [
        'list 1 unordered item 1',
        [
          'list 1 unordered item 2',
          { 'inlist' => [
              'o',
              [
                'list 1 unordered item 2 ordered item 1',
                'list 1 unordered item 2 ordered item 2',
                'list 1 unordered item 2 ordered item 3'
              ]
            ]
          }
        ],
        [
          'list 1 unordered item 3',
          { 'inlist' => [
              'u',
              [
                'list 1 unordered item unordered item 1',
                'list 1 unordered item unordered item 2',
                [
                  'list 1 unordered item unordered item 3',
                  { 'inlist' => [
                      'o',
                      [
                        'list 1 unordered item unordered item 3 ordered item 1',
                        'list 1 unordered item unordered item 3 ordered item 2',
                        'list 1 unordered item unordered item 3 ordered item 3'
                      ]
                    ]
                  },
                ]
              ]
            ]
          },
        ]
      ]
    ];

    my $list2 = [
      'o',
      [
        'list 2 ordered item 1',
        ['list 2 ordered item 2', { value => '3' } ],
        [
          'list 2 ordered item 3',
          { inlist => [
              'u',
              [
                'list 2 ordered item 3 unordered item 1',
                'list 2 ordered item 3 unordered item 2',
                'list 2 ordered item 3 unordered item 3',
              ]
            ]
          }
        ]
      ]
    ];
... into ...
    sub list {
      my ($tab,$type,$list,$opt) = @_;
      my $tag = $type.'l';
      my $open = open_tag($tag,$opt,[@ics,@java]);
      line($tab,"<$open>");
      for my $item (@$list) {
        if (ref($item) eq 'ARRAY') {
          item($tab + 1,$item->[0],$item->[1]);
        }
        else {
          item($tab + 1,$item);
        }
      }
      line($tab,"</$tag>");
    }
which is dependent on...
    sub item {
      my ($tab,$value,$opt) = @_;
      my $tag = 'li';
      my $open = open_tag($tag, $opt, ['value', @ics, @java]);
      line($tab, "<$open>");
      line($tab + 1, $value);
      if ($opt->{inlist}) {
        list($tab + 1, @{$opt->{inlist}});
      }
      line($tab,"</$tag>");
    }
... with instructions on how to use them somewhere in here.
I am tired, cranky, and moody. I can't figure out how to munge the lines. A nudge, a whisper, a gentle turning of the head is all I can ask for here. Please just don't ask me to rewrite my list or item subroutines. I use them elsewhere too.
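One possible nudge, offered as a minimal sketch and not a drop-in fix: treat each run of `*` and `#` characters as a path into the nesting, and build the structure recursively. The names `parse_list_line`, `build_list`, and `build_lists` are hypothetical and not part of the existing code. The sketch assumes every list line looks like marker characters, an optional number, whitespace, then the text.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: split one line into marker string, optional value, text.
# Returns an empty list (in list context) for lines that are not list lines.
sub parse_list_line {
    my ($line) = @_;
    return unless $line =~ /^([*#]+)(\d+)?\s+(.*)$/;
    return { marks => $1, value => $2, text => $3 };
}

# Hypothetical helper: consume records from the front of @$recs and return
# one [ type, [ items ] ] structure for the list at nesting level $depth.
sub build_list {
    my ($recs, $depth) = @_;
    my $marks = substr($recs->[0]{marks}, 0, $depth);
    my $type  = substr($marks, -1) eq '*' ? 'u' : 'o';
    my @items;
    while (@$recs) {
        my $rec = $recs->[0];
        # A different marker prefix means this list has ended.
        last if substr($rec->{marks}, 0, $depth) ne $marks;
        if (length($rec->{marks}) == $depth) {
            # Same level: a new item in this list.
            shift @$recs;
            my %opt;
            $opt{value} = $rec->{value} if defined $rec->{value};
            push @items, [ $rec->{text}, \%opt ];
        }
        else {
            # Deeper level: a nested list belonging to the previous item.
            push @items, [ '', {} ] if !@items;
            $items[-1][1]{inlist} = build_list($recs, $depth + 1);
        }
    }
    # Items without options collapse to plain strings.
    my @out;
    for my $item (@items) {
        push @out, scalar(keys %{ $item->[1] }) ? $item : $item->[0];
    }
    return [ $type, \@out ];
}

# Hypothetical entry point: turn a run of list lines into one or more lists.
sub build_lists {
    my @recs = map { parse_list_line($_) } @_;
    my @lists;
    push @lists, build_list(\@recs, 1) while @recs;
    return @lists;
}

my @lists = build_lists(
    '* list 1 unordered item 1',
    '* list 1 unordered item 2',
    '*# list 1 unordered item 2 ordered item 1',
    '# list 2 ordered item 1',
    '#3 list 2 ordered item 2',
);
print Dumper(\@lists);
```

Each returned structure has the same shape as `$list1` and `$list2`, so it could presumably be handed over as `list($tab, @$_)` for each element, leaving `list` and `item` untouched. Two caveats: `story` would need to gather consecutive list lines into one run before calling this, and an item followed by two differently typed nested lists keeps only the last one in `inlist`.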
After lists are handled, I still have the table and some inline HTML tags to remove by some sort of munging. Those are both a lot more complicated.
Thanks in advance!