Hello everyone. Before you start reading, have a Cookie and take a deep breath. Ready?

I am trying to make my data more universal by removing as many HTML tags from it as I can and munging it later. For the most part I have succeeded. However, there are a few more hurdles I have yet to clear, and the following is one of them.

I would like to take the following data ...

# *  is an unordered list
# #  is an ordered list
# #3 a number indicates the value
__DATA__
* list 1 unordered item 1
* list 1 unordered item 2
*# list 1 unordered item 2 ordered item 1
*# list 1 unordered item 2 ordered item 2
*# list 1 unordered item 2 ordered item 3
* list 1 unordered item 3
** list 1 unordered item unordered item 1
** list 1 unordered item unordered item 2
** list 1 unordered item unordered item 3
**# list 1 unordered item unordered item 3 ordered item 1
**# list 1 unordered item unordered item 3 ordered item 2
**# list 1 unordered item unordered item 3 ordered item 3
# list 2 ordered item 1
#3 list 2 ordered item 2
# list 2 ordered item 3
#* list 2 ordered item 3 unordered item 1
#* list 2 ordered item 3 unordered item 2
#* list 2 ordered item 3 unordered item 3
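As a minimal sketch of how one line of the markup above might be read: the run of `*` and `#` characters gives the nesting depth, the last marker gives the type of the list the item sits in, and trailing digits give the value. The name `parse_list_line` and the hash keys it returns are assumptions made for illustration; nothing like it exists in my code yet.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original code): split one line of
# list markup into depth, list type, optional value, and text.
sub parse_list_line {
    my ($line) = @_;
    return unless $line =~ /^([*#]+)(\d*)\s+(.*)$/;
    my ($markers, $value, $text) = ($1, $2, $3);
    return {
        depth => length($markers),
        # the last marker decides the type of the list holding this item
        type  => substr($markers, -1) eq '*' ? 'u' : 'o',
        value => length($value) ? $value : undef,
        text  => $text,
    };
}

for my $line ('* list 1 unordered item 1', '#3 list 2 ordered item 2') {
    my $item = parse_list_line($line);
    printf "depth=%d type=%s value=%s text=%s\n",
        $item->{depth}, $item->{type}, $item->{value} // '-', $item->{text};
}
```

Lines that do not start with a marker make the helper return nothing, so a caller can fall through to the other line types.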

... and feed it into this subroutine, which I got help for here, hopefully between lines 6 and 17 ...

sub story {
  my ($source, $doc_magic, $line_magic) = @_;

  my $inc = 0;
  my @sections;
  my @toc;
  while (my $line = <$source>) {
    chomp($line);
    next if !$line;
    if ($line =~ /^2/) {
      my ($number,$text) = split(/ /,$line,2);
      push @toc, anchor(textify($text), { href => '#'.idify($text) });
      $inc++;
    }
    push @{$sections[$inc]}, $line;
  }

  my $tab = 3;
  $inc = 0;
  for my $section (@sections) {
    if ($section) {
      section($tab, sub {
        $tab++;
        for my $line (@{$section}) {
          my $line = convert_string($line, $line_magic);
          line($tab, $line), next if $line =~ /^</;
          line($tab, "<$line>"), next if $line =~ /^[bh]r$/;
          $doc_magic->{$1}->(), next if $line =~ /^&\s+(.*)/;
          blockquote($tab, $1), next if $line =~ /^bq\s(.*)/;
          item($tab + 1, $1), next if $line =~ /^\*\s(.*)/;
          item($tab + 1, $2, { value => $1 }), next if $line =~ /^\*(\d+)\s(.*)/;
          item($tab + 1, "<strong>$1</strong> $2"), next if $line =~ /^\*s\s(.+\:)\s(.*)$/;
          heading($tab, $1, $2, { id => idify($2) }), next if $line =~ /^([1-6])\s+(.*)/;
          paragraph($tab, $line, { class => 'author' }), next if $line =~ /^by /;
          paragraph($tab, $line);
        }
        $tab--;
      });
    }
    if ($inc == 0 && @toc > 3) {
      section($tab, sub {
        my $class = @toc > 25 ? @toc > 50 ? 'four' : 'three' : 'two';
        my $style = @toc > 50 ? 'font-size:smaller' : undef;
        list($tab + 1, 'u', \@toc, { class => $class, style => $style });
      }, { class => 'contents'} );
    }
    $inc++;
  }
  # paragraph($tab,"written by $root_user", { class => 'author' });
}
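The loop in story handles one line at a time, while a list spans several lines. One possible approach, sketched here under assumptions, is to fold each run of consecutive list lines into a single chunk before dispatching. `group_list_lines` is a hypothetical name, and the pattern deliberately leaves the existing `*s` form alone because it requires whitespace straight after the markers and digits.

```perl
use strict;
use warnings;

# Hypothetical helper: fold runs of consecutive list lines into one array
# reference, leaving every other line as a plain string.
sub group_list_lines {
    my (@lines) = @_;
    my @chunks;
    for my $line (@lines) {
        if ($line =~ /^[*#]+\d*\s/) {
            # start a new chunk unless the previous entry already is one
            push @chunks, [] unless @chunks && ref($chunks[-1]);
            push @{ $chunks[-1] }, $line;
        }
        else {
            push @chunks, $line;
        }
    }
    return @chunks;
}

my @chunks = group_list_lines('2 Heading', '* a', '*# b', 'a paragraph', '# c');
for my $chunk (@chunks) {
    print ref($chunk) ? 'list of ' . scalar(@$chunk) . " lines\n" : "line: $chunk\n";
}
```

A dispatch loop could then test `ref($chunk)` first and hand list chunks to whatever builds the nested structure, so the existing single-line branches stay as they are.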

... which will hopefully feed the following data structures, built somewhere in lines 22 through 42 above. Take another deep breath, by the way ...

my $list1 = [
  'u',
  [
    'list 1 unordered item 1',
    [
      'list 1 unordered item 2',
      { 'inlist' => [
          'o',
          [
            'list 1 unordered item 2 ordered item 1',
            'list 1 unordered item 2 ordered item 2',
            'list 1 unordered item 2 ordered item 3'
          ]
        ]
      }
    ],
    [
      'list 1 unordered item 3',
      { 'inlist' => [
          'u',
          [
            'list 1 unordered item unordered item 1',
            'list 1 unordered item unordered item 2',
            [
              'list 1 unordered item unordered item 3',
              { 'inlist' => [
                  'o',
                  [
                    'list 1 unordered item unordered item 3 ordered item 1',
                    'list 1 unordered item unordered item 3 ordered item 2',
                    'list 1 unordered item unordered item 3 ordered item 3'
                  ]
                ]
              },
            ]
          ]
        ]
      },
    ]
  ]
];

my $list2 = [
  'o',
  [
    'list 2 ordered item 1',
    [ 'list 2 ordered item 2', { value => '3' } ],
    [
      'list 2 ordered item 3',
      { inlist => [
          'u',
          [
            'list 2 ordered item 3 unordered item 1',
            'list 2 ordered item 3 unordered item 2',
            'list 2 ordered item 3 unordered item 3',
          ]
        ]
      }
    ]
  ]
];
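One way structures shaped like these could be built from the marker lines is with a stack of open lists, one per nesting level. This is only a sketch under assumptions: `build_lists` is a hypothetical name, each line is assumed to nest at most one level deeper than the line before it, and a second nested list under the same item would replace the first because an item holds a single `inlist`.

```perl
use strict;
use warnings;

# Hypothetical sketch: turn '*' / '#' marker lines into nested
# [ type, [ items ] ] structures, one per top-level list.
sub build_lists {
    my (@lines) = @_;
    my @lists;    # finished top-level lists, in order of appearance
    my @stack;    # open lists, outermost first: { markers => '*#', list => [...] }

    for my $line (@lines) {
        next unless $line =~ /^([*#]+)(\d*)\s+(.*)$/;
        my ($markers, $value, $text) = ($1, $2, $3);

        # Close every open list whose markers are not a prefix of this line's.
        pop @stack
            while @stack && index($markers, $stack[-1]{markers}) != 0;

        # Open one list for each marker that has no open list yet.
        while (@stack < length($markers)) {
            my $prefix = substr($markers, 0, @stack + 1);
            my $list   = [ substr($prefix, -1) eq '*' ? 'u' : 'o', [] ];
            if (@stack) {
                my $items = $stack[-1]{list}[1];
                push @$items, '' unless @$items;    # skipped level: empty parent item
                $items->[-1] = [ $items->[-1], {} ] unless ref($items->[-1]);
                $items->[-1][1]{inlist} = $list;    # replaces any earlier inlist
            }
            else {
                push @lists, $list;
            }
            push @stack, { markers => $prefix, list => $list };
        }

        # Add the item itself, with a value option when digits were given.
        push @{ $stack[-1]{list}[1] },
            length($value) ? [ $text, { value => $value } ] : $text;
    }
    return @lists;
}

my @lists = build_lists(
    '* first',
    '* second',
    '*# second, ordered 1',
    '#3 other list, item with value',
);
printf "%d top-level lists\n", scalar @lists;
```

Each returned list is a `[ type, [ items ] ]` pair, so it could be passed on as `list($tab, @$list)` without changing list or item.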

... into ...

sub list {
  my ($tab,$type,$list,$opt) = @_;
  my $tag = $type.'l';
  my $open = open_tag($tag,$opt,[@ics,@java]);
  line($tab,"<$open>");
  for my $item (@$list) {
    if (ref($item) eq 'ARRAY') {
      item($tab + 1,$item->[0],$item->[1]);
    }
    else {
      item($tab + 1,$item);
    }
  }
  line($tab,"</$tag>");
}

which is dependent on...

sub item {
  my ($tab,$value,$opt) = @_;
  my $tag = 'li';
  my $open = open_tag($tag, $opt, ['value', @ics, @java]);
  line($tab, "<$open>");
  line($tab + 1, $value);
  if ($opt->{inlist}) {
    list($tab + 1, @{$opt->{inlist}});
  }
  line($tab,"</$tag>");
}

... with instructions on how to use them somewhere in here.

I am tired, cranky, and moody. I can't figure out how to munge the lines. A nudge, a whisper, a gentle turning of the head is all I can ask for here. Please just don't ask me to rewrite my list or item subroutines. I use them elsewhere too.

After lists are handled, I still have the table and some inline HTML tags to remove by some sort of munging. Those are both a lot more complicated.

Thanks in advance!

No matter how hysterical I get, my problems are not time sensitive. So, relax, have a cookie, and a very nice day!
Lady Aleena

In reply to Need help with filling a complicated data structure by Lady_Aleena
