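If the file is huge (as in too big for memory), you're going to have to do two passes. The first pass does most of the work: it joins the continuation lines into one record per entry, keeps track of the widest value seen in each column, and writes each record to a temp file.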

The temp file looks something like:

    BD3101|bananaswiithapples.gif|\breakfast\fruits\tree\|2007-03-06 14:02:31.000000
    TP4223|chocolatecaramelfudge.gif|\sweet\desserts\hersheys\|2006-02-28 21:16:41.000000
    EO2123|tofuwithpeas.gif|\organic\vegetables\legumes\|2007-07-16 13:55:06.000000

The second pass processes this temp file and produces the formatted output:

    BD3101 bananaswiithapples.gif    \breakfast\fruits\tree\      2007-03-06 14:02:31.000000
    TP4223 chocolatecaramelfudge.gif \sweet\desserts\hersheys\    2006-02-28 21:16:41.000000
    EO2123 tofuwithpeas.gif          \organic\vegetables\legumes\ 2007-07-16 13:55:06.000000

The code that follows doesn't use files at all (I'll leave that to you - it's trivial) and produces the output above:

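    my ( @in, @out, @temp_file );
    my @lengths = (0) x 4;    # widest value seen so far in each column

    pass1();
    pass2();

    # Pass 1: join continuation lines into one record each and "write" them
    # to the temp file (an array here, standing in for a real file).
    # Input comes from a __DATA__ section holding the sample data (not shown).
    sub pass1 {
        while ( <DATA> ) {
            my @in = unpack "A9A10A9A*", $_;    # split the fixed-width columns
            if ( $in[0] ) {                     # non-empty first column: new record
                write_to_temp( @out ) if $out[0];
                @out = @in;
                next;
            }
            $out[$_] .= $in[$_] for 0 .. 3;     # continuation line: append per column
        }
        write_to_temp( @out );                  # flush the last record
    }

    # Pass 2: print every record, padding each column to its widest value.
    sub pass2 {
        my $format = join " ", ( map "%-${_}s", @lengths ), "\n";
        for ( @temp_file ) {
            chomp;
            my @f = split /\|/;
            printf $format, @f;
        }
    }

    # Tidy the whitespace in the last column, update the column widths,
    # and store the record as one pipe-delimited line.
    sub write_to_temp {
        s/\s+/ /g, s/^\s+//, s/\s+$// for $_[3];
        length $_[$_] > $lengths[$_] and $lengths[$_] = length $_[$_] for 0 .. 3;
        push @temp_file, join( "|", @_ ) . "\n";
    }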
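For anyone who wants the file-based variant, here is a minimal sketch of how the @temp_file array could be swapped for a real temporary file. It assumes File::Temp (a core module) and that the records have already been joined into four fields each; the sub name format_via_tempfile and the @records sample (copied from the temp file lines shown earlier) are only there for illustration, and this version drops the trailing space that the format string above puts before the newline.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: takes already-joined records (array refs of 4 fields),
# spools them to a temp file, then formats them on a second pass.
sub format_via_tempfile {
    my ( $records, $out_fh ) = @_;
    my @lengths = (0) x 4;

    # Pass 1: write pipe-delimited records, tracking the widest value per column.
    my ( $tmp_fh, $tmp_name ) = tempfile( UNLINK => 1 );
    for my $rec (@$records) {
        for my $i ( 0 .. 3 ) {
            my $len = length $rec->[$i];
            $lengths[$i] = $len if $len > $lengths[$i];
        }
        print {$tmp_fh} join( "|", @$rec ), "\n";
    }

    # Pass 2: rewind the temp file and print each record padded to those widths.
    seek $tmp_fh, 0, 0 or die "Cannot rewind $tmp_name: $!";
    my $format = join( " ", map "%-${_}s", @lengths ) . "\n";
    while ( my $line = <$tmp_fh> ) {
        chomp $line;
        printf {$out_fh} $format, split /\|/, $line, -1;
    }
    return;
}

# Sample records taken from the temp file lines shown in the post.
my @records = map { [ split /\|/ ] } (
    'BD3101|bananaswiithapples.gif|\breakfast\fruits\tree\|2007-03-06 14:02:31.000000',
    'TP4223|chocolatecaramelfudge.gif|\sweet\desserts\hersheys\|2006-02-28 21:16:41.000000',
    'EO2123|tofuwithpeas.gif|\organic\vegetables\legumes\|2007-07-16 13:55:06.000000',
);

format_via_tempfile( \@records, \*STDOUT );
```

Only one record is held in memory at a time in each pass, which is the point of going through a file. The column widths are the only state carried from the first pass to the second.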

PS I've assumed BrowserUk's comment about mistyped sample data to be true.


In reply to Re: MultiLine Tables into Variables by FunkyMonk
in thread MultiLine Tables into Variables by Knoperl
