I wasn't sure whether this would scale to 3+ GB, but I tested it with a 500 MB file (100 vps and 100_000 lines) and it took under a minute.

#!perl
use strict;
use warnings;

my %head  = ();
my @vp    = ();
my %fh    = ();
my $width = 0;
my $t0 = time();
my $infile = '500M.dat';

# read header
open my $in, '<', $infile or die "Could not open $infile : $!";
chomp( my $line1 = <$in> );
my @head = split "\t", $line1;

# scan across the columns
my $k = 3; # repeat fields: columns 0..$k are copied to every outfile
for my $c ($k+1 .. $#head){
  my ($vp,$attr) = split /\./, $head[$c];
  # open new filehandle for each vp
  if (not exists $fh{$vp}){
    my $outfile = "out_$vp.dat";
    open $fh{$vp}, '>', $outfile or die "Could not open $outfile : $!";
    push @vp, $vp;
    # repeat fields plus this vp's own first column
    # (was @head[0..$k+1], which reused the first vp's column name for every vp)
    @{$head{$vp}} = (@head[0..$k], $head[$c]);
    print "Opened $outfile for $vp\n";
  } else {
    push @{$head{$vp}}, $head[$c];
  }
  # count the columns of the first vp only;
  # every vp is assumed to have the same number of adjacent columns
  ++$width if (@vp < 2);
}
print "Width = $width\n";

# write headers to outfiles
for (keys %fh){
  print { $fh{$_} } join("\t", @{$head{$_}}), "\n";
}

# process file
my $count = 1; # the header line has already been read
while (<$in>){
  chomp;
  my @f = split "\t", $_;
  my $begin = $k + 1;
  for my $vp (@vp){
    my $end = $begin + $width - 1;
    print { $fh{$vp} } join("\t", @f[0..$k, $begin..$end]), "\n";
    # move along to next vp
    $begin += $width;
  }
  ++$count;
}
close $in;

# close out files
for (keys %fh){
  close $fh{$_} or die "Could not close file for $_ : $!";
  print "File closed for $_\n";
}

my $dur = time - $t0;
print "$count lines read from $infile\n";
print scalar(@vp), " files created in $dur seconds\n";
Update: header line corrected to include AVG_Beta.
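
If anyone wants to repeat the timing test, a file of the right shape can be generated with something like the sketch below. The layout is my assumption from what the script expects: 4 fixed columns followed by the same set of "vp.attr" columns for each vp. The fixed column names, the attribute names other than AVG_Beta, the sub names, and the file name sample.dat are all made up for illustration.

```perl
#!perl
use strict;
use warnings;

# Hypothetical fixed columns; the real file's names are not known here.
my @fixed = qw(ID Name Chr Pos);

# Build a header: fixed columns, then "vpN.attr" for every vp and attribute.
sub make_header {
    my ($n_vp, @attr) = @_;
    my @cols = @fixed;
    for my $v (1 .. $n_vp) {
        push @cols, map { "vp$v.$_" } @attr;
    }
    return join "\t", @cols;
}

# Build one data row: 4 fixed values, then one random number per vp column.
sub make_row {
    my ($row, $n_vp, $n_attr) = @_;
    my @f = ("id$row", "name$row", 1 + $row % 22, $row * 100);
    push @f, map { sprintf "%.4f", rand } 1 .. $n_vp * $n_attr;
    return join "\t", @f;
}

# Write a header line plus $n_rows data rows to $file.
sub write_test_file {
    my ($file, $n_vp, $n_rows, @attr) = @_;
    open my $out, '>', $file or die "Could not open $file : $!";
    print {$out} make_header($n_vp, @attr), "\n";
    print {$out} make_row($_, $n_vp, scalar @attr), "\n" for 1 .. $n_rows;
    close $out or die "Could not close $file : $!";
    return;
}

# Small sample: 3 vps, 5 rows, 3 attributes per vp.
write_test_file('sample.dat', 3, 5, qw(AVG_Beta Intensity Pval));
```

Raising the vp and row counts (for example 100 and 100_000) gives a file in the same range as the one I tested with; point $infile in the script above at the generated file.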
poj

In reply to Re^5: Spliting Table by poj
in thread Spliting Table by tschelineli
