... do the work in less time than Perl needs for startup/shutdown overhead. Perl is more flexible and powerful, but that power does come at a cost ...

If that really was the point you were trying to make here, then it probably would have been better if you'd benchmarked and shown a solution that's actually faster than Perl. On a longer input file (OP never specified file length, but the fact that the number of columns grew from 3 to 20 is a hint), this pure Perl solution I whipped up is twice as fast as the awk code you showed:

use warnings;
use strict;

my @cols = split /\t/, <>;
chomp($cols[-1]);
my @fh = map { open my $fh, '>', $_ or die $!; $fh } @cols;
while ( my $line = <> ) {
    chomp($line);
    my @row = split /\t/, $line;
    print {$fh[$_]} $row[$_], "\n" for 0..$#row;
}

#!/usr/bin/env perl
use warnings;
use strict;
use FindBin;
use File::Spec::Functions qw/catfile/;
use File::Temp qw/tempfile tempdir/;
use IPC::System::Simple qw/systemx/;

my $COLS = 20;
my $ROWS = 1_000_000;
my $AWKSCRIPT = catfile($FindBin::Bin,'11121118.awk');
my $PERLSCRIPT = catfile($FindBin::Bin,'example.pl');

my $expdir = tempdir(CLEANUP=>1);
my ($tmpinfh, $infn) = tempfile(UNLINK=>1);
{
    warn "Generating data...\n";
    chdir $expdir or die $!;
    my $c = 'a';
    my @cols = map { $c++ } 1..$COLS;
    print $tmpinfh join("\t", @cols), "\n";
    my %fh;
    open $fh{$_}, '>', $_ or die $! for @cols;
    for ( 1..$ROWS ) {
        my @row = map { int rand 1000 } 1..$COLS;
        print $tmpinfh join("\t", @row), "\n";
        print {$fh{$cols[$_]}} $row[$_] ,"\n" for 0..$COLS-1;
    }
    close $fh{$_} for @cols;
    close $tmpinfh;
}
{
    warn "Running awk...\n";
    my $workdir = tempdir(CLEANUP=>1);
    chdir $workdir or die $!;
    systemx('/usr/bin/time', 'awk', '-f', $AWKSCRIPT, $infn);
    systemx('diff','-rq',$expdir,$workdir);
}
{
    warn "Running perl...\n";
    my $workdir = tempdir(CLEANUP=>1);
    chdir $workdir or die $!;
    systemx('/usr/bin/time','perl',$PERLSCRIPT,$infn);
    systemx('diff','-rq',$expdir,$workdir);
}

I firmly believe that every Perl programmer should learn Awk because learning Awk will make you a better Perl programmer.

Sure, in general, the more programming languages a programmer is exposed to, the better they (usually) become. And yet, there are other situations:

Some time ago I suggested to another questioner to either use sed in his shell script ...

And I once showed someone who was writing an installer shell script how to use a one-liner to do a search and replace to change a configuration variable. And what happened? As the installation script grew, the one-liner just got called over and over again for different variables. While you, I, and the OP may know there are better solutions (as you said yourself, "rewrite the entire script in Perl"), these posts are public and may be read by people who don't know better. That is why, in particular in comparison to awk, I disagree with an unqualified "Sometimes Perl is not the best tool for the job."
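
To illustrate the kind of "better solution" I mean for the installer case, here is a minimal sketch that changes several configuration variables in one pass instead of running a one-liner once per variable. The "key = value" file format, the variable names, and the update_config helper are all made up for this example; they are not taken from the actual installer.

```perl
#!/usr/bin/env perl
use warnings;
use strict;

# Hypothetical settings to change; names and values are invented
# for illustration only.
my %new = (
    hostname => 'example.com',
    port     => 8080,
);

# Rewrite every "key = value" line whose key appears in %$new,
# handling all keys in a single substitution over the text.
sub update_config {
    my ($text, $new) = @_;
    # Build one alternation out of all (escaped) keys.
    my $keys = join '|', map { quotemeta } sort keys %$new;
    # [ \t]* rather than \s* so that a match never runs across a newline.
    $text =~ s/^([ \t]*($keys)[ \t]*=[ \t]*).*$/$1$new->{$2}/mg;
    return $text;
}

# Assumed sample configuration, kept in a string so the sketch is
# self-contained; lines with unknown keys are left untouched.
my $conf = "hostname = localhost\nport = 80\nuser = nobody\n";
print update_config($conf, \%new);
```

In a real installer the text would be read from and written back to the configuration file, but the point stands: adding another variable means adding one hash entry, not one more invocation.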

Update - I also wanted to mention: In environments where there are several programmers on a team, most of whom are only focused on one language, having a product consist of code written in several different languages is more likely to cause maintenance problems. These are the reasons I said "throwing yet another new language into the mix" isn't necessarily a good thing. (Also, just in case there's any confusion with non-native speakers, the definition of "unqualified" I was using is "not modified or restricted by reservations", as in an "unqualified statement", and not "not having requisite qualifications", as in an "unqualified person".)


In reply to Re^6: Split tab-separated file into separate files, based on column name (open on demand) (updated) by haukex
in thread Split tab-separated file into separate files, based on column name by Anonymous Monk
