Some strange application created a bunch of tab-delimited files that had the column headings at the start of *every line*, instead of just having them as the first line/row in the file. Large, nasty files too (80-150 MB each). Perl to the rescue!
#!/usr/bin/perl
use strict;

my $file = shift or die "Need a filename!\n";
my $columns = shift or die "How many columns?\n";
my $newfile = "$file.new";

open(F, "$file") or die "Could not open $file: $!\n";
open(G, ">$newfile") or die "Could not create $newfile: $!\n";

## Set the first row as column headings:
print G join("\t", (split(/\t/,<F>,$columns+1))[0..$columns-1]), "\n";

## Rewind for the first line's data
seek(F,0,0);

## Now grab only the data from the rest:
print G (split(/\t/,$_,$columns+1))[$columns] while <F>;

close(F);
close(G);
It took Perl about 40 seconds for a 100 MB file. I really love this language! :) It's so good at solving quick little problems like this (that might not have been so little in other languages...)
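
For anyone wondering what the third argument to split is doing in the script, here is a minimal sketch on a single made-up line. The sample data, the value of $columns, and the names @parts, $header and $data are only assumptions for illustration, not taken from the real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line: three headings followed by three data fields.
my $columns = 3;
my $line    = "id\tname\tcity\t42\tAlice\tParis\n";

# A LIMIT of $columns + 1 makes split stop after $columns tabs, so the
# last element holds the rest of the line, embedded tabs and newline included.
my @parts = split(/\t/, $line, $columns + 1);

my $header = join("\t", @parts[0 .. $columns - 1]);   # the headings only
my $data   = $parts[$columns];                        # everything after them

print "$header\n";
print $data;   # the newline is still attached, so none is added here
```

Because the data stays in one untouched chunk, nothing has to be split apart and joined back together for each row, and the trailing newline comes along for free.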
In reply to Nobody splits like Perl by turnstep