Adding a column to a file where i took out two columns

coolda has asked for the wisdom of the Perl Monks concerning the following question:

Hi guys, I'm a newbie in the programming world. I've been trying to get this code working for the last two weeks, to no avail. I have thousands of tab-delimited files with the same format, for example:

 
file 1 
col0    col1 col2 col3 col4 col5 ... ...
samp1
samp2
samp3
samp4
....

All the files follow a similar format. What I need to do is extract the 1st and 4th columns from the first file and write them to a new file, then take only the 4th column from each of the remaining files.

Before I work on my actual project, I wanted to try with a simpler table. The tables I'm working on right now are:

S1.txt

col1    col2    col3
1    4    7
2    5    8
3    6    9

S2.txt

col1    col2    col3
1    44    77
2    55    88
3    66    99

The result I'm getting

col1    col3    col3
1    4    
2    5    
3    6

The result I want

col1    col3    col3
1    4    77
2    5    88
3    6    99

 
#!/usr/bin/perl -w
use warnings;
use strict;

my @desired_cols = qw(colname1 colname3);
my @desired_cols1= qw(colname3);
my $temp = 'tmp.txt';
# reads first line to get actual column names
open(S1, 'S1.txt') || die "Can't open S1: $!";
open(S2, 'S2.txt') || die "Can't open S2 : $!";
open(OUT, ">$temp") || die "Can't create output : $!"; 
my $header_line = (<S1>);
my $header_line1 = (<S2>);


my @actual_cols = split(/\s+/,$header_line);


# get column number of the actual column names

my $pos  =0;
my %col2_num = map {$_ => $pos++}@actual_cols;

# translate the desired col names into position numbers
                              
my @slice = map{$col2_num{$_}}@desired_cols;               
my @slice1 = map{$col2_num{$_}}@desired_cols1;

print OUT join("\t","@desired_cols"),"\r\n"; #header line
# print colname1 colname3 colname3 in outfile 


while (<S1>, <S2>)
{
  my @row = (split)[@slice];
  my @row1 = (split)[@slice1];
  print OUT join("\t","@row @row1"),"\r\n";       #each data row
  
}

I think the problem with my code is the while loop. I thought it would read S1 and S2 line by line, but I think it only reads S1. Please help me out, I'm under a lot of stress :((


Re: Adding a column to a file where i took out two columns by GrandFather (Saint) on Sep 24, 2014 at 02:23 UTC
Your problem description, example and code don't seem to be consistent, so the following is a guess at what you may be after:

#!/usr/bin/perl
use warnings;
use strict;

my $str1 = <<STR;
col1 col2 col3
1 4 7
2 5 8
3 6 9
STR

my $str2 = <<STR;
col1 col2 col3
1 44 77
2 55 88
3 66 99
STR

for my $spec ([$str1, qw(col1 col3)], [$str2, qw(col3)]) {
    my ($file, @wantedCols) = @$spec;

    print "@wantedCols\n";
    open my $fIn, '<', \$file;
    my $index = 0;
    my %fileCols = map{$_ => $index++} split /\s+/, <$fIn>;
    my @slice = map{exists $fileCols{$_} ? $fileCols{$_} : ()} @wantedCols;

    while (<$fIn>) {
        chomp;
        print join (' ', (split /\s+/)[@slice]), "\n";
    }
}

Prints:

col1 col3
1 7
2 8
3 9
col3
77
88
99

Note the "string as file" trick used to make the sample self contained. That, and printing to stdout (for the same reason), will need to be changed to suit your real application of course. But using a self contained script like this as a test bed can speed up development and testing a lot, because you don't need to juggle multiple files during testing.

Note too the three parameter open and the use of lexical file handles. Both are things you should get into the habit of using.

Perl is the programming world's equivalent of English
Re^2: Adding a column to a file where i took out two columns by coolda (Novice) on Sep 24, 2014 at 13:50 UTC
Thanks for the reply. However, the output I want is:

col1 col3 col3
1 7 77
2 8 88
3 9 99

I want the col3 columns side by side.
Re^3: Adding a column to a file where i took out two columns by GrandFather (Saint) on Sep 24, 2014 at 21:25 UTC
Ok, that changes things somewhat. How about this:

#!/usr/bin/perl
use warnings;
use strict;

my $str1 = <<STR;
col1 col2 col3
1 4 7
2 5 8
3 6 9
STR

my $str2 = <<STR;
col1 col2 col3
1 44 77
2 55 88
3 66 99
STR

open my $fIn, '<', \$str1;
my $index = 0;
my %fileCols = map{+"file1 $_" => $index++} split /\s+/, <$fIn>;
my @file1Data;

push @file1Data, [split /\s+/] while <$fIn>;
close $fIn;

my @wantedCols = ('col1', 'col3', 'file1 col3');

open $fIn, '<', \$str2;
$index = @{$file1Data[0]};
$fileCols{$_} = $index++ for split /\s+/, <$fIn>;

my @slice = map{exists $fileCols{$_} ? $fileCols{$_} : ()} @wantedCols;

print join(' ', map{(split /\s+/)[-1]} @wantedCols), "\n";

while (<$fIn>) {
    chomp;
    print join (' ', (@{$file1Data[$. - 2]}, split /\s+/)[@slice]), "\n";
}

Prints:

col1 col3 col3
1 77 7
2 88 8
3 99 9

Note that effectively all of the first file is read into memory to avoid having to re-read and parse it for each following file. This is fine so long as the first file is less than about half the memory you have available.

Perl is the programming world's equivalent of English
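
As an alternative to holding the first file in memory, the two files can be read in lockstep, one line from each per iteration. The following is a minimal sketch and is not taken from any reply in this thread. It assumes both files are tab-delimited and list the same rows in the same order. The helper names `paste_columns` and `header_slice` are made up for illustration. It uses the same "string as file" trick, so it runs without external files.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper (not from the thread): read two tab-delimited inputs
# in lockstep and return the selected columns of each row, joined by tabs.
# Assumes both inputs have the same rows in the same order.
sub paste_columns {
    my ($in1, $cols1, $in2, $cols2) = @_;

    # Translate the wanted column names into positions, per file.
    my @slice1 = header_slice(scalar <$in1>, $cols1);
    my @slice2 = header_slice(scalar <$in2>, $cols2);

    my @out = (join "\t", @$cols1, @$cols2);    # header line
    while (defined(my $line1 = <$in1>)) {
        my $line2 = <$in2>;
        last unless defined $line2;             # stop at the shorter input
        chomp($line1, $line2);
        push @out, join "\t",
            (split /\t/, $line1, -1)[@slice1],
            (split /\t/, $line2, -1)[@slice2];
    }
    return @out;
}

# Hypothetical helper: map wanted column names to header positions.
sub header_slice {
    my ($header, $wanted) = @_;
    chomp $header;
    my $pos = 0;
    my %index = map { $_ => $pos++ } split /\t/, $header;
    my @slice;
    for my $name (@$wanted) {
        die "No column '$name' in header\n" unless exists $index{$name};
        push @slice, $index{$name};
    }
    return @slice;
}

# Sample data from the question, with tabs as separators.
my $s1 = "col1\tcol2\tcol3\n1\t4\t7\n2\t5\t8\n3\t6\t9\n";
my $s2 = "col1\tcol2\tcol3\n1\t44\t77\n2\t55\t88\n3\t66\t99\n";

open my $fh1, '<', \$s1 or die "Can't open first sample: $!";
open my $fh2, '<', \$s2 or die "Can't open second sample: $!";

print "$_\n" for paste_columns($fh1, [qw(col1 col3)], $fh2, [qw(col3)]);
```

Each file gets its own explicit read inside the loop because the implicit assignment to `$_` only happens when a single `<FH>` is the entire `while` condition. For the real files, the in-memory handles would be replaced with three-parameter opens on the actual file names, and only two lines are held in memory at a time.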
Re: Adding a column to a file where i took out two columns by Tux (Canon) on Sep 25, 2014 at 06:19 UTC
If you are open to "other" viewpoints, I'd say that using DBD::CSV (with `"\t"` as `csv_sep_char`) will make your task very easy using SQL commands.

Enjoy, Have FUN! H.Merijn