in reply to join mutiple files on first column

So, now that the OP has been updated with some code (and PLEASE UPDATE IT AGAIN, to fix the formatting -- <p>...</p> around descriptions and questions, <c>...</c> around code and data)...

I get the impression that you aren't really interested in learning to program in Perl -- so go ahead and just use R, no problem there.

If you actually do want to learn Perl, you might try the algorithm I proposed in my earlier reply, instead of copying code that someone else wrote and that you don't understand.

It's not hard to write a working Perl script based on a decent pseudo-code description -- you just have to settle a few details, like "where does the list of file names come from?"

Here's an example that assumes the list of file names comes from command line args (which end up in @ARGV) -- that is, you would invoke the script like this:

name_of_script *.txt > all-txt-files.joined
That assumes that your 30 text files are all in the current working directory, and that you are able to create a new file in that directory. The following example adds a few extra steps that weren't covered in my earlier post:

#!/usr/bin/perl
use strict;
use warnings;

# get the list of file names -- actually, just make sure @ARGV has them:
# declare a hash for output

die "Usage: $0 *.txt > text.joined\n" unless ( @ARGV and -f $ARGV[0] );
my %output;

# extra step: declare an array to preserve original order of keys in first file
my @output_order;

# open the first file
# while reading each line from the file
#   get the first column of the line for use as a hash key
#   assign the line as the value of hash element using that key

open( IN, '<', $ARGV[0] ) or die "Can't read $ARGV[0]: $!\n";
while (<IN>) {
    my ( $key ) = ( /^(\S+)/ );
    # extra steps: add key to order array, turn EOL whitespace into tab character
    push @output_order, $key;
    s/\s+$/\t/;
    $output{$key} = $_;
}
shift @ARGV;  # (extra step: removes the file name that we just handled)

# for each remaining file
#   open the file
#   while reading each line from the file
#     get the first column of the line for use as a hash key
#     append the line to the current value of the hash element using that key

for my $file ( @ARGV ) {
    open( IN, '<', $file ) or die "Can't read $file: $!\n";
    while (<IN>) {
        my ( $key ) = ( /^(\S+)/ );
        # extra step: turn EOL whitespace into a tab character
        s/\s+$/\t/;
        $output{$key} .= $_;
    }
}

# for each key in the hash -- using the original ordering
#   print the value of the hash element using that key

for my $key ( @output_order ) {
    # extra step: convert the final tab character into a line-feed
    # (the substitution must be bound to the hash value, not to $_)
    $output{$key} =~ s/\t$/\n/;
    print $output{$key};
}

Now, if there's anything there you don't understand, you'll need to do some reading: check some tutorials, search through the Perl documentation, and so on. That way, you're more likely to be able to write a script on your own the next time a task like this comes up. (And you're more likely to be able to fix this one if things don't go the way you expect, e.g. if some files have different keys than other files.)
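
To give a rough idea of what handling that last case might look like, here is a minimal sketch, not a drop-in replacement for the script above. It assumes each file's lines are already in memory as an array, and the sub name join_on_first_column and the 'NA' placeholder are made up for illustration. Instead of appending strings, it stores one cell per file for each key, so a key missing from some file can be filled with the placeholder:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: join lines from several sources on their first column.
# Each source is an array ref of lines. A key that is absent from a source
# gets $placeholder in that source's position.
sub join_on_first_column {
    my ( $placeholder, @sources ) = @_;
    my ( %cells, @order );
    for my $i ( 0 .. $#sources ) {
        for my $line ( @{ $sources[$i] } ) {
            $line =~ s/\s+\z//;                       # strip EOL whitespace
            my ( $key ) = $line =~ /^(\S+)/ or next;  # skip blank lines
            push @order, $key unless exists $cells{$key};  # first-seen order
            $cells{$key}[$i] = $line;  # a repeated key within one source: last line wins
        }
    }
    # one tab-separated row per key, with the placeholder filling any gaps
    return map {
        my $key = $_;
        join "\t", map { $cells{$key}[$_] // $placeholder } 0 .. $#sources;
    } @order;
}

# made-up sample data standing in for two files with different keys
my @file_a = ( "g1 10\n", "g2 20\n" );
my @file_b = ( "g2 200\n", "g3 300\n" );
print "$_\n" for join_on_first_column( 'NA', \@file_a, \@file_b );
```

Keys that first show up in a later file are added to the end of the ordering here, rather than being silently dropped; whether that's what you want depends on your data.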

Re^2: join mutiple files on first column
by david_lyon (Sexton) on Apr 08, 2011 at 00:36 UTC
    Thanks so much for your very detailed help; I'm very grateful. I'll be making use of it from now on.
    
    Thanks again graff!