Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'd like to match column 3 in file1 to column 1 in file2 and create file3 containing column 1 from file1, column 2 from file1, and column 2 from file2.

file 1 sample
SNDK 80004C101 AT XLNX 983919101 BB NETL 64118B100 BS AMD 007903107 CC KLAC 482480100 DC TER 880770102 KATS ATHR 04743P108 KATS RBCN 78112T107 JT TXN 882508104 KATS STM 861012102 KATS
file 2 sample
AT AU AU AU AV AT BB BE BS BR BSE HU BZ BR CC CL CD CZ CG CN

file1 column3 will need to match one value in file2 column1. When a match is found, I need to overwrite file1 column3 with the corresponding value from file2 column2.

so for the above samples, the output for line1 and line2 of file3 will be:
SNDK 80004C101 AU XLNX 983919101 BE
Any solutions using Perl? Thanks for your help!

Replies are listed 'Best First'.
Re: Matching column in different files to create a third one
by kennethk (Abbot) on Jan 19, 2011 at 16:06 UTC
    With well-formatted input, this sort of task is fairly simple. However, as toolic points out, your lack of formatting makes this unnecessarily difficult. If I assume your files are:

    file1.txt:

    SNDK 80004C101 AT
    XLNX 983919101 BB
    NETL 64118B100 BS
    AMD 007903107 CC
    KLAC 482480100 DC
    TER 880770102 KATS
    ATHR 04743P108 KATS
    RBCN 78112T107 JT
    TXN 882508104 KATS
    STM 861012102 KATS

    file2.txt:

    AT AU
    AU AU
    AV AT
    BB BE
    BS BR
    BSE HU
    BZ BR
    CC CL
    CD CZ
    CG CN

    and I assume the mappings of 1st to 2nd column in file 2 are many-to-one (so I can use a hash), then you'll likely want something like this:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my %mapping;
    open my $key_handle, '<', 'file2.txt' or die "Open fail: $!";
    while (<$key_handle>) {
        my ($key, $value) = split;
        $mapping{$key} = $value;
    }

    open my $data_handle, '<', 'file1.txt' or die "Open fail: $!";
    open my $out_handle, '>', 'file3.txt' or die "Open fail: $!";
    while (<$data_handle>) {
        my @columns = split;
        print $out_handle "$columns[0]\t$columns[1]\t$mapping{$columns[2]}\n";
    }

    However, in my guess at your file 1, the string 'KATS' appears multiple times in what I take to be column 3, yet it never appears in your file 2. Given the distribution of 'KATS' in your specified file 1, this mapping is impossible unless file 1 contains only one line.
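
    One way to see which codes would cause trouble is to list every column-3 value that has no entry in the mapping before writing any output. The following is only a rough sketch: the sub name unmapped_codes and the inline sample data are made up for illustration, and it assumes whitespace-separated columns laid out as guessed above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the distinct column-3
# values in @$data_lines that have no key in %$mapping, in first-seen order.
sub unmapped_codes {
    my ($mapping, $data_lines) = @_;
    my (%seen, @missing);
    for my $line (@$data_lines) {
        my @columns = split ' ', $line;
        next if @columns < 3;                # skip blank or short lines
        my $code = $columns[2];
        next if exists $mapping->{$code};    # this code can be translated
        push @missing, $code unless $seen{$code}++;
    }
    return @missing;
}

# Inline sample data, assuming the guessed three-column layout.
my %mapping = (AT => 'AU', BB => 'BE', BS => 'BR', CC => 'CL');
my @data = (
    'SNDK 80004C101 AT',
    'TER 880770102 KATS',
    'ATHR 04743P108 KATS',
    'RBCN 78112T107 JT',
);

print "No mapping for: $_\n" for unmapped_codes(\%mapping, \@data);
```

    With this sample data the sketch reports KATS and JT, which are the values that would come out undefined in the script above.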

Re: Matching column in different files to create a third one
by jethro (Monsignor) on Jan 19, 2011 at 16:15 UTC

    In addition to what toolic said, you should realize this isn't a code-writing-for-hire website. We want to help you write Perl, not do all the work for you. UPDATE: As usual the "we" is really "some of us" ;-)

    To get you started, you might check out perldata for information about hashes (good for matching data from one file in another). Just read in the first file, put the column into the hash and check the hash while reading the second file.

Re: Matching column in different files to create a third one
by toolic (Bishop) on Jan 19, 2011 at 15:51 UTC
    You didn't hit the Preview button before you posted. If you had, you would have seen that your post is unformatted, and there is no way to tell your column 1 data from your column 2 data, etc.

    Read Writeup Formatting Tips, then reply to your node, placing your data inside "code" tags.

    Update: planetscape pointed out that I misspoke.

      Read Writeup Formatting Tips, then repost, placing your data inside "code" tags.

      Unfortunately, there is no way for Anonymous Monk to edit his/her own posts. Reposting will just create a duplicate node, and one will need to be reaped. At this point, the "best" solution is to consider the node for <code> tags, as kennethk has done.

      HTH,

      planetscape
Re: Matching column in different files to create a third one
by JavaFan (Canon) on Jan 19, 2011 at 16:29 UTC
    perl -lape'BEGIN{%$=map/\S+/g,`cat file2`}$F[2]=${$}{$F[2]}||$F[2];$_="@F"' file1
    HTH. HAND.
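
    For readers who find the one-liner hard to follow, here is a rough ungolfed sketch of what it appears to do: build a lookup hash from key/value pairs, replace the third field of each data line when a mapping exists, and otherwise leave the field alone. The sub name translate_line and the inline arrays standing in for file2 and file1 are made up for this illustration. One difference worth noting: the sketch tests with exists, whereas the one-liner uses ||, which would also fall back to the original field if the mapped value were false (for example 0).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace field 3 of a line using the lookup hash,
# keeping the original value when the hash has no entry for it.
sub translate_line {
    my ($mapping, $line) = @_;
    my @fields = split ' ', $line;
    if (@fields >= 3 && exists $mapping->{ $fields[2] }) {
        $fields[2] = $mapping->{ $fields[2] };
    }
    return join ' ', @fields;    # single-space join, like "@F"
}

# Inline stand-ins for file2 (key/value pairs) and file1 (three columns).
my @map_lines  = ('AT AU', 'BB BE');
my @data_lines = ('SNDK 80004C101 AT', 'TER 880770102 KATS');

# Build the lookup hash, skipping lines without both a key and a value.
my %mapping;
for my $line (@map_lines) {
    my ($key, $value) = split ' ', $line;
    $mapping{$key} = $value if defined $value;
}

print translate_line(\%mapping, $_), "\n" for @data_lines;
```

    Here the first line comes out with AU in column 3, while the KATS line passes through unchanged because nothing maps it.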