Caveats:

I am not certain of your file formats, so here are my assumptions:
  1. For FILE A, it is unclear whether you have multiple ID strings on one line or one ID per line over multiple lines. I'll make this solution work with either.
  2. For FILE B, it is unclear whether your lines are "key <whitespace> letters" or "key <newline> letters <newline>", i.e. key and value on separate lines. The code handles each layout in its own block; use whichever matches your data.
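
To illustrate the first assumption: splitting each line on whitespace copes with both layouts of FILE A. This is only a minimal sketch; the helper name ids_from_lines is made up for this example and is not part of the solution.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): collect every
# whitespace-separated ID from a list of lines, whichever way FILE A is laid out.
sub ids_from_lines {
    my @lines = @_;
    my %seen;
    for my $line (@lines) {
        # split ' ' skips leading whitespace and splits on runs of whitespace,
        # so the trailing newline never ends up in an ID.
        $seen{$_}++ for split ' ', $line;
    }
    return \%seen;
}

# All IDs on one line ...
my $one_line = ids_from_lines("1DWK 2RFK 4ERH\n");
# ... or one ID per line.
my $per_line = ids_from_lines("1DWK\n", "2RFK\n", "4ERH\n");

print join(' ', sort keys %$one_line), "\n";
print join(' ', sort keys %$per_line), "\n";
```

Both calls should produce the same set of keys, so the rest of the program does not need to know which layout FILE A uses.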

Solution:

Perl hashes are very robust and often a great solution for problems of simple to medium complexity. For this solution I'll read all the entries from the first file, parse out the IDs, and insert each ID into a hash. Then we parse each entry in fileB and check whether its ID is in the hash we built while walking fileA. In the case of a match, we print the ID and LETTERS joined with a <tab> character.

With this solution, we look at each line of fileA and fileB exactly once, and each ID check is a hash lookup, which takes constant time on average. This reduces the complexity from O(n^2) or worse in the previous solution to roughly O(n) in the combined size of the two files, provided the IDs hash well.
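
The idea can be tried without any files. The following is a small in-memory sketch, assuming the sample data from the end of this post; the 9ZZZ entry is made up so that there is something to filter out.

```perl
use strict;
use warnings;

# Sample data taken from the end of this post.
my @ids   = qw(1DWK 2RFK 4ERH);
my @pairs = (
    [ '1DWK', 'HRSDKKDAHJKLSDLDLLJDGHDFJJE' ],
    [ '4ERH', 'DFSKFHADFSBVHFWIHFWJBFS' ],
    [ '2RFK', 'DADUHRQWERKBNJAIJDLAJDKAKDNAKDJKSADJKAHDJASHRWEUB' ],
    [ '9ZZZ', 'NOTWANTED' ],    # made-up entry that should be filtered out
);

# One pass over the IDs to build the lookup hash.
my %wanted = map { $_ => 1 } @ids;

# One pass over the pairs; each membership test is a single hash lookup,
# so there is no inner loop over @ids.
my @matches = grep { $wanted{ $_->[0] } } @pairs;

print "$_->[0]\t$_->[1]\n" for @matches;
```

The matches come out in fileB order, because fileB is the list being walked and the hash is only consulted for membership.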

The Code

#!/usr/bin/perl
use warnings;
use strict;

# Open filea and parse all ID strings.
# Add the ID strings as keys to the %wanted hash.
my %wanted;
{
    # Use 'or', not '||': with '||' the die binds to the filename
    # string and never fires when open fails.
    open my $file, '<', 'filea' or die "failed to open filea : $!";
    while (<$file>) {
        chomp;
        my @ids = split ' ', $_;    # one or many IDs per line
        $wanted{$_}++ for @ids;
    }
    close $file;
}

# Read fileb, parse lines of the form "id <whitespace> letters",
# and print lines that match the ID strings from filea.
{
    open my $file, '<', 'fileb' or die "failed to open fileb : $!";
    while (<$file>) {
        chomp;
        my ($id, $letters) = split ' ', $_;
        next unless defined $id && defined $letters;    # skip blank or short lines
        print "$id\t$letters\n" if $wanted{$id};
    }
    close $file;
}

# OR

# Read fileb, parse lines of the form "id <newline> letters",
# and print lines that match the ID strings from filea.
{
    open my $file, '<', 'fileb' or die "failed to open fileb : $!";
    while (<$file>) {
        my $id      = $_;
        my $letters = <$file>;
        last unless defined $letters;    # trailing ID with no letters line
        chomp($id);
        chomp($letters);
        print "$id\t$letters\n" if $wanted{$id};
    }
    close $file;
}

__END__
FileA:
1DWK 2RFK 4ERH

FileB:
1DWK HRSDKKDAHJKLSDLDLLJDGHDFJJE
4ERH DFSKFHADFSBVHFWIHFWJBFS
2RFK DADUHRQWERKBNJAIJDLAJDKAKDNAKDJKSADJKAHDJASHRWEUB

FileB (alternate):
1DWK
HRSDKKDAHJKLSDLDLLJDGHDFJJE
4ERH
DFSKFHADFSBVHFWIHFWJBFS
2RFK
DADUHRQWERKBNJAIJDLAJDKAKDNAKDJKSADJKAHDJASHRWEUB

In reply to Re: Matching elements in two arrays and printing the element next to the match. by spazm
in thread Matching elements in two arrays and printing the element next to the match. by goingcrazy
