
Generally, nested loops are a code smell. Nesting loops four deep goes beyond stinking to somewhere around putrid! For modest-sized files (say, up to a few hundred megabytes for the smaller of the two; the size of the second file doesn't matter because it is read one line at a time), reading the smaller file into a hash and using that as a lookup is the preferred solution. Consider:

use strict;
use warnings;

my $inData1 = <<DATA1;
G_00160 F_02571
G_00161 F_01082
G_00162 F_00034
G_00163 F_00035
G_00164 F_00036
DATA1

my $inData2 = <<DATA2;
F_00013 G_06670
F_00034 G_00162
F_00035 G_00163
F_00036 G_00164
F_00038 G_00165
DATA2

open my $ur_ci, "<", \$inData1;
my %urCi = map {chomp; split} <$ur_ci>;
close $ur_ci;

my %matches;
open my $ci_ur, "<", \$inData2;
while (<$ci_ur>) {
    chomp;
    my ($ci, $ur) = split;

    $matches{$ci} = $ur if exists $urCi{$ur} && $urCi{$ur} eq $ci;
}

print "$_ => $matches{$_}\n" for sort keys %matches;

Prints:

F_00034 => G_00162
F_00035 => G_00163
F_00036 => G_00164
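
For real files rather than in-memory strings, the same lookup can be wrapped in a subroutine that takes two file handles. This is only a sketch: the name reciprocal_matches and the command-line file arguments are made up for illustration, and it assumes whitespace-separated two-column lines as in the sample data.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns a hash ref of
# ci => ur for every pair that appears in both inputs.
# $small_fh yields "ur ci" lines, $large_fh yields "ci ur" lines.
sub reciprocal_matches {
    my ($small_fh, $large_fh) = @_;

    # Load the smaller file into a lookup hash: ur => ci
    my %ur_to_ci;
    while (my $line = <$small_fh>) {
        my ($ur, $ci) = split ' ', $line;
        next unless defined $ci;    # skip blank or short lines
        $ur_to_ci{$ur} = $ci;
    }

    # Stream the larger file one line at a time; only matches are kept
    my %matches;
    while (my $line = <$large_fh>) {
        my ($ci, $ur) = split ' ', $line;
        next unless defined $ur;
        $matches{$ci} = $ur
            if exists $ur_to_ci{$ur} && $ur_to_ci{$ur} eq $ci;
    }

    return \%matches;
}

# File names come from the command line; nothing runs without them
my ($small_name, $large_name) = @ARGV;
if (defined $large_name) {
    open my $small_fh, '<', $small_name or die "Can't open $small_name: $!";
    open my $large_fh, '<', $large_name or die "Can't open $large_name: $!";

    my $matches = reciprocal_matches ($small_fh, $large_fh);
    print "$_ => $matches->{$_}\n" for sort keys %$matches;
}
```

Only the smaller file's pairs and the matches are held in memory, which is why the size of the second file doesn't matter.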

If even the smaller of your input files is too large to fit comfortably in memory (more than about a quarter of your RAM), then you should really consider using a database. If this is a one-off task, SQLite may be a good choice. Consider:

use strict;
use warnings;
use DBI;

my $inData1 = <<DATA1;
G_00160 F_02571
G_00161 F_01082
G_00162 F_00034
G_00163 F_00035
G_00164 F_00036
DATA1

my $inData2 = <<DATA2;
F_00013 G_06670
F_00034 G_00162
F_00035 G_00163
F_00036 G_00164
F_00038 G_00165
DATA2

unlink 'db.SQLite';

# RaiseError makes DBI die on any database error instead of failing silently
my $dbh = DBI->connect ("dbi:SQLite:dbname=db.SQLite", "", "", {RaiseError => 1});

$dbh->do ('CREATE TABLE urci (ci TEXT, ur TEXT)');
$dbh->do ('CREATE TABLE ciur (ur TEXT, ci TEXT)');

my $sth = $dbh->prepare ('INSERT INTO urci (ur, ci) VALUES (?, ?)');

open my $ur_ci, "<", \$inData1;
$sth->execute (do {chomp; split}) while <$ur_ci>;
close $ur_ci;

$sth = $dbh->prepare ('INSERT INTO ciur (ci, ur) VALUES (?, ?)');

open my $ci_ur, "<", \$inData2;
$sth->execute (do {chomp; split}) while <$ci_ur>;
close $ci_ur;

$sth = $dbh->prepare (
    'SELECT * FROM ciur INNER JOIN urci ON ciur.ci = urci.ci AND ciur.ur = urci.ur'
    );
$sth->execute ();

print "$_->{ci} => $_->{ur}\n" while $_ = $sth->fetchrow_hashref ();

Prints:

F_00034 => G_00162
F_00035 => G_00163
F_00036 => G_00164
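
For genuinely large inputs, two tweaks are likely to help: load all the rows inside one explicit transaction (in AutoCommit mode SQLite otherwise commits after every INSERT), and index the join columns before running the query. The following is a sketch under those assumptions rather than a tuned solution. It uses an in-memory database and a couple of hard-coded rows in place of the real files, and the index name urci_idx is arbitrary.

```perl
use strict;
use warnings;
use DBI;

# In-memory database so the sketch leaves no file behind
my $dbh = DBI->connect ('dbi:SQLite:dbname=:memory:', '', '',
    {RaiseError => 1, AutoCommit => 1});

$dbh->do ('CREATE TABLE urci (ur TEXT, ci TEXT)');
$dbh->do ('CREATE TABLE ciur (ci TEXT, ur TEXT)');

# Stand-in rows; real code would read these from the input files
my @urci = (['G_00162', 'F_00034'], ['G_00160', 'F_02571']);
my @ciur = (['F_00034', 'G_00162'], ['F_00013', 'G_06670']);

# One transaction around all the inserts instead of one commit per row
$dbh->begin_work ();
my $sth = $dbh->prepare ('INSERT INTO urci (ur, ci) VALUES (?, ?)');
$sth->execute (@$_) for @urci;
$sth = $dbh->prepare ('INSERT INTO ciur (ci, ur) VALUES (?, ?)');
$sth->execute (@$_) for @ciur;
$dbh->commit ();

# Index the columns used by the join so SQLite can look rows up directly
$dbh->do ('CREATE INDEX urci_idx ON urci (ci, ur)');

my $rows = $dbh->selectall_arrayref (
    'SELECT ciur.ci, ciur.ur FROM ciur
     INNER JOIN urci ON ciur.ci = urci.ci AND ciur.ur = urci.ur
     ORDER BY ciur.ci'
);

print "$_->[0] => $_->[1]\n" for @$rows;
```

Naming the columns in the SELECT instead of using * also avoids the duplicate ci and ur columns that the join otherwise returns.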

True laziness is hard work