Again, your set of test data is poor: when the file1 lines are printed, every one of them now comes out as E.

File 2 can also be represented as a hash structure. The hash keys are the numeric values, and each hash value is an array of "chr" strings. This allows more than one chrX value to be associated with a single numeric value. I'm not sure whether that is needed, but this code allows for the possibility.
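To make the shape of that structure concrete, here is a minimal sketch. The build_lookup helper and the two sample lines are made up for illustration and are not part of the program below; the sketch only assumes the three-column layout of file 2.

```perl
use strict;
use warnings;

# Hypothetical helper: build a hash of arrays from file2-style lines.
# Key = numeric value, value = array ref of every chr seen with that number.
sub build_lookup {
    my @lines = @_;
    my %chr_for;
    for my $line (@lines) {
        next if $line =~ /^\s*$/;               # skip blank lines
        my ($chr, @values) = split ' ', $line;  # chr, then the numeric columns
        push @{ $chr_for{$_} }, $chr for @values;
    }
    return \%chr_for;
}

my $lookup = build_lookup(
    "chr7    151046672    127680920",
    "chr3    127680920    151046672",
);

# Each number now maps to both chr names, in the order they were read.
print "$_ => @{ $lookup->{$_} }\n" for sort keys %$lookup;
```

Because push appends to the array, a number that shows up on several lines simply collects several chr names, which is the "more than one chrX per number" case mentioned above.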

#!/usr/bin/perl
use warnings; 
use strict;
use Data::Dumper;

my $file1 = <<END;
chr7    151046672
chr7    151047369
chr3    127680920
chr3    127680920
END

my $file2 = <<END;
chr1    66953622    66953654
chr1    67200451    67200472
chr1    67200475    67200478
chr1    67058869    67058880
chr1    67058881    67058885
chr7    151046672    127680920
chr7    151047369    127680920
chr3    127680920    151046672
chr3    127680920    151047369
END

open my $infile1, '<', \$file1 or die "unable to open first file $!";
open my $infile2, '<', \$file2 or die "unable to open 2nd file $!";

### create memory structure of file 2:
### so that we only have to read file2 once!
#

my %file2_hash;

while (my $line = <$infile2>)
{
   next if $line =~ /^\s*$/;   #skip blank lines (a common infile goof)
   
   # use better "names"; I have no idea of what a chr col means
   my ($chr, $value1, $value2) = split /\s+/, $line;
   push @{$file2_hash{$value1}},$chr; 
   push @{$file2_hash{$value2}},$chr; 
}
close $infile2;  # file handle closure is optional, but I'd do it.

###  process each line in file1:
###  If a line "matches" with any line in file2, then "E", else "M"
###  I don't know what these numbers mean, come up with a better comment.

while (my $line = <$infile1>)
{
   chomp $line;  #so that output with E or M can be on same line
   next if $line =~ /^\s*$/;   #skip blank lines (a common infile goof)
   
   my ($chr, $val1) = split /\s+/,$line;
   
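   # compare each stored chr with this line's chr; "|| []" keeps grep
   # from autovivifying an entry for a number that is not in file2
   if ( grep { $_ eq $chr } @{ $file2_hash{$val1} || [] } )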
   {
      print "$line\tE\n";  # match exists with file 2
   }
   else
   {
      print "$line\tM\n";  # match does NOT exist with file 2
   }
}


__END__
Prints the following:
chr7    151046672    E
chr7    151047369    E
chr3    127680920    E
chr3    127680920    E
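Since the sample data only ever produces E, here is a small sketch that exercises the M branch as well. The classify helper and the tiny lookup table are hypothetical and not part of the program above; the sketch assumes that a "match" means the same chr name is stored under the same number.

```perl
use strict;
use warnings;

# Hypothetical helper: 'E' if $chr is stored under $value, otherwise 'M'.
sub classify {
    my ($lookup, $chr, $value) = @_;
    my $chrs  = $lookup->{$value} || [];     # missing key => empty list
    my $found = grep { $_ eq $chr } @$chrs;  # count of exact chr matches
    return $found ? 'E' : 'M';
}

# Made-up lookup table in the same shape as %file2_hash.
my %lookup = (
    151046672 => [ 'chr7', 'chr3' ],
    66953622  => [ 'chr1' ],
);

for my $pair ( [ chr7 => 151046672 ], [ chr7 => 66953622 ], [ chr2 => 12345 ] ) {
    my ($chr, $value) = @$pair;
    print join( "\t", $chr, $value, classify( \%lookup, $chr, $value ) ), "\n";
}
```

The second pair has a number that is known but stored under a different chr, and the third has a number that is not in the table at all, so both should come out as M. Note that a grep block containing only $chr is true for every element, so it would report E whenever the number has any entry, whatever the chr.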

In reply to Re^9: compare two files on the basis of Two IDs by Marshall
in thread compare two files on the basis of Two IDs by genome
