I have tried many approaches but have not been able to get the result I need. I have two large tab-delimited files. The first file:

Col1           Col2    Col3 Col4     Col5        Col6       Col7    Col8
101_#2          1       H    F0       263        278        2       1.5
102_#1          1       6    F1       766        781        1       1.0
103_#1          2       15   V1       526        581        1       0.0
103_#1          2       9    V2       124        134        1       1.3
104_#1          1       12   V3       137        172        1       1.0
105_#1          1       17   F2       766        771        1       1.0

The second file:

Col1      Col2    Col3    Col4
97486     H       262     279
67486     9       118     119
87486     9       183     185
248233    9       124     134

A row of file 1 matches a row of file 2 when Col3 of file 1 equals Col2 of file 2 and the range Col5 to Col6 of file 1 lies within the range Col3 to Col4 of file 2. For every match I want to print the row from file 1 with Col1 of file 2 appended as an extra column. Expected output:

Col1      Col2    Col3 Col4     Col5        Col6       Col7    Col8   Col9
101_#2        1       H    F0       263        278        2       1.5       97486
103_#1        2       9    V2       124        134        1       1.3       248233
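
To make the rule concrete, here is a minimal sketch of the test for a single pair of rows. The helper name `row_matches` is made up for illustration, and I am assuming the ranges are inclusive at both ends, which is what the expected output suggests.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the labels are equal (file 1 Col3 vs
# file 2 Col2) and the file 1 range [start1, end1] lies inside the
# file 2 range [start2, end2]. Inclusive bounds are an assumption.
sub row_matches {
    my ($label1, $start1, $end1, $label2, $start2, $end2) = @_;
    return $label1 eq $label2
        && $start1 >= $start2
        && $end1   <= $end2;
}

# Two pairs taken from the sample data above.
print row_matches('H', 263, 278, 'H', 262, 279) ? "match\n" : "no match\n";
print row_matches('9', 124, 134, '9', 118, 119) ? "match\n" : "no match\n";
```

The first pair matches because 263..278 sits inside 262..279; the second does not because 134 is greater than 119.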

So far I have tried this, using a hash:

    @ARGV or die "No input file specified";
    open my $first,  '<', $ARGV[0] or die "Unable to open input file: $!";
    open my $second, '<', $ARGV[1] or die "Unable to open input file: $!";
    print scalar(<$first>);    # copy the header line of file 1

    # Map each start (Col3) of file 2 to its end (Col4).
    while (<$second>) {
        chomp;
        @line = split /\s+/;
        $hash{$line[2]} = $line[3];
    }

    while (<$first>) {
        @cols = split /\s+/;
        $p1 = $cols[4];
        $p2 = $cols[5];
        foreach $key (sort keys %hash) {
            # Print the file 1 row if its range lies inside [$key, $hash{$key}].
            if ($p1 >= $key && $p2 <= $hash{$key}) {
                print join("\t", @cols), "\n";
            }
        }
    }

The code above does not yet compare Col3 of file 1 with Col2 of file 2, and it also takes a lot of time and memory. Can anybody suggest how I can make it fast using hashes or a hash of hashes? Thanks a lot.

Hello everyone,

Thanks a lot for your help. I figured out an efficient way to solve my own question.

    @ARGV or die "No input file specified";
    open my $first,  '<', $ARGV[0] or die "Unable to open input file: $!";
    open my $second, '<', $ARGV[1] or die "Unable to open input file: $!";
    print scalar(<$first>);    # copy the header line of file 1

    # Index file 2 as $hash{Col2}{Col3}{Col4} = Col1.
    while (<$second>) {
        chomp;
        @line = split /\s+/;
        $hash{$line[1]}{$line[2]}{$line[3]} = $line[0];
    }

    while (<$first>) {
        @cols = split /\s+/;
        foreach $key1 (sort keys %hash) {
            foreach $key2 (sort keys %{$hash{$key1}}) {
                foreach $key3 (sort keys %{$hash{$key1}{$key2}}) {
                    # Same label, and the file 1 range lies inside [$key2, $key3].
                    if (($cols[2] eq $key1) && ($cols[4] >= $key2) && ($cols[5] <= $key3)) {
                        print join("\t", @cols), "\t", $hash{$key1}{$key2}{$key3}, "\n";
                    }
                }
            }
        }
    }
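
For what it's worth, the outer loop over every key of the hash can be avoided by looking up the label of the file 1 row directly. Below is a rough sketch of that idea using a hash of arrays instead of three hash levels. The names `add_range` and `match_row` are made up for this example, and I am assuming both files start with a header line and that the bounds are inclusive. It is not benchmarked on the real data.

```perl
use strict;
use warnings;

# Hypothetical helper: store one file 2 line in the index as
# label (Col2) => [ [start, end, id], ... ].
sub add_range {
    my ($index, $line) = @_;
    my ($id, $label, $start, $end) = split ' ', $line;
    return unless defined $end;    # skip blank or short lines
    push @{ $index->{$label} }, [ $start, $end, $id ];
    return;
}

# Hypothetical helper: return one tab-joined output line for every
# file 2 range that has the same label as the file 1 row and contains
# its Col5..Col6 range.
sub match_row {
    my ($index, $line) = @_;
    my @cols = split ' ', $line;
    return () unless @cols >= 6;
    my @out;
    for my $range (@{ $index->{ $cols[2] } || [] }) {
        my ($start, $end, $id) = @$range;
        push @out, join("\t", @cols, $id)
            if $cols[4] >= $start && $cols[5] <= $end;
    }
    return @out;
}

# Runs only when two file names are given: file 1, then file 2.
if (@ARGV == 2) {
    open my $first,  '<', $ARGV[0] or die "Unable to open $ARGV[0]: $!";
    open my $second, '<', $ARGV[1] or die "Unable to open $ARGV[1]: $!";

    my %index;
    my $header2 = <$second>;       # assumed header line of file 2, discarded
    while (my $line = <$second>) {
        add_range(\%index, $line);
    }

    my $header1 = <$first>;        # assumed header line of file 1
    print join("\t", split(' ', $header1), 'Col9'), "\n" if defined $header1;
    while (my $line = <$first>) {
        print "$_\n" for match_row(\%index, $line);
    }
}
```

With this layout each row of file 1 is compared only against the file 2 ranges that share its label, and file 1 is read line by line, so only file 2 has to fit in memory.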

In reply to Using perl hash for comparing columns of 2 files by vikas.bansal
