in reply to Re^3: Searching Huge files
in thread Searching Huge files

open my $snpIn, '<', \$snpFile opens the string held in $snpFile as though it were a file. It's a useful trick for test code because you don't need a separate file. To read a real file, simply replace \$snpFile with the file name you would normally pass to open.
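As a minimal sketch of that trick (the sample data and the file name in the final comment are made up for illustration, not taken from your files):

```perl
use strict;
use warnings;

# Made-up sample data standing in for the contents of a real SNP file.
my $snpFile = "rs0001 1.5\nrs0002 2.25\n";

# Passing a reference to a scalar as the third argument makes open
# read from the string instead of from a file on disk.
open my $snpIn, '<', \$snpFile or die "Can't open in-memory file: $!";
my @lines = <$snpIn>;
close $snpIn;

print "Read ", scalar @lines, " lines\n";

# For a real file, pass the file name instead (hypothetical name):
# open my $snpIn, '<', 'snp.txt' or die "Can't open snp.txt: $!";
```

Everything after the open behaves the same whether the handle is backed by a string or by a file, which is why it works well for small test cases.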

The code that populates the hash uses a couple of tricks so that it is compact. Expanded it might look like:

    while (<$snpIn>) {
        next unless /^(\w+)\s+(\d+\.\d+)/;
        $snpLookup{$1} = $2;
    }

Note that the original code used while as a statement modifier and the code above uses unless as a statement modifier. Note too that the value stored for each key (the first 'word' on the line) is the number that follows it on the input line. You could instead assign $. which would give the line number, or use ++$snpLookup{$1} which would give a count of the number of entries for that 'word' in the file.
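A rough sketch of those two alternatives side by side, assuming made-up sample data and two hypothetical hashes (%lineOf and %countOf) so that both variants can be shown in one loop:

```perl
use strict;
use warnings;

# Made-up sample data; rs0001 deliberately appears twice.
my $snpFile = "rs0001 1.5\nrs0002 2.25\nrs0001 3.0\n";

my (%lineOf, %countOf);    # hypothetical names, for illustration only
open my $snpIn, '<', \$snpFile or die "Can't open in-memory file: $!";
while (<$snpIn>) {
    next unless /^(\w+)\s+(\d+\.\d+)/;
    $lineOf{$1} = $.;      # line number on which the key was last seen
    ++$countOf{$1};        # how many lines start with this key
}
close $snpIn;

print "$_: last seen on line $lineOf{$_}, seen $countOf{$_} time(s)\n"
    for sort keys %countOf;
```

Because a later assignment overwrites an earlier one, the line number stored is the last line on which the key appeared. If you need the first, test with exists before assigning.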

In like fashion the search loop can be expanded:

    while (<$mapIn>) {
        next unless /^(\w+)\s+(\w+)/ and exists $snpLookup{$1};
        print "$1 $2\n";
    }

The important test is exists $snpLookup{$1}, which checks whether the first 'word' on the line was also a first 'word' in the first file. The test is only made if the regular expression succeeds, because and short-circuits. Using the regular expression in that way avoids possible nastiness at the end of the file (a trailing blank line, for example) and anywhere else the file format is not as you expect. See perlretut and perlre for more about regular expressions.
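Putting the two loops together, a self-contained test script might look something like the following. The data is made up, the @matches array is only there so the result can be inspected, and the map data includes a blank line and an unknown key to show both being skipped:

```perl
use strict;
use warnings;

# Made-up sample data; in real use these would be two files on disk.
my $snpFile = "rs0001 1.5\nrs0002 2.25\n";
my $mapFile = "rs0002 GENE_B\n\nrs0003 GENE_C\nrs0001 GENE_A\n";

# Build the lookup hash from the first file.
my %snpLookup;
open my $snpIn, '<', \$snpFile or die "Can't open SNP data: $!";
while (<$snpIn>) {
    next unless /^(\w+)\s+(\d+\.\d+)/;
    $snpLookup{$1} = $2;
}
close $snpIn;

# Scan the second file, keeping lines whose first word is a known key.
my @matches;
open my $mapIn, '<', \$mapFile or die "Can't open map data: $!";
while (<$mapIn>) {
    # The blank line fails the match; rs0003 fails the exists test.
    next unless /^(\w+)\s+(\w+)/ and exists $snpLookup{$1};
    push @matches, "$1 $2";
}
close $mapIn;

print "$_\n" for @matches;
```

Only the first file is held in memory as a hash; the second is read one line at a time, so its size doesn't matter much.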


Perl is environmentally friendly - it saves trees

Re^5: Searching Huge files
by biomonk (Acolyte) on Jul 08, 2008 at 13:06 UTC
    Thank you very much, GrandFather, for taking the time to write such a detailed explanation.
Re^5: Searching Huge files
by biomonk (Acolyte) on Jul 09, 2008 at 20:36 UTC

    Hello GrandFather, I have a new problem now. I need the score from the SNP file (the first file), so my output should look something like this:

    rs7837688 NP_817124 9.85374546064131
    rs10499549 ZMYND11 10.4656064706897
    rs3749375 ZMYND11 11.7268615355335

    I'm confused. Can you help me, please? Thank you in advance.

      Time for you to reread what has been presented already and engage your brain to solve the problem. Strong hint: $snpLookup{$1} = $2; puts the value you want into the %snpLookup hash and in print "$1    $2\n"; $1 contains the key.



        Hi GrandFather, thanks for your hint. I just did this: print "$2 $1 $snpLookup{$1}\n"; Is that the correct way to do it? Thanks a lot for your help.