in reply to Re: Searching Huge files
in thread Searching Huge files

Hi GrandFather, can you please explain the flow of the program in more detail? I come from a biology background and this is my first Perl program. This logic is very helpful to me because I can reuse it many times in my work, so I want to understand it rather than just copy the code. Please also guide me on populating a hash from a file and searching through a file. Thanks a lot.

Re^4: Searching Huge files
by GrandFather (Saint) on Jul 08, 2008 at 04:46 UTC

    open my $snpIn, '<', \$snpFile uses a variable as though it were a file. It's a useful trick for test code because you don't need a separate file. Simply replace the \$snpFile bit with the file name you would normally use with the open.
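
    As a minimal sketch of that trick: the sample data and the file name snps.txt below are made up for illustration and are not from the original code.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for a real SNP file.
my $snpFile = "rs7837688 9.85374546064131\nrs10499549 10.4656064706897\n";

# Passing a reference to a scalar makes open read the string as if it were a file.
open my $snpIn, '<', \$snpFile or die "Can't open in-memory file: $!";
my @lines = <$snpIn>;
close $snpIn;

print "Read ", scalar @lines, " lines\n";

# With a real file only the third argument changes (the name is a placeholder):
# open my $snpIn, '<', 'snps.txt' or die "Can't open snps.txt: $!";
```

    The rest of the program does not care which form of open was used, so test code and real code can share the same loops.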

    The code that populates the hash uses a couple of tricks so that it is compact. Expanded it might look like:

    while (<$snpIn>) {
        next unless /^(\w+)\s+(\d+\.\d+)/;
        $snpLookup{$1} = $2;
    }

    Note that the original code used while as a statement modifier and the code above uses unless as a statement modifier. Note too that the value given on the input line is the value associated with the key (the first 'word' on the line). You could instead assign $. which would give the line number, or you could ++$snpLookup{$1} instead which would give a count of the number of entries for that 'word' in the file.
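
    To compare the three choices side by side, here is a rough sketch that fills one hash per variant. The hash names %score, %lineNo and %count and the sample data are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical sample data; 'rs1' appears twice on purpose.
my $snpFile = "rs1 9.85\nrs2 10.47\nrs1 11.73\n";

my (%score, %lineNo, %count);

open my $snpIn, '<', \$snpFile or die "Can't open in-memory file: $!";
while (<$snpIn>) {
    next unless /^(\w+)\s+(\d+\.\d+)/;
    $score{$1}  = $2;    # value from the line (a later line overwrites an earlier one)
    $lineNo{$1} = $.;    # line number where the key was last seen
    ++$count{$1};        # how many lines started with this key
}
close $snpIn;

print "$_: score $score{$_}, last line $lineNo{$_}, seen $count{$_}\n"
    for sort keys %score;
```

    Which one you want depends on what you need to look up later; for a plain "is this key present" test any of them will do.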

    In like fashion the search loop can be expanded:

    while (<$mapIn>) {
        next unless /^(\w+)\s+(\w+)/ and exists $snpLookup{$1};
        print "$1 $2\n";
    }

    The important test is exists $snpLookup{$1}, which checks whether the first 'word' on the line was also a first 'word' in the first file. The test is only made if the regular expression succeeds. Using the regular expression in that way skips lines that don't match, such as a blank line at the end of the file or a line that is not in the format you expect. See perlretut and perlre for more about regular expressions.
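
    Putting the two loops together, a small self-contained sketch might look like the following. The sample data (including the unknown key and the blank line) and the @matches array are made up so the result can be inspected; the original simply prints each match.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two real files.
my $snpFile = "rs7837688 9.85\nrs10499549 10.47\n";
my $mapFile = "rs7837688 NP_817124\nrs0000001 UNKNOWN\n\nrs10499549 ZMYND11\n";

# Build the lookup from the first file.
my %snpLookup;
open my $snpIn, '<', \$snpFile or die "Can't open SNP data: $!";
while (<$snpIn>) {
    next unless /^(\w+)\s+(\d+\.\d+)/;
    $snpLookup{$1} = $2;
}
close $snpIn;

# Scan the second file, keeping lines whose first word is a known key.
my @matches;
open my $mapIn, '<', \$mapFile or die "Can't open map data: $!";
while (<$mapIn>) {
    # Blank or malformed lines fail the regex, so exists is never reached for them.
    next unless /^(\w+)\s+(\w+)/ and exists $snpLookup{$1};
    push @matches, "$1 $2";
}
close $mapIn;

print "$_\n" for @matches;
```

    Only the first file is held in memory; the second is read one line at a time, which is what makes this approach workable for huge files.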


    Perl is environmentally friendly - it saves trees
      Thank you very much, GrandFather, for taking the time to write such a detailed explanation.

      Hello GrandFather, I have a new problem now: I also need the score from the SNP file (the first file), so my output should look something like this.

      rs7837688 NP_817124 9.85374546064131
      rs10499549 ZMYND11 10.4656064706897
      rs3749375 ZMYND11 11.7268615355335

      I'm confused. Can you help me, please? Thank you in advance.

        Time for you to reread what has been presented already and engage your brain to solve the problem. Strong hint: $snpLookup{$1} = $2; puts the value you want into the %snpLookup hash and in print "$1    $2\n"; $1 contains the key.

