in reply to Help with function, count matches and store in hash & retrieve sorted

To make my input files I used the following code:

sub get_index(){
    my $file = shift;
    my $indexfile;
    open FILE,$file;
    while(<FILE>){
        chomp;
        my @array = split /\,/,$_;
        if($#array){
            $indexfile->{$array[1]} = $array[0];
        }
    }
    close FILE;
    return $indexfile;
}

on a .txt file that looks like this:
1,TCGCCTTA
2,CTAGTACG
3,TTCTGCCT
4,GCTCAGGA
5,AGCGTAGC
6,CATGCCTA
7,GTAGAGAG
8,CCTCTCTG
9,TAGATCGC
10,CTCTCTAT
11,TATCCTCT
12,AGAGTAGA
13,GCGTAAGA
14,ACTGCATA
15,AAGGAGTA
16,CTAAGCCT
along with input files that are added as I iterate over a while loop. These files look like:
p1='CTAAGCCT'
I suppose it's possible that p1 is occasionally an empty string. I am not sure how I would print my entire list of query strings.
Please let me know if you need more information.


Replies are listed 'Best First'.
Re^2: Help with function, count matches and store in hash & retrieve sorted
by Marshall (Canon) on Jul 03, 2017 at 01:49 UTC
    OK, what would be most helpful is just the data structures that are used in the call to match_query($indexfile,$queryseq); and that reproduce the problem.

    I guess?:

    my $indexfile = {
        'GCTCAGGA' => '4',
        'AGCGTAGC' => '5',
        'TTCTGCCT' => '3',
        'CATGCCTA' => '6',
        'TCGCCTTA' => '1',
        'GTAGAGAG' => '7',
        'GCGTAAGA' => '13',
        'TAGATCGC' => '9',
        'CTAGTACG' => '2',
        'CTAAGCCT' => '16',
        'CTCTCTAT' => '10',
        'ACTGCATA' => '14',
        'AGAGTAGA' => '12',
        'TATCCTCT' => '11',
        'AAGGAGTA' => '15',
        'CCTCTCTG' => '8',
    };
    What is the $queryseq that goes with that table?
    What the heck does: "along with input files that are added as I iterate over a while loop" mean?

    You are asking a question about an error that you are getting in a subroutine. In the best case, you provide a complete set of runnable code. All we have to do is download and hit the "run" button to replicate exactly your problem.

    What we have now is kind of like a UFO report. If the problem can be reproduced (seen) by all, then there will be solutions forthcoming. Your job is to boil this down to a single set of inputs that demonstrates the problem. If the $indexfile structure above doesn't need 16 entries to demo the problem, then use fewer entries. Sometimes submitting an excellent "bug" report requires a lot of work to get the situation down to a minimal, easily reproducible case. I have certainly spent entire work weeks doing that for complex issues.
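As a rough illustration of what such a boiled-down report could look like, here is a minimal sketch that keeps the test data inside the script. It is not the poster's code: get_index_from_fh is a hypothetical name, the three index lines are an arbitrary subset of the posted file, the field check is written as @array >= 2 rather than the posted if($#array), and match_query is not shown in this thread, so its call is only a commented placeholder.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order so every run prints the same dump

# Assumption: a few index lines are enough to show the problem; trim further if possible.
my $index_text = "1,TCGCCTTA\n2,CTAGTACG\n16,CTAAGCCT\n";

# Adapted from the posted get_index, but reading from any filehandle so the
# sample data can live in the script itself (hypothetical helper name).
sub get_index_from_fh {
    my ($fh) = @_;
    my $indexfile = {};
    while ( my $line = <$fh> ) {
        chomp $line;
        my @array = split /,/, $line;
        # Store sequence => number only when both fields are present.
        $indexfile->{ $array[1] } = $array[0] if @array >= 2;
    }
    return $indexfile;
}

# Open a read handle on the in-memory string instead of a file on disk.
open my $fh, '<', \$index_text or die "Cannot open in-memory data: $!";
my $indexfile = get_index_from_fh($fh);
close $fh;

my $queryseq = 'CTAAGCCT';      # the p1 value from the sample input file

# Delimiters around the query make an empty or padded string visible.
print "queryseq = |$queryseq|\n";
print Dumper($indexfile);

# match_query($indexfile, $queryseq);   # the failing call would go here
```

Anyone can paste a script of this shape and run it without needing the original files, which is the point of the exercise.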

Re^2: Help with function, count matches and store in hash & retrieve sorted
by huck (Prior) on Jul 03, 2017 at 02:03 UTC

    I am not sure how I would print my entire list of query strings.

    How about just when there is a problem:

    ...
    my $max_count = shift @array;
    unless (defined ($max_count)) {
        print "queryseq = |||$queryseq|||\n";
        print Dumper($indexfile);
        exit;
    }
    my $element;
    $element->{bar} = $key;
    $element->{max} = $max_count;
    push @match_count, $element;
    ....
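A variation on the same idea, as a sketch only: report the suspect query on STDERR and carry on instead of exiting, so that every bad query in a run gets listed. This assumes the trouble comes from an empty or undefined $queryseq, which the thread has not confirmed. The helper names describe_problem and check_query are made up for illustration.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: build the diagnostic text for one suspect query.
sub describe_problem {
    my ( $queryseq, $indexfile ) = @_;
    local $Data::Dumper::Sortkeys = 1;    # stable key order in the dump
    my $shown = defined $queryseq ? $queryseq : '<undef>';
    # The ||| delimiters make an empty string or stray whitespace visible.
    return "queryseq = |||$shown|||\n" . Dumper($indexfile);
}

# Hypothetical helper: return 1 for a usable query; otherwise warn and return 0.
sub check_query {
    my ( $queryseq, $indexfile ) = @_;
    return 1 if defined $queryseq && length $queryseq;
    warn describe_problem( $queryseq, $indexfile );
    return 0;
}

my $indexfile = { CTAAGCCT => '16' };

for my $queryseq ( 'CTAAGCCT', '' ) {
    # Skip queries that fail the check; the warning has already been printed.
    next unless check_query( $queryseq, $indexfile );
    print "ok to search for $queryseq\n";
}
```

Because the diagnostics go through warn, they land on STDERR and can be redirected separately from the normal output.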

Re^2: Help with function, count matches and store in hash & retrieve sorted
by AnomalousMonk (Archbishop) on Jul 03, 2017 at 12:51 UTC
    if($#array){ $indexfile->{$array[1]} = $array[0]; }

    There may be a problem here. The statement
        $indexfile->{$array[1]} = $array[0];
    will be executed if the @array array is empty ($#array == -1; -1 is true), or if the @array array has two or more elements ($#array > 0), but not if it has exactly one element ($#array == 0; 0 is false). Is this what you want? (See discussion of the $# sigil (e.g., $#array) in perldata.)
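To see the difference, here is a small sketch comparing the posted test with an explicit element count. The two helper subs are hypothetical and exist only for the comparison. The three sample lines stand for a blank line, a line with no comma, and a well-formed line.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the posted test, if($#array), would pass.
sub passes_last_index_test {
    my @array = @_;
    return $#array ? 1 : 0;    # -1 and anything above 0 are true; only 0 is false
}

# Hypothetical helper: explicit alternative that asks for at least two fields.
sub has_two_fields {
    my @array = @_;
    return @array >= 2 ? 1 : 0;    # array in numeric context gives its element count
}

for my $line ( '', '7', '7,GTAGAGAG' ) {
    # split on an empty string returns an empty list, so @array has no elements.
    my @array = split /,/, $line;
    printf "line=[%s] fields=%d last_index=%d posted_test=%d explicit_test=%d\n",
        $line, scalar(@array), $#array,
        passes_last_index_test(@array), has_two_fields(@array);
}
```

The blank line passes the posted test but fails the explicit one. In the posted loop a blank line would therefore reach the assignment with both $array[0] and $array[1] undefined.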


    Give a man a fish:  <%-{-{-{-<