in reply to Hash making

Take pity on ikegami. Don't write Perl code to perform this task. Write out how you would solve the problem (step by step) in English. Don't talk about hashes or loops. Just describe how you would solve the problem if you had to do it manually.

That should give you an insight into how to code a solution.

Re^2: Hash making
by sesemin (Beadle) on Sep 21, 2008 at 21:14 UTC
    Thanks APL,

    This problem has drifted way off track. Let's start over, as you suggested.

    Simple question: if you have a tab-delimited file with, e.g., 4 columns, how would you read it over and over to extract data under different conditions? Let's focus on col4, whose values range from 0 to 20. I want to extract the lines where col4 == 4 and save the number of lines that met the condition somewhere. Then automatically increase the value to 5, see how many lines are extracted this time, and add the criterion (this time col4 == 5) and the number of lines read (just the count, not the actual lines) to the same place used for the previous iteration.

    You will end up with a structure like this.

    key (criterion) => value (number of lines extracted)
    0 => 2000
    1 => 1800
    2 => 1600
    and so on.

    Your thoughts are much appreciated.

      That's so much clearer! The solution is:
      my %counts;
      while (<$fh>) {
          chomp;
          my @fields = split /\t/;
          $counts{ $fields[3] }++;
      }

      You can print the results as follows:

      for ( sort { $a <=> $b } keys %counts ) {
          print("$_: $counts{$_}\n");
      }
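For anyone who wants to try this without a data file, here is a minimal self-contained sketch of the same counting idea. The sub name `count_column` and the three sample rows are made up for illustration, and the input is read through an in-memory filehandle rather than a real file.

```perl
use strict;
use warnings;

# Count how often each distinct value occurs in one column of
# tab-delimited input. $col is a zero-based column index.
# (count_column is a hypothetical helper name, not from the thread.)
sub count_column {
    my ($fh, $col) = @_;
    my %counts;
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split /\t/, $line;
        next unless defined $fields[$col];   # skip blank or short lines
        $counts{ $fields[$col] }++;
    }
    return \%counts;
}

# Hypothetical sample data: three rows, four tab-separated columns.
my $data = "a\tb\tc\t4\nd\te\tf\t5\ng\th\ti\t4\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $counts = count_column($fh, 3);           # column 4 is index 3
close $fh;

# Print the tallies in numeric order of the column value.
print "$_: $counts->{$_}\n" for sort { $a <=> $b } keys %$counts;
```

With the sample rows above, the value 4 is counted twice and the value 5 once. The file is read only one time; there is no need to re-read it for each criterion.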

      By the way, I said it was clearer, but it's still not that clear. You still used the word "extract", for starters. It appears to mean "count" in this case. You could have said "Count how many times each different value occurs in the 4th column", but you decided to talk about how to do it (code) instead of what you want (data).

      You don't need a hash for that; a simple array will do.

      Assume you want to check whether column 4 is equal to 4, 5, ..., N:

      • For each line in the file
        • Split the line into its component fields
        • For each $index in the range 4 through N inclusive
          • If column 4 equals $index, increment $count[$index]

      You can print @count out at the end. If you want to store each line that meets a given criterion, make a two-dimensional array (the first dimension would be $index, the second the $count[$index] value before you increment it).
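A rough sketch of the array approach, under a few assumptions: column 4 holds non-negative integers, the sub name `tally_range` and the sample rows are invented for illustration, and the column value is used directly as the array index instead of looping over every candidate $index for each line. The matching lines are kept in an array of arrays, which is one way to get the two-dimensional structure described above.

```perl
use strict;
use warnings;

# Tally column-4 values from $min through $max in an array, and keep
# the matching lines in an array of arrays indexed by the same value.
# (tally_range is a hypothetical helper name, not from the thread.)
sub tally_range {
    my ($fh, $min, $max) = @_;
    my (@count, @lines);
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split /\t/, $line;
        my $value  = $fields[3];
        # Assumption: only non-negative integers are valid indices.
        next unless defined $value && $value =~ /^\d+$/;
        next if $value < $min || $value > $max;
        push @{ $lines[$value] }, $line;     # second dimension grows with the count
        $count[$value]++;
    }
    return (\@count, \@lines);
}

# Hypothetical sample data; the last row falls outside the 4..6 range.
my $data = "a\tb\tc\t4\nd\te\tf\t5\ng\th\ti\t4\nj\tk\tl\t9\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my ($count, $lines) = tally_range($fh, 4, 6);
close $fh;

# Values that never occurred have no entry, so report them as 0.
for my $index (4 .. 6) {
    printf "%d: %d\n", $index, $count->[$index] // 0;
}
```

An array works here only because the criteria are small integers. For arbitrary strings or sparse, large values, the hash from the earlier reply is the better fit.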