Re: Looping through a file, reading each line, and adding keys/editing values of a hash

You have a great start on your script! As for your 'struggles': 1) yes, and ++ is a perfectly appropriate and common construct; 2) the value for what? If you mean incrementing the hash value, you're doing it correctly. If you mean $row[18] to get the value of column 19, you're doing that correctly, too; and 3) no.

Here are some suggested changes to consider for your script:

use strict;
use warnings;

my $filename = $ARGV[0];
my %gene_count;

open my $fh, '<', $filename or die "Cannot open $filename: $!";

while ( my $line = <$fh> ) {
    chomp $line;
    my @row = split( "\t", $line );
    $gene_count{ $row[18] }++ if $row[18];
}

close($fh);

print "$_ => $gene_count{$_}\n" for sort keys %gene_count;

$ARGV{0} -> $ARGV[0]
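Made a few changes to your open; as written above, it uses a lexical filehandle, the three-argument form, and reports $! on failure.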
Added chomp because you're splitting on the tab character. If you don't chomp, a newline will remain on the end of the array's last element (except on the file's last line, if that line has no trailing newline).
= ++ -> ++
Added if $row[18] to check for 'good' key candidate. This check could be stronger, but is likely sufficient, in this case.
Just fyi: The parens of split and close are optional.
Added printing the sorted key/value pairs. (Just assumed you wanted to do that... :)
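
Two of the notes above can be seen in a small sketch: what chomp saves you from, and what a stronger key check might look like. The sample line, the key list, and the helper name last_field are made up for illustration; they are not from your script.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last tab-separated field of a line,
# chomping first only when asked to.
sub last_field {
    my ( $line, $do_chomp ) = @_;
    chomp $line if $do_chomp;
    my @row = split /\t/, $line;
    return $row[-1];
}

my $line    = "a\tb\tgeneX\n";
my $raw     = last_field( $line, 0 );    # newline still attached to the field
my $chomped = last_field( $line, 1 );    # newline removed before splitting

# One possible stronger check: accept any defined, non-empty string,
# so a key of "0" is counted instead of being skipped as false.
my %gene_count;
for my $key ( '0', '', undef, 'geneX' ) {
    $gene_count{$key}++ if defined $key && length $key;
}
```

With a plain truth test (if $key), the '0' key would be skipped along with the empty string and undef. Whether that matters depends on what column 19 can contain.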

Since you're passing the filename to your script on the command line, you can let Perl handle the file I/O with the <> operator. If you split on ' ' (whitespace), you don't need to chomp. You can also give split a LIMIT, so it doesn't split every column; this can significantly speed up splitting on wide rows. Given this, the following is functionally equivalent, provided no column is empty or contains embedded spaces:

use strict;
use warnings;

my %gene_count;

while (<>) {
    my @row = split ' ', $_, 20;
    $gene_count{ $row[18] }++ if $row[18];
}

print "$_ => $gene_count{$_}\n" for sort keys %gene_count;
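
To make the LIMIT and the whitespace split a little more concrete, here is a minimal sketch using made-up five-column and three-column lines (not your data). It shows why the LIMIT has to be one more than the index of the last column you need: the final field returned holds the unsplit remainder of the line.

```perl
use strict;
use warnings;

my $line = "c1 c2\tc3 c4 c5\n";

# No LIMIT: every field is split out. split ' ' treats any run of
# whitespace as a separator, so the newline never lands in a field.
my @all = split ' ', $line;

# LIMIT of 3: at most three fields come back, and the third is the
# unsplit remainder of the line, trailing newline included.
my @some = split ' ', $line, 3;

# Splitting on tab keeps an empty column; splitting on ' ' drops it,
# which shifts every later column down by one index.
my $gappy    = "a\t\tb\n";
my @by_tab   = split /\t/, $gappy;
my @by_space = split ' ', $gappy;
```

Here @some has three elements and $some[2] is "c3 c4 c5\n", while @by_tab has three elements and @by_space only two. So with a LIMIT of 20, $row[18] is a cleanly split 19th column, and the two versions agree only when no column is empty or contains spaces.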

Your original script's logic is good; only minor fixes were needed. You've done well...

Hope this helps!


Replies are listed 'Best First'.
Re^2: Looping through a file, reading each line, and adding keys/editing values of a hash by GrandFather (Saint) on Dec 05, 2013 at 08:12 UTC
Why use a post-increment instead of a pre-increment when the value is not being used? `$gene_count{ $row[18] }++` is (imo) better written `++$gene_count{$row[18]}` so the increment is obvious. True laziness is hard work
Re^3: Looping through a file, reading each line, and adding keys/editing values of a hash by Kenosis (Priest) on Dec 06, 2013 at 16:48 UTC
Interesting question. One reason is that the OP already attempted a post-increment, and since the two increment types would produce the same outcome, why make the change? Another reason is my personal preference for this counting situation. If, for example, I were on a sidewalk, tallying all the red cars that passed by, I wouldn't make a tally mark upon their approach (pre-increment), but rather after they crossed an imaginary line extending across the street from my position (post-increment). Perhaps this ultimately boils down to personal preference, in cases like these...
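
As a quick illustration of "the same outcome", here is a minimal sketch with made-up gene names. When the result of the expression is discarded, both forms leave the same tallies; they differ only when the value of the expression is actually used.

```perl
use strict;
use warnings;

my ( %post, %pre );
for my $gene (qw(geneA geneB geneA)) {    # hypothetical sample keys
    $post{$gene}++;    # post-increment, result discarded
    ++$pre{$gene};     # pre-increment, result discarded
}
# Both hashes now hold geneA => 2 and geneB => 1.

# The two forms differ only when the expression's value is used:
my $n   = 5;
my $old = $n++;    # $old gets the value before the increment
my $new = ++$n;    # $new gets the value after the increment
```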
Re^3: Looping through a file, reading each line, and adding keys/editing values of a hash by Anonymous Monk on Dec 05, 2013 at 20:16 UTC
There are different schools of thought there. I strongly favour the postincrement.
Re^4: Looping through a file, reading each line, and adding keys/editing values of a hash by GrandFather (Saint) on Dec 05, 2013 at 20:47 UTC
I know it's strongly favoured by many. I don't understand why. My strong preference is to use pre-increment where possible so the increment operator is more easily seen (trailing stuff is more easily ignored). But maybe I'm missing something important about the post-increment? A very minor consideration may be that the post-increment could be slower for some implementations than the pre-increment. The difference is so slight that it would be exceptionally unusual for that to be a consideration. True laziness is hard work
Re^5: Looping through a file, reading each line, and adding keys/editing values of a hash by choroba (Cardinal) on Dec 06, 2013 at 17:10 UTC
Re^2: Looping through a file, reading each line, and adding keys/editing values of a hash by Anonymous Monk on Dec 05, 2013 at 05:58 UTC
So, so helpful! Thank you. (And glad to see that I was on the right track and didn't need any major re-organizing). Thanks again! Cheers, Amelia
Re^3: Looping through a file, reading each line, and adding keys/editing values of a hash by Kenosis (Priest) on Dec 05, 2013 at 06:01 UTC
You're most welcome, Amelia!