Re: 'Simple' comparing a hash with an array

Let's start by eliminating unnecessary steps and by not storing more values in memory than necessary. With the fluff gone, the problem and its solution may be easier to see.

#!/usr/bin/perl
use strict;
use warnings;

my %histogram = map { $_ => 0 } qw(
    a am an and are as at be been but by can co de do due each
); # hash of the words to find so we can do an O(1) lookup for them

while ( <> ) {
    chomp;
    # split returns a list we can use directly
    for ( split ) {
        # only store counts for words that matter
        $histogram{ $_ }++ if exists $histogram{ $_ };
    }
}

foreach my $word ( keys %histogram ) {
# keys() will list the keys, and we've already taken care
# of making sure we don't have extra words stored.
# Now there's no need to do two loops and check an array
# against a hash.
    print "Found $word, $histogram{$word} times.\n";
}
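
If you want to try the same idea on a string instead of a file, here is a minimal sketch. The helper name count_words and the sample sentence are made up for illustration and are not part of the code above. It also sorts the keys, since keys() returns them in no particular order.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_words() is a hypothetical helper, not part of the original post.
# It counts only the words named in @wanted and ignores everything else.
sub count_words {
    my ( $text, @wanted ) = @_;
    my %histogram = map { $_ => 0 } @wanted;
    for my $word ( split ' ', $text ) {
        $histogram{$word}++ if exists $histogram{$word};
    }
    return \%histogram;
}

# made-up sample input and word list
my $counts = count_words( "we regret that a press of matter at a", qw( a at be ) );

# sort the keys so the report order is the same on every run
for my $word ( sort keys %$counts ) {
    print "Found $word, $counts->{$word} times.\n";
}
```

Words that never appear still show up with a count of 0, because every wanted word is seeded into the hash up front.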


Re^2: 'Simple' comparing a hash with an array by Anonymous Monk on Apr 17, 2008 at 14:16 UTC
Thank you monks. I've learnt a lot from this thread :-)

I had originally used a hash after a perlfaq4 suggestion about comparing arrays, which would avoid some iteration. However, I ended up iterating over the hash anyway as I didn't know any other ways :-)

As some have asked, sample input to this problem is:

`WE regret that a press of matter prevents our noticing`

I want to count frequencies of certain words, as listed in mr_mischief's %histogram. As for output, something along the lines of the following is what I'm trying to obtain:

Found for, 1 times.
Found such, 1 times.
Found up, 1 times.
Found at, 2 times.
Found had, 1 times.
Found was, 1 times.
Cumulative total of all words found: 50

To obtain this, and because I don't need to be concerned about the case of the input, I have added to mr_mischief's elegant code very slightly:

#!/usr/bin/perl
use strict;
use warnings;

my %histogram = map { $_ => 0 } qw(
    a am an and are as at be been but by can co de do due each
); # hash of the words to find so we can do an O(1) lookup for them

while ( <> ) {
    chomp;
    for ( split ) {
    # split returns a list we can use directly
        tr/A-Z/a-z/; # lowercase all input
        print "$_\n";
        $histogram{ $_ }++ if exists $histogram{ $_ };
        # only store counts for words that matter
    }
}

my $count=0;
foreach my $word ( keys %histogram ) {
# keys() will list the keys, and we've already taken care
# of making sure we don't have extra words stored.
# Now there's no need to do two loops and check an array
# against a hash.
    if ($histogram{$word} >0) {
        print "Found $word, $histogram{$word} times.\n";
        $count = $count + $histogram{$word};
    }
}
print "Cumulative total of all words found: $count\n";

Thanks again
Re^3: 'Simple' comparing a hash with an array by mr_mischief (Monsignor) on Apr 17, 2008 at 15:28 UTC
If you're at all worried about locale and language issues, or if you're just concerned about doing things the canonical way, you can use `$_ = lc $_;` instead of `tr/A-Z/a-z/;` to get a lowercase version. lc and uc are built in, and they honor the current locale settings when `use locale` is in effect.
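
To see the difference, here is a small sketch. The sample word is made up, and it uses a character above U+00FF so that Perl applies Unicode case rules regardless of locale. `tr/A-Z/a-z/` only maps the 26 ASCII letters, while lc also lowercases the accented capital.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# "\x{160}" is LATIN CAPITAL LETTER S WITH CARON; "\x{161}" is its lowercase form.
# The sample word is an assumption chosen only to show the difference.
my $word = "\x{160}KODA";

( my $via_tr = $word ) =~ tr/A-Z/a-z/;    # copies $word, then maps ASCII A-Z only
my $via_lc = lc $word;                    # built-in case mapping, handles U+0160 too

# print code points rather than the characters, to avoid output encoding issues
printf "tr: first char is U+%04X\n", ord $via_tr;
printf "lc: first char is U+%04X\n", ord $via_lc;
```

With tr the first character stays U+0160; with lc it becomes U+0161. For plain ASCII input like the sample sentence in this thread, the two give the same result.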
Re^2: 'Simple' comparing a hash with an array by wade (Pilgrim) on Apr 17, 2008 at 16:23 UTC
++mr_mischief, nice design! -- Wade