Re^2: 'Simple' comparing a hash with an array

Thank you, monks. I've learnt a lot from this thread :-) I originally used a hash because perlfaq4 suggests one for comparing arrays, which avoids some iteration. However, I ended up iterating over the hash anyway, as I didn't know any other way :-) As some have asked, sample input for this problem is:

WE
regret
that
a
press
of
matter
prevents
our
noticing

I want to count the frequencies of certain words, as listed in mr_mischief's %histogram. As for output, something along the lines of the following is what I'm trying to obtain:

Found for, 1 times.
Found such, 1 times.
Found up, 1 times.
Found at, 2 times.
Found had, 1 times.
Found was, 1 times.
Cumulative total of all words found: 50

To obtain this, and because I don't need to be concerned about the case of the input, I have made a few small additions to mr_mischief's elegant code:

#!/usr/bin/perl
use strict;
use warnings;

my %histogram = map { $_ => 0 } qw(
a am an and are as at be been but by can co de do due each  
); # hash of the words to find so we can do an O(1) lookup for them

while ( <> ) {
    chomp;
    for ( split ) {
        # split returns a list we can use directly
        tr/A-Z/a-z/;
        # lowercase all input
        print "$_\n";
        # echo each word as it is read (debugging aid; drop it for a clean report)
        $histogram{ $_ }++ if exists $histogram{ $_ };
        # only store counts for words that matter
    }
}

my $count = 0;
foreach my $word ( keys %histogram ) {
    # keys() will list the keys, and we've already taken care
    # of making sure we don't have extra words stored.
    # Now there's no need to do two loops and check an array
    # against a hash.
    if ( $histogram{$word} > 0 ) {
        print "Found $word, $histogram{$word} times.\n";
        $count += $histogram{$word};
    }
}

print "Cumulative total of all words found: $count\n";
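For anyone who wants to try the counting step without an input file, here is a minimal sketch of the same idea wrapped in a function. The name `count_words`, the three-word list, and the sample lines are made up for illustration (the lines reuse words from the sample input above); it is not the code from the thread, just one way the lookup and the report might be exercised. The keys are sorted only to make the output order repeatable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count how often each wanted word occurs in @lines.
# $wanted is an array reference of lowercase words to look for.
sub count_words {
    my ( $wanted, @lines ) = @_;
    my %histogram = map { $_ => 0 } @$wanted;
    for my $line (@lines) {
        for my $word ( split ' ', $line ) {
            $word =~ tr/A-Z/a-z/;    # lowercase ASCII letters, as in the post
            $histogram{$word}++ if exists $histogram{$word};
        }
    }
    return \%histogram;
}

# Illustrative input only; a real run would read lines from <> instead.
my $hist = count_words( [qw(a of that)],
    "WE regret that", "a press of matter", "of a" );

my $total = 0;
for my $word ( sort keys %$hist ) {
    next unless $hist->{$word} > 0;
    print "Found $word, $hist->{$word} times.\n";
    $total += $hist->{$word};
}
print "Cumulative total of all words found: $total\n";
# prints:
# Found a, 2 times.
# Found of, 2 times.
# Found that, 1 times.
# Cumulative total of all words found: 5
```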

Thanks again

Re^3: 'Simple' comparing a hash with an array by mr_mischief (Monsignor) on Apr 17, 2008 at 15:28 UTC
If you're at all worried about locale and language issues, or if you're just concerned about doing things the canonical way, you can use `$_ = lc $_;` instead of `tr/A-Z/a-z/;` to get a lowercase version. lc and uc are built in, and they honor the current locale settings when `use locale` is in effect.
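To make the difference concrete, here is a small sketch comparing the two approaches on a non-ASCII letter. The helper names `lower_tr` and `lower_lc` are hypothetical, and the Greek omega is just a convenient test character: because its code point is above 255, Perl applies Unicode case rules to it without any locale setup.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Lowercase with tr///, which only maps the ASCII range A-Z.
sub lower_tr {
    my ($word) = @_;
    ( my $copy = $word ) =~ tr/A-Z/a-z/;
    return $copy;
}

# Lowercase with the built-in lc.
sub lower_lc {
    my ($word) = @_;
    return lc $word;
}

# "\x{3A9}" is GREEK CAPITAL LETTER OMEGA; "\x{3C9}" is the small letter.
my $omega = "\x{3A9}";

print "ASCII input: both agree\n" if lower_tr('WE') eq lower_lc('WE');
print "tr/A-Z/a-z/ left omega unchanged\n" if lower_tr($omega) eq $omega;
print "lc lowercased omega\n" if lower_lc($omega) eq "\x{3C9}";
```

For plain ASCII input like the sample above, the two give the same result, so the choice only matters once other alphabets or accented letters show up.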