
Thank you, monks. I've learnt a lot from this thread :-) I originally used a hash because perlfaq4 suggests one for comparing arrays, as a way to avoid some iteration. However, I ended up iterating over the hash anyway because I didn't know any other way :-)

As some have asked, here is sample input for this problem:
WE regret that a press of matter prevents our noticing
I want to count the frequencies of certain words, as listed in mr_mischief's %histogram. The output I'm trying to obtain is something along the lines of the following:
    Found for, 1 times.
    Found such, 1 times.
    Found up, 1 times.
    Found at, 2 times.
    Found had, 1 times.
    Found was, 1 times.
    Cumulative total of all words found: 50
To obtain this, and because I don't need to be concerned about the case of the input, I have made a few very slight additions to mr_mischief's elegant code:
    #!/usr/bin/perl
    use strict;
    use warnings;

    # hash of the words to find so we can do an O(1) lookup for them
    my %histogram = map { $_ => 0 } qw(
        a am an and are as at be been but by can co de do due each
    );

    while ( <> ) {
        chomp;
        for ( split ) {      # split returns a list we can use directly
            tr/A-Z/a-z/;     # lowercase all input
            print "$_\n";
            # only store counts for words that matter
            $histogram{ $_ }++ if exists $histogram{ $_ };
        }
    }

    my $count = 0;
    foreach my $word ( keys %histogram ) {
        # keys() will list the keys, and we've already taken care
        # of making sure we don't have extra words stored.
        # Now there's no need to do two loops and check an array
        # against a hash.
        if ( $histogram{$word} > 0 ) {
            print "Found $word, $histogram{$word} times.\n";
            $count = $count + $histogram{$word};
        }
    }
    print "Cumulative total of all words found: $count\n";
Thanks again

Re^3: 'Simple' comparing a hash with an array
by mr_mischief (Monsignor) on Apr 17, 2008 at 15:28 UTC
    If you're at all worried about locale and language issues, or if you just want to do things the canonical way, you can use $_ = lc $_; instead of tr/A-Z/a-z/; to get a lowercase version. lc and uc are built in, and they honor the current locale when use locale is in effect.
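To make the difference concrete, here is a minimal sketch comparing the two approaches. The helper names lower_tr and lower_lc are made up for illustration and do not appear in the thread, and the Greek test character is just an assumed example of non-ASCII input. On plain ASCII words the two agree; on a character outside A-Z, tr/A-Z/a-z/ leaves it untouched while lc lowercases it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: lowercase with tr///, which maps only ASCII A-Z.
sub lower_tr {
    my ($word) = @_;
    ( my $copy = $word ) =~ tr/A-Z/a-z/;   # work on a copy, leave the argument alone
    return $copy;
}

# Hypothetical helper: lowercase with the built-in lc.
sub lower_lc {
    my ($word) = @_;
    return lc $word;
}

# "\x{3A9}" is GREEK CAPITAL LETTER OMEGA, chosen as a non-ASCII uppercase letter.
for my $word ( 'WE', 'Regret', "\x{3A9}" ) {
    # Show the input as code points so no wide characters are printed.
    my $label   = join ' ', map { sprintf 'U+%04X', ord } split //, $word;
    my $verdict = lower_tr($word) eq lower_lc($word) ? 'same result' : 'results differ';
    print "$label: $verdict\n";
}
```

In the posted loop, swapping the tr/A-Z/a-z/; line for $_ = lc $_; should give the same counts for ASCII-only input like the sample above, so the change only matters once other alphabets or accented letters show up.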