Re: What's the best way to do a pattern search like this?

Neglecting for a moment that the devil is in the details:

sub word_count {
  my %h;
  $h{$_}++ for pop =~ /\w+/g;
  %h;
}

## Example ##

use Data::Dumper;
my $s = 'fee fi fo fo fi fee fo fum fum bar baz';
my %h = word_count($s);
print Dumper \%h;

You may want to replace the regex with something like /[a-z]+(?:'[a-z]+)?/gi, in order to properly count contractions such as "don't" as single words.
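To see what that substitution changes, here is a minimal sketch. The name word_count_contractions is hypothetical (it is not in the post above), and the sample string is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical variant of word_count using the suggested regex.
sub word_count_contractions {
    my ($text) = @_;
    my %h;
    # An optional apostrophe-plus-letters tail keeps "don't" in one piece;
    # with /\w+/ it would be counted as "don" and "t".
    $h{$_}++ for $text =~ /[a-z]+(?:'[a-z]+)?/gi;
    return %h;
}

my %h = word_count_contractions("don't stop, don't go");
print "$_ => $h{$_}\n" for sort keys %h;
# don't => 2
# go => 1
# stop => 1
```

Note that this pattern only accepts letters, so digits and underscores that \w+ would have matched are now skipped.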

   MeowChow                                   
               s aamecha.s a..a\u$&owag.print


Replies are listed 'Best First'.

Re: Re: What's the best way to do a pattern search like this?
by tachyon (Chancellor) on Jul 20, 2001 at 10:56 UTC

supernewbie wanted an explanation of MeowChow's sub:

First declare the sub

sub word_count {

Next we declare a lexically scoped hash called %h. The % indicates that this is a hash, and the h is a typical MeowChow explanatory long var name :-)

my %h;

This is a bit of very idiomatic Perl:

$h{$_}++ for pop =~ /\w+/g;

It is fairly easy to understand if you read it right to left. The expression:

pop =~ /\w+/g

pop removes the last value from @_, the array of arguments passed to a subroutine (called like mysub(@myarray)). Since word_count is called with a single string, this gets us the value passed to the sub. We then use a regular expression to match \w+, which is a run of word characters (letters, digits and underscore, as many in a row as possible) but not whitespace. Because the match is evaluated in LIST context by the for, the /g match returns a list of all the words, which the for iterates over, assigning each value to the magical $_ variable.
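If the one-liner is hard to follow, the same steps can be written out one at a time. This is only a sketch: extract_words is a hypothetical name, not part of MeowChow's code, and the sample string is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: the steps of `pop =~ /\w+/g`, spelled out.
sub extract_words {
    my $text  = pop @_;           # inside a sub, a bare pop works on @_ too
    my @words = $text =~ /\w+/g;  # list context: /g returns every match
    return @words;
}

my @words = extract_words('fee fi fo_2 fum');
print join(',', @words), "\n";    # fee,fi,fo_2,fum
```

The fo_2 in the sample is there to show that \w matches digits and underscores as well as letters.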

Finally we use our hash to count the occurrences of each word. A hash stores key/value pairs, and the key we are using is $_. The ++ part increments the value of $h{$_} by one each time we see the key.
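The counting step on its own looks roughly like this. Again a sketch, with tally as a hypothetical name and a made-up word list:

```perl
use strict;
use warnings;

# Hypothetical helper: count how often each item appears in a list.
sub tally {
    my %count;
    for my $word (@_) {
        # A key seen for the first time has no value yet (undef);
        # ++ treats that as 0, so the entry becomes 1 without a warning.
        $count{$word}++;
    }
    return %count;
}

my %count = tally(qw(fo fi fo fum fo));
print "$_ => $count{$_}\n" for sort keys %count;
# fi => 1
# fo => 3
# fum => 1
```

The keys are sorted before printing because a hash does not keep them in any particular order.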

%h

In a Perl sub, the sub returns the value of the last expression evaluated, so this is shorthand for the more usual return %h
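A quick way to convince yourself that the two forms behave the same when the result is assigned to a hash. The sub names and the sample data here are hypothetical, chosen only for the comparison:

```perl
use strict;
use warnings;

# Two hypothetical subs that differ only in how they return.
sub implicit_return { my %h = (a => 1, b => 2); %h }          # last expression is returned
sub explicit_return { my %h = (a => 1, b => 2); return %h; }  # same thing, spelled out

# Flatten a hash into a stable string so two hashes can be compared.
sub as_string {
    my %h = @_;
    return join ',', map { "$_=$h{$_}" } sort keys %h;
}

my %x = implicit_return();
my %y = explicit_return();
print as_string(%x) eq as_string(%y) ? "same\n" : "different\n";   # same
```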

Hope this helps

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
