
Ok, your solutions:

The first one is less than optimal. First, you're starting with $x = 1, which means that after the loop terminates $x will overstate the count by one. Why not start with $x = 0, and then pre-increment instead of post-incrementing $x? In other words, ++$x, instead of $x++. The next issue is the regexp you used. It will match just about anything containing "foo", including "foolish". Is that intentional? Maybe /^foo$/ would be better, or perhaps /\bfoo\b/. And the last thing to mention is the use of print within the loop. You're printing on each iteration, which creates an IO bottleneck, plus a lot of clutter. If $x started at zero, you could print after the loop terminates.
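
To make those three points concrete, here is a minimal sketch. Your original loop isn't quoted in this reply, so the loop shape is my assumption, and the sample data is borrowed from the script you posted later in the thread:

```perl
use strict;
use warnings;

# Sample data borrowed from the script posted later in this thread.
my @arr = qw(foo cho roh foo kho foo moo foo);

my $x = 0;                       # start at zero so $x ends up as the true count
for my $word (@arr) {
    ++$x if $word =~ /^foo$/;    # anchored, so "foolish" would not be counted
}
print "foo appears $x times\n";  # a single print, after the loop has finished
```

With that data the script reports four occurrences. The anchored /^foo$/ fits here because each element holds a single word; /\bfoo\b/ would be the choice for matching inside longer strings.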

Your second solution does a lot of extra work and wastes memory by creating $str as a temporary stringified version of @arr. And the other problem is that $& only shows the actual most recent match, not some count of the number of possible times the regular expression could have matched. Don't use a special variable; use this:

my $count = () = $str =~ m/\bfoo\b/g;

But I still feel it's a bad solution because you're creating a temporary string unnecessarily.
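
For anyone puzzled by the "() =" idiom, here is a small sketch of why it counts matches while $& does not. The variable names $count and $last are just for illustration, and the data is again the list from your script:

```perl
use strict;
use warnings;

my @arr = qw(foo cho roh foo kho foo moo foo);
my $str = join ' ', @arr;

# m//g in list context returns every match. Assigning that list to ()
# and then to a scalar yields the number of elements on the right side.
my $count = () = $str =~ m/\bfoo\b/g;

# For comparison: after a single match, $& holds only the matched text.
$str =~ m/\bfoo\b/;
my $last = $&;

print "count: $count, last match: $last\n";
```

This prints "count: 4, last match: foo". The special variable gives you the text of the most recent match and nothing more; the count comes from the list assignment.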

The grep solution is probably the best for a one-time count. The hash solution is probably better if you're doing the count several times, but it does have two problems: you're still creating the temporary copy (the hash), and the creation of a hash is a more computationally expensive operation than running through the array one time counting, as is done in the grep method.
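
A rough side-by-side of the two approaches, as a sketch only. The %freq name and the exact-match test with eq are my choices for illustration, not necessarily what was posted earlier in the thread:

```perl
use strict;
use warnings;

my @arr = qw(foo cho roh foo kho foo moo foo);

# One-time count: grep in scalar context returns the number of matches,
# and it makes a single pass over the array without copying it.
my $count = grep { $_ eq 'foo' } @arr;

# Repeated lookups: build the frequency hash once, then query it as
# often as needed. The hash is the temporary copy mentioned above.
my %freq;
$freq{$_}++ for @arr;

print "grep: $count\n";
print "hash: $freq{foo}\n";
```

Both lines print 4 for this data. The hash pays its setup cost up front and only wins when you need counts for several different words.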

One other thing: "utioecia". There you go; the keystrokes you saved by abbreviating "special" and "solution." You can cut and paste them into your future posts so that you can retain clarity without wasting those eight keystrokes. ;)


Dave

Re^3: Word frequency in an array
by blazar (Canon) on Jun 11, 2007 at 17:06 UTC
    Why not start with $x = 0, and then pre-increment instead of post-incrementing $x? In other words, ++$x, instead of $x++.

    Indeed. It's also worth mentioning, incidentally, that {pre,post}-{increment,decrement} behave intelligently on an undefined variable: they don't complain under warnings and, in the case of post-increment, the undefined value is coerced to a numeric one, that is, 0 is returned:

    errol:~ [19:01:32]$ perl -wMstrict -le 'my $x; print map $x++, (1) x 3'
    012
Re^3: Word frequency in an array
by cool (Scribe) on Jun 10, 2007 at 19:03 UTC
    Hi Dave,

    Thanks for the insight into the solutions and for giving me prototypes to copy and paste ;)

    And the other problem is that $& only shows the actual most recent match, not some count of the number of possible times

    Actually, that is what I mentioned; please read the comments in:

    #! /usr/bin/perl
    use strict;
    use warnings;

    my $x=1;
    my @arr= qw(foo cho roh foo kho foo moo foo);
    my $str=join ' ',@arr;
    $str=~ /foo/;
    print $&;
    #### In place of $&; we can use that for no
    #### for no. of matches.

    Now, I posted this piece to get suggestions from people about what in a regular expression can be used (in place of $&). But it can also be done using a special regex variable, if I am right? Any takers?

    That is, to count the number of matches in one go using a special variable, if there is one! I think I encountered that somewhere.