in reply to Re: counting instances of one array in another array
in thread counting instances of one array in another array

Thank you very much for your reply. I get the general idea, but my problem is that I need to end up with an array of size @perms that has counts of each item from @perms that occur in @observed. The reason is that what I really will do is generate that coding for each of many words, generate the array giving the counts of observed patterns from the entire set of possible patterns, and then calculate similarity of those resulting vectors as an index of word similarity.

Your solution provides a hash with counts for each item in @observed, but then I need to associate those counts with the position of the corresponding element in @perms in @permCounts (or %permCounts!).

So if:

@observed = ("ab", "ab", "ad", "an", "bd", "bn", "dn");

and:

@perms = ("aa", "ab", "ac", "ad", "ae", "af", "ag", "an", "ba", "bd", "bh", "bn", "dn"); # @perms would never be something like this, # but just to give an example

Then the result I want is that @permCounts would be:

(0, 2, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1)

Thanks very much,

jim

Replies are listed 'Best First'.
Re^3: counting instances of one array in another array
by JavaFan (Canon) on Feb 23, 2012 at 13:22 UTC
    Well, good luck then. Be aware that there are 456976 possible combinations with words of length 4 (using the 26 letters of the English alphabet), and 308915776 words of length 6. The latter will take about 5.75Gb to store. If you want ngrams up to 8 characters, look around for a box with 3.7Tb of memory. And if you need to go to ngrams of length 12, you'd be looking at more than 1.5 Eb.
Re^3: counting instances of one array in another array
by jandrew (Chaplain) on Feb 23, 2012 at 17:07 UTC
    jsmagnuson Any chance that Math::Fleximal will allow you to treat your range of possible combinations as a number sequence and do math on the key to find the position? I like the hash-key solution and treating the keys as fleximals may solve your position question without building a full array of possibilites.
      For example

      Update: key value look-up was wrong - fixed

Re^3: counting instances of one array in another array
by Anonymous Monk on Feb 23, 2012 at 14:42 UTC
    my %seen; $seen{$_}++ for @observed; @permCounts = map { $seen{$_} || 0 } @perms;
      And then you're back to the OPs problem again: the array @perm. Which will be huge.