in reply to Packet Patterns redux - my script can't count

The first thing that comes to mind as a possible source of your problem is the way you are comparing, using a regular expression:
my @temparr = ($packets[$pnum] =~ /$str/g);
This is potentially inaccurate: if the ASCII representation of $str includes any regular expression metacharacters (like dots), the pattern may match things that should not count. It is also potentially dangerous, because some of those characters can form an invalid regex and make your program die at run time. You can get around both problems by quoting the string with \Q and \E, like this:
my @temparr = ($packets[$pnum] =~ /\Q$str\E/g);
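To make the difference concrete, here is a small sketch. The sample string and the variable names are made up for illustration and are not taken from your script:

```perl
use strict;
use warnings;

# Hypothetical sample data: the search string contains a dot,
# which is a regex metacharacter.
my $packet = "a.c abc axc";
my $str    = "a.c";

# Unquoted: the dot matches any character, so all three groups match.
my $unquoted = () = $packet =~ /$str/g;

# Quoted: \Q...\E escapes the dot, so only the literal "a.c" matches.
my $quoted = () = $packet =~ /\Q$str\E/g;

print "unquoted: $unquoted, quoted: $quoted\n";

# A lone "(" is not a valid regex, so interpolating it unquoted dies
# when the pattern is compiled; quoting it makes it a plain character.
my $bad       = "(";
my $ok_plain  = eval { my $re = qr/$bad/;       1 };
my $ok_quoted = eval { my $re = qr/\Q$bad\E/;   1 };

print "unquoted '(' compiles: ", ($ok_plain  ? "yes" : "no"), "\n";
print "quoted '(' compiles: ",   ($ok_quoted ? "yes" : "no"), "\n";
```

With this data the unquoted pattern counts 3 matches and the quoted one counts 1, and only the quoted "(" compiles.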
The second thing is that the logic of your loops seems more convoluted than it needs to be. It could probably be rewritten as something like this (untested):
foreach my $p (@packets) {
    foreach my $l (2..length($p)) {
        foreach my $pos (0..length($p)-$l) {
            my $str = substr($p, $pos, $l);
            # "() = ..." forces list context, so $n is the number of matches
            $all{$str} += $_ for map { my $n = () = /\Q$str\E/g; $n } @packets;
        }
    }
}
The map builds a list with the count of how many times $str appears in each element of @packets, and the for on that same line adds all those counts to the corresponding element of %all. Assigning the match to an empty list matters here: in plain scalar context a /g match only returns true or false, not the number of matches. I think it's essentially the same algorithm you had before, except that it does count the current packet in the matches.
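As a minimal sketch of just the counting step, here is the map applied to one fixed search string. The packets and the string are hypothetical sample values, and the count is taken by assigning the list-context match to an empty list:

```perl
use strict;
use warnings;

# Hypothetical sample packets and search string, for illustration only.
my @packets = ("abab", "ab", "xyz");
my $str     = "ab";

# One count per packet: a /g match in list context returns every
# (non-overlapping) match, and "() = ..." read as a scalar is the count.
my @counts = map { my $n = () = /\Q$str\E/g; $n } @packets;

# Add the per-packet counts to the running total for this string.
my %all;
$all{$str} += $_ for @counts;

print "per packet: @counts\n";
print "total for '$str': $all{$str}\n";
```

With these values the per-packet counts are 2, 1 and 0, and the total is 3. Note that matches are non-overlapping, so "aa" would be counted once in "aaa".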

--ZZamboni

(Guildenstern) RE: Re: Packet Patterns redux - my script can't count
by Guildenstern (Deacon) on Aug 08, 2000 at 22:28 UTC
    Wow! Thanks!
    Added your code and edited mine to fit it, and I get numbers that are closer to what I'm expecting. I'd love to say they're perfect, but now it appears that each value is in my list with its count value doubled. (Of course, I have the occasional 3 or 5, but I think I know where those are coming from.)
    What's happening is that some packets are nearly identical to each other, so when packet X is tested, the map finds N matches with packet Y. But, when packet Y is tested, the same N matches are found with packet X. I'll bang my head on the desk for a minute or two and see if the answer rattles loose.
    Again, thanks for the help. Not only did it correct the behavior, but it's ever so much more elegant. Too bad I've already used my votes today.
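
One possible way to avoid the double counting described above is to count each distinct substring only the first time it is generated, and skip it whenever it comes up again from another position or packet. This is only a sketch under that assumption; the sample packets are made up, and the early "next" is an addition that is not in the code from the post:

```perl
use strict;
use warnings;

# Hypothetical sample: two identical packets, the worst case for
# counting the same matches once per packet.
my @packets = ("abcd", "abcd");
my %all;

foreach my $p (@packets) {
    foreach my $l (2 .. length($p)) {
        foreach my $pos (0 .. length($p) - $l) {
            my $str = substr($p, $pos, $l);

            # Already counted across all packets on an earlier pass.
            next if exists $all{$str};

            my $total = 0;
            foreach my $packet (@packets) {
                # List-context match assigned to () yields the count.
                my $n = () = $packet =~ /\Q$str\E/g;
                $total += $n;
            }
            $all{$str} = $total;
        }
    }
}

print "$_ => $all{$_}\n" for sort keys %all;
```

Here every substring ends up with a count of 2 (one occurrence in each packet) rather than 4.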