aschwa has asked for the wisdom of the Perl Monks concerning the following question:

I'm new to coding and Perl and am stuck with a problem. I've looked in other threads and googled the problem but I haven't found a solution yet. Part of my concern is that I'd like to understand why it's not working rather than just getting a quick fix to the problem. I want this code to count the number of times that each string in an array appears in that array. So, as an example, if:

@array = qw('dog', 'cat', 'sheep', 'dog', 'dog', 'cat');

It would return

>dog, 3
>cat, 2
>sheep, 1

The code that I'm using is below. Currently, for any input, it only gives a count of "1" as the number of occurrences in the array. I know this to be false as I've created test arrays with multiple identical values. What am I doing wrong, Monks?

use strict;
use warnings;

my $i;
my $j;
my @data;

for my $file (@ARGV) {
    open (RAW, "./$file") || die "Cannot open specified file to be processed\n";
    while(<RAW>) {
        @data = join('', $_ =~ /^(\d\d\d\d)-(\d\d)-(\d\d) (\d+):(\d+):(\d+)/);
        for $i (@data) {
            my $cnt = 0;
            for $j (@data){
                if ($i eq $j) {
                    $cnt = $cnt + 1;
                    print "$i, $j, $cnt\n";
                }
            }
        }
    }
}

Replies are listed 'Best First'.
Re: How do I count the number of occurrences of a string in an array?
by pme (Monsignor) on Aug 13, 2015 at 17:09 UTC
    Hi aschwa,

    Welcome to the monastery!

    You can simply convert the array into a hash.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my @array = qw(dog cat sheep dog dog cat);
    my %hash;
    $hash{$_}++ for @array;

    # dump the hash to stdout
    print Dumper( \%hash ) . "\n";

    # print each key-value pair
    for (sort keys %hash) {
        print "$_ -> $hash{$_}\n";
    }

Re: How do I count the number of occurrences of each string in an array?
by karlgoethebier (Abbot) on Aug 13, 2015 at 19:29 UTC
    "... rather than just getting a quick fix to the problem"

    Yes, sure. But I guess that what BrowserUK answered in 2002 might be of interest ;-) I modified the code a bit:

    #!/usr/bin/env perl
    use strict;
    use warnings;

    my @array = qw(dog cat sheep dog dog cat);
    my ( $temp, $count ) = ( "@array", 0 );
    ( $count = $temp =~ s/($_)//g ) and printf "%2d:%s\n", $count, $_ for @array;

    __END__

    karls-mac-mini:monks karl$ ./1138451.pl
     3:dog
     2:cat
     1:sheep

    Please see Re: Count duplicates in array. for the original code.

    Best regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

Re: How do I count the number of occurrences of each string in an array?
by AnomalousMonk (Archbishop) on Aug 13, 2015 at 22:04 UTC
    ... I'd like to understand why it's not working rather than just getting a quick fix to the problem.

    It doesn't work because there is only (and can only ever be) a single element in the @data array. This is because join returns a single string from all the sub-strings it joins together. (Update: If there were no sub-strings to join together, e.g., if the match had failed, join would still return a single (empty) string.)

    c:\@Work\Perl\monks>perl -wMstrict -le
    "use Data::Dump qw(dd);
     ;;
     $_ = '2015-08-13 17:53:45 blah yada';
     my @data = join('', $_ =~ /^(\d\d\d\d)-(\d\d)-(\d\d) (\d+):(\d+):(\d+)/);
     ;;
     print 'number of elements in array: ', scalar @data;
     ;;
     print qq{element at index $_: '$data[$_]'} for 0 .. $#data;
     ;;
     dd \@data;
     "
    number of elements in array: 1
    element at index 0: '20150813175345'
    [20150813175345]

    So in a sense your code does work: it loops through the single element in the array and determines that that element is, indeed, exactly equal to itself: one duplicate.

    Update: So the take-away lesson is know your data. What was in the array to begin with? Tools like Data::Dumper and Data::Dump are life-savers when it comes to debugging questions like this for yourself. For a bright, confident outlook on life, dump early and dump often.
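
    As a rough illustration of "dump early", here is a minimal sketch that uses only the core Data::Dumper module. The sample line and the @captures variable are assumptions made up for this example; they are not part of the original script.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical sample line standing in for one line of the input file
my $line = '2015-08-13 17:53:45 blah yada';

# A match in list context returns the captured groups as a list
my @captures = $line =~ /^(\d\d\d\d)-(\d\d)-(\d\d) (\d+):(\d+):(\d+)/;

# join collapses that list into one string, so @data gets one element
my @data = join '', @captures;

print Dumper( \@captures );    # six separate elements
print Dumper( \@data );        # a single joined element
```

    Dumping both arrays side by side makes it visible that the match produced six pieces while the array being looped over holds only one.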


    Give a man a fish:  <%-(-(-(-<

Re: How do I count the number of occurrences of each string in an array?
by ExReg (Priest) on Aug 13, 2015 at 22:27 UTC

    pme and karlgoethebier have both given great answers on how to solve this problem.

    pme uses a hash to get the count. As you go through each element in @array, a hash key/value pair is either created or updated. If the key does not exist yet (for example, the first time dog is encountered in @array), a key is created in %hash with an undefined value, which ++ treats as 0 and increments to 1. The same applies to the first time for cat and sheep. The second and third times an item is encountered, the key already exists, so it is not created again; only its value is incremented.
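
    The create-or-update behaviour can be watched step by step with a small sketch like the one below. It is the same idea as pme's one-liner, only spelled out; the names %count and $status are assumptions chosen for this illustration.

```perl
use strict;
use warnings;

my @array = qw(dog cat sheep dog dog cat);
my %count;

for my $word (@array) {
    # exists() tells us whether this key was created on an earlier pass
    my $status = exists $count{$word} ? 'existing key' : 'new key';

    # For a new key the value starts out undefined; ++ treats it as 0
    $count{$word}++;

    print "$word: $status, count is now $count{$word}\n";
}

# Final totals, sorted by key so the order is stable
print "$_ -> $count{$_}\n" for sort keys %count;
```

    Each word is reported as a new key exactly once; every later occurrence only bumps the stored value.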

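    karlgoethebier goes through the array and uses the g modifier in the regex to substitute all the occurrences of each element with nothing while at the same time capturing the number of times the substitution was made. The s substitution operator returns the number of substitutions made (with g, that covers every occurrence), or a false value if there were none. Because the first pass for dog removes every dog from the string, later passes for dog find nothing, the and short-circuits, and nothing is printed for them.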
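
    One caveat with the substitution approach is that a bare pattern also matches inside longer words (cat inside category, say) and treats regex metacharacters in the data as pattern syntax. The sketch below is a hedged variation, not karlgoethebier's code: it assumes the elements are plain words, and the @report array is a name invented here to collect the results.

```perl
use strict;
use warnings;

my @array = qw(dog cat sheep dog dog cat);
my $temp  = "@array";    # "dog cat sheep dog dog cat"

my @report;
for my $word (@array) {
    # \b ... \b restricts the match to whole words;
    # \Q ... \E escapes any regex metacharacters in $word
    my $count = ( $temp =~ s/\b\Q$word\E\b//g );

    # After a word has been removed, later passes for the same word
    # substitute nothing, $count is false, and the word is skipped
    push @report, sprintf( "%2d:%s", $count, $word ) if $count;
}

print "$_\n" for @report;
```

    For anything beyond a quick one-off, the hash approach is probably still the simpler choice, since it does not depend on how the elements look as text.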

    Looking at your code, I do get values other than 1. I get

    dog, dog, 1
    dog, dog, 2
    dog, dog, 3
    cat, cat, 1
    cat, cat, 2
    sheep, sheep, 1
    dog, dog, 1
    dog, dog, 2
    dog, dog, 3
    dog, dog, 1
    dog, dog, 2
    dog, dog, 3
    cat, cat, 1
    cat, cat, 2

    It appears that what you are really processing is a file with lines that start with date/time stamps, but that is really immaterial. What you are doing is going through the data loop once for the first item ($i = 'dog'), then again through the whole loop when $i = 'cat', and so on, up to the last cat.

    The first time through, $i and $j will both match on the first dog, so $cnt is 1, then again on the second dog, so $cnt is 2, then again on the third dog, so $cnt is three. That is all on the first loop with $i = 'dog'.

    On the next loop, $i becomes 'cat' and $cnt is set back to zero. On the first cat (the second item), $i and $j are equal, so $cnt is bumped up to one. On the second cat (the last item), it is bumped up to 2.

    $i takes on sheep, dog, dog, and cat for the following outer loops, which produces the rest of the printout above. You could determine the number of each animal by looking at your printout, but you probably didn't want it in the form you have. You have it set to print each time your $i and $j are equal, so it shows the count in progress each time it goes through the $j loop. What you really wanted was the grand total at the end. For that, you would be better served by pme's or karlgoethebier's solutions.
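
    If you wanted to keep the nested-loop shape anyway, a sketch along these lines would print only the grand totals. It is an illustration rather than a recommendation: %seen and @totals are helper names introduced here, and the hash solution above does the same job in one pass.

```perl
use strict;
use warnings;

my @data = qw(dog cat sheep dog dog cat);
my %seen;      # hypothetical helper: values that were already reported
my @totals;    # hypothetical helper: collected "value, count" lines

for my $i (@data) {
    next if $seen{$i}++;    # skip values handled in an earlier outer pass

    my $cnt = 0;
    for my $j (@data) {
        $cnt++ if $i eq $j;
    }

    # Report after the inner loop has finished, so only the total appears
    push @totals, "$i, $cnt";
    print "$i, $cnt\n";
}
```

    The two changes are moving the print out of the inner loop and skipping values that have already been counted.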

    NOTE: I changed qw('dog', 'cat', 'sheep', 'dog', 'dog', 'cat') to qw( dog cat sheep dog dog cat) since qw is rather adept at making a list out of just words. Adding the single quotes and commas will just confuse things and put extra quotes and commas in the list and make the first cat not match the second.
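
    To see what qw actually produces in each case, a small sketch such as the following can help. The variable names @plain and @quoted are assumptions for this example, and the qw warning category is switched off locally only so that the demonstration runs quietly.

```perl
use strict;
use warnings;

my @plain = qw(dog cat sheep dog dog cat);

my @quoted;
{
    # Without this, perl warns:
    # "Possible attempt to separate words with commas"
    no warnings 'qw';
    @quoted = qw('dog', 'cat', 'sheep', 'dog', 'dog', 'cat');
}

print "plain:  [$_]\n" for @plain;
print "quoted: [$_]\n" for @quoted;

# The first 'cat', carries a trailing comma; the last 'cat' does not,
# so the two strings are not equal
print "first and last cat differ\n" if $quoted[1] ne $quoted[5];
```

    The brackets in the output show that the quotes and commas become part of each element rather than acting as syntax.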

Re: How do I count the number of occurrences of each string in an array?
by CountZero (Bishop) on Aug 14, 2015 at 06:40 UTC
    @data = join('', $_ =~ /^(\d\d\d\d)-(\d\d)-(\d\d) (\d+):(\d+):(\d+)/); is your problem.

    You are not adding a new element to the array, you are actually resetting the array each time and then adding one element. So your loop will always find that one element only.

    Adding another element to an array is done with the push function.

    push @data, join('', $_ =~ /^(\d\d\d\d)-(\d\d)-(\d\d) (\d+):(\d+):(\d+)/);

    Try it and see if it works "better". Of course using a hash is even better.
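
    Putting push and the hash together could look roughly like the sketch below. The @lines list is hypothetical sample data standing in for the file contents, and skipping lines without a timestamp is an assumption about what is wanted, not something stated in the original post.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the contents of the input file
my @lines = (
    '2015-08-13 17:53:45 blah',
    '2015-08-13 17:53:45 yada',
    '2015-08-13 17:53:46 blah',
    'not a timestamp line',
);

my ( @data, %count );
for my $line (@lines) {
    # In list context the match returns its captures, or an empty
    # list when the line does not match
    my @parts = $line =~ /^(\d\d\d\d)-(\d\d)-(\d\d) (\d+):(\d+):(\d+)/
        or next;    # assumption: lines without a timestamp are skipped

    my $stamp = join '', @parts;
    push @data, $stamp;    # append to the array instead of overwriting it
    $count{$stamp}++;      # the hash keeps a running total per timestamp
}

print "$_, $count{$_}\n" for sort keys %count;
```

    With the hash in place the @data array is no longer needed for counting; it is kept here only to show the effect of push.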

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics