cklatsky has asked for the wisdom of the Perl Monks concerning the following question:

I have a file that contains this data, but I do not know its contents in advance:

    one
    two
    three
    one
    five
    two
    ten
    one

I want to count the number of times each of those strings appears in the file. I have a script with a nested loop: the outer loop reads each line of the file, and the inner loop checks the same file line by line for a match and prints the count. As you can see from the output, it checks for the string "one" three times and (correctly) finds three matches each time. But I do not need it to count again when it encounters a string it has already counted. Hopefully that explanation makes sense.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use IO::File;

    open (FILE, 'numbers.txt');
    my @fileone = <FILE>;
    close (FILE);

    open (FILE, 'numbers.txt');
    my @filetwo = <FILE>;
    close (FILE);

    my $count1="0";
    my $line;
    my $pattern;

    foreach $pattern (@fileone) {
        chomp $pattern;
        foreach $line (@filetwo) {
            chomp $line;
            if ($line =~ /$pattern/) {
                $count1++;
                #my @elements = split ('~',$line);
                #my $cmts = $elements[1];
            }
        }
        print "$pattern appeared $count1\n";
        $count1="0";
    }

    $ ./test_v2.pl
    one appeared 3
    two appeared 2
    three appeared 1
    one appeared 3
    five appeared 1
    two appeared 2
    ten appeared 1
    one appeared 3

Re: Count of patterns in a file without knowing pattern in advance
by choroba (Cardinal) on Jun 07, 2019 at 15:40 UTC
    This is a FAQ.
        my %seen;
        while (<$FILE>) {
            chomp;
            ++$seen{$_};
        }
        for my $string (keys %seen) {
            print "$string encountered $seen{$string} time(s).\n";
        }
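The snippet above assumes $FILE is already open. A minimal self-contained sketch of the same idea follows. The sub name count_lines and the in-memory filehandle standing in for numbers.txt are my own assumptions for illustration, not part of the original reply, and the output is sorted only to make it repeatable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count how often each line occurs on an open filehandle.
sub count_lines {
    my ($fh) = @_;
    my %seen;
    while (my $line = <$fh>) {
        chomp $line;
        ++$seen{$line};
    }
    return \%seen;
}

# Assumption: an in-memory filehandle stands in for numbers.txt here.
my $data = join '', map { "$_\n" } qw(one two three one five two ten one);
open my $FILE, '<', \$data or die "Cannot open in-memory data: $!";
my $seen = count_lines($FILE);
close $FILE;

# keys() returns the strings in no particular order, so sort for stable output.
for my $string (sort keys %$seen) {
    print "$string encountered $seen->{$string} time(s).\n";
}
```

To read the real file instead, replace \$data with 'numbers.txt' in the open call.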

Re: Count of patterns in a file without knowing pattern in advance
by hippo (Archbishop) on Jun 07, 2019 at 15:44 UTC

    Use a hash, that way you only need to take one pass over the file:

        #!/usr/bin/env perl
        use strict;
        use warnings;

        my %terms;
        for (<DATA>) {
            chomp;
            $terms{$_}++;
        }
        for my $key (keys %terms) {
            print "$key appeared $terms{$key}\n";
        }
        __DATA__
        one
        two
        three
        one
        five
        two
        ten
        one
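One thing to be aware of: keys %terms returns the terms in no particular order. If the report should follow the order in which each term first appears in the file, a second array can record that order. This is only a sketch; the sub name count_in_order and the @order array are hypothetical names of mine, not from the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: count each term and remember first-appearance order.
sub count_in_order {
    my (@lines) = @_;
    my (%terms, @order);
    for my $line (@lines) {
        chomp $line;
        # The post-increment is false only the first time a term is seen.
        push @order, $line unless $terms{$line}++;
    }
    # Return [term, count] pairs in first-seen order.
    return map { [ $_, $terms{$_} ] } @order;
}

my @report = count_in_order(map { "$_\n" } qw(one two three one five two ten one));
print "$_->[0] appeared $_->[1]\n" for @report;
```

It is still a single pass over the data, at the cost of one extra array holding each distinct term once.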

    Update: Well it's very pleasing to find that no less a luminary than Brother choroba has arrived at the exact same answer, albeit of course much faster!

      Thanks to both for the solution. I have that working now.
        There is usually a "one liner" alternative for simple requirements.

        I offer:

        perl -anE "$h{$F[0]}++}{say qq|$_ appeared $h{$_}| for sort keys %h" *YOUR-FILE*
        Use single quotes instead of double quotes around the code if you are on Linux; otherwise the shell will try to expand $h and $F itself.
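For anyone puzzled by the one-liner, here is roughly what it does when written out as a script. -n wraps the code in a loop over the input lines, -a splits each line on whitespace into @F, -E enables say, and the "}{" closes that loop so the rest runs once at the end. This is a hand-written approximation, not the exact code perl generates. The sub name count_first_fields and the in-memory sample data are my own assumptions.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper approximating the loop that -n and -a set up.
sub count_first_fields {
    my ($fh) = @_;
    my %h;
    while (my $line = <$fh>) {       # -n: loop over every input line
        my @F = split ' ', $line;    # -a: autosplit on whitespace into @F
        # Assumption: skip blank lines; the one-liner would instead
        # count them under an empty key.
        next unless @F;
        $h{ $F[0] }++;               # only the first field is counted
    }
    return \%h;
}

# Assumption: in-memory sample data stands in for the real file.
my $data = "one\ntwo\nthree\none\nfive\ntwo\nten\none\n";
open my $fh, '<', \$data or die "Cannot open in-memory data: $!";
my $h = count_first_fields($fh);
close $fh;

# The part after "}{" in the one-liner: runs once, after the loop.
say "$_ appeared $h->{$_}" for sort keys %$h;
```

Note that only $F[0] is counted, so this works for one string per line; anything after the first whitespace-separated word on a line is ignored.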
