Re: Count number of occurrences of a list of words in a file

In the three solutions that were posted, everyone uses 'exists $hash{key}' followed by a '$hash{key}=...', and to me these look like two look-ups in the hash. Is this efficient? Can it be made more efficient?

Athanasius

++$count{$word} if exists $count{$word};

Tux

$cnt{$_}++ for grep { exists $cnt{$_} } m/(\w+)/g;

AnomalousMonk

exists $count{$_} and ++$count{$_} for $line =~ m{ $rx_word }xmsg;


Re^2: Count number of occurrences of a list of words in a file by Cristoforo (Curate) on May 09, 2018 at 22:28 UTC
Athanasius identified the code slowdown here. Hash lookups are the fastest approach here, and in general. The two lookups are necessary because you have to verify that the word being checked exists in the counting hash. Without this check, a new word (one that is not being searched for) would be erroneously added to the hash and counted.
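To make the point about the check concrete, here is a minimal sketch, not taken from any of the posted solutions. The word list and the input line are made up for illustration, standing in for the contents of the list file and the text file. It runs the same increment twice, once without the `exists` guard and once with it:

```perl
use strict;
use warnings;

# Hypothetical data: the words to count and one line of input text.
my @wanted = qw(apple pear);
my $line   = 'apple banana apple cherry';

# Unguarded: every word seen gets a key, whether it was wanted or not.
my %unguarded = map { $_ => 0 } @wanted;
++$unguarded{$_} for $line =~ m/(\w+)/g;

# Guarded: only keys that already exist in the hash are incremented.
my %guarded = map { $_ => 0 } @wanted;
exists $guarded{$_} and ++$guarded{$_} for $line =~ m/(\w+)/g;

print join(' ', map { "$_=$unguarded{$_}" } sort keys %unguarded), "\n";
print join(' ', map { "$_=$guarded{$_}" } sort keys %guarded), "\n";
```

With this input the unguarded hash ends up with four keys (apple, banana, cherry, pear), while the guarded hash keeps only the two that were asked for.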
Re^2: Count number of occurrences of a list of words in a file by AnomalousMonk (Archbishop) on May 09, 2018 at 22:27 UTC
My assumption was that there might be many things in Azaghal's input `textfile.txt` that look like "words", and he or she only wanted to count the words specified in the `list.txt` file. If that's the case, one must check that a "word" exists before incrementing it, or else one will autovivify a "word" that was not previously present. Hence, two hash accesses are necessary. Give a man a fish: `<%-{-{-{-<`