in reply to Calculating Total Different Array Terms On All Lines of Datafile

For your example (as there are only families and members) it should be the sum of the number of keys in the %family hash and the number of keys in the %people hash.
my $total_terms = $total_families + scalar (keys %people);

scalar (keys %hash) gives the number of distinct keys in a hash, since a hash cannot hold duplicate keys (unless you use a multi-keyed hash, e.g. with BerkeleyDB). And since every term in your data ends up as a key in one hash or the other, the sum covers all of them.
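To make that concrete, here is a minimal sketch of the counting. The contents of %family and the way %people is filled are made up for illustration (your script builds them from the datafile). Note that a term used both as a family name and as a member name would be counted twice by this sum.

```perl
use strict;
use warnings;

# Hypothetical sample data: family name => list of members.
my %family = (
    Smith => [qw(Anna Bob)],
    Jones => [qw(Carl Anna)],
);

# Use every member name as a hash key; repeated names collapse into one key.
my %people;
for my $members (values %family) {
    $people{$_}++ for @$members;
}

my $total_families = scalar(keys %family);
my $total_terms    = $total_families + scalar(keys %people);

print "families: $total_families, people: ", scalar(keys %people),
      ", total terms: $total_terms\n";
```

Anna appears in both families but is only one key in %people, so she is counted once.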

You could also introduce a hash only for the purpose of storing each word found in the datafile:

my %seen;
while (<DATA>) {
    $seen{$1}++ while /(\w+)/g;    # every word becomes a key
}
print scalar(keys %seen), "\n";
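If you want to try the single-hash approach on its own, a self-contained sketch could look like this. The sample lines are invented, and an in-memory filehandle stands in for your datafile; only the %seen loop is the actual technique.

```perl
use strict;
use warnings;

# Hypothetical sample input, read through an in-memory filehandle
# instead of the real datafile.
my $data = "Smith Anna Bob\nJones Carl Anna\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my %seen;
while (my $line = <$fh>) {
    # Each run of word characters becomes a key; repeats only bump the count.
    $seen{$1}++ while $line =~ /(\w+)/g;
}
close $fh;

my $distinct = scalar(keys %seen);
print "$distinct\n";
```

Because all words go into the same hash, a term that shows up in more than one role is still counted only once.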

--shmem

_($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                              /\_¯/(q    /
----------------------------  \__(m.====·.(_("always off the crowd"))."·
");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}