# declare our vars
my (%codes, @array_codes);
# undef the input record separator to read all the data at once
local $/;
# make an array of codes by splitting DATA on whitespace
@array_codes = split /\s+/, <DATA>;
# map the codes to a hash, counting duplicates
# using a for loop for efficiency
foreach my $code_key (@array_codes) {
    $codes{$code_key}++;
}
# print it out
print "$_\t$codes{$_}\n" for keys %codes;
__DATA__
baaba ba abab abab abab baaba baaba babaa.
abab aaba ba abab ba. bababab abab abab ba aaba.
ba bababab aaba abab babaa baaba ba baaba.
aaba ba bababab ba bababab abab ba aaba abab baaba abab.
ba abab abab ba.
Note that map { ... } @array, when used only for its side effects, is just another way of writing for (@array) { ... }. To run
this against a file, all you need to do is wrap it in a sub, something like:
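sub count_codes {
    my $file = shift;
    # three-arg open with a lexical filehandle
    open my $fh, '<', $file or die "Oops, can't open $file, perl says $!\n";
    # undef the input record separator (inside this sub only) to slurp the file
    local $/;
    my @array_codes = split /\s+/, <$fh>;
    close $fh;
    # count into a hash that belongs to this sub
    my %codes;
    foreach my $code_key (@array_codes) {
        $codes{$code_key}++;
    }
    # print, not printf: a % inside a code would be read as a format
    print "$_\t$codes{$_}\n" for keys %codes;
}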
# call sub
count_codes("/path/to/myfile.txt");
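On the map versus for remark above, here is a small sketch that fills one hash with each form so you can see they count the same way. The sample list and the hash names %by_for and %by_map are made up for the illustration.

```perl
use strict;
use warnings;

# made-up sample codes, for illustration only
my @array_codes = qw(ba abab ba aaba abab ba);

# for loop: runs the statement once per element
my %by_for;
$by_for{$_}++ for @array_codes;

# map used only for its side effect; the list it returns is ignored
my %by_map;
map { $by_map{$_}++ } @array_codes;

# print both counts side by side
print "$_\tfor=$by_for{$_}\tmap=$by_map{$_}\n" for sort keys %by_for;
```

Both hashes end up with the same counts. The for loop is the one to prefer when you don't want the list that map returns.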
You have some full stops in there, which I have assumed are part of the codes. If they are not, you
will need to filter them out using a regex in the for loop, like this:
foreach my $code_key (@array_codes) {
    # strip full stops from the code before counting it
    $code_key =~ s/[.]//g;
    $codes{$code_key}++;
}
If you want to filter out more characters, add them to the character class between the [ ].
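As a rough sketch of that, here is the same loop with a wider character class. The sample data and the extra characters (comma, semicolon, colon, ! and ?) are assumptions for illustration only; use whatever punctuation actually turns up in your data. It also skips anything left empty after stripping, so a lone full stop doesn't get counted as a code.

```perl
use strict;
use warnings;

# made-up sample data, for illustration only
my @array_codes = split /\s+/, 'ba, abab; ba. abab! aaba ba? .';

my %codes;
foreach my $code_key (@array_codes) {
    # strip every character listed in the class
    $code_key =~ s/[.,;:!?]//g;
    # skip tokens that were nothing but punctuation
    next unless length $code_key;
    $codes{$code_key}++;
}

print "$_\t$codes{$_}\n" for sort keys %codes;
```

Inside the [ ] these characters are all literal, so none of them needs a backslash here.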
cheers
tachyon
Update
Removed lazy and inefficient map and replaced with proper
for loop. Even typed foreach to remind me not to be so slack.
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print