Within your problem, there is also the issue of what constitutes a word. I'm going to ignore the fact that a word cannot contain two hyphens next to each other, or two apostrophes, etc. For one thing, once I start down that road, the next thing you know, I'll be looking for spelling errors, and that's well beyond what's actually needed. For the purposes of my example, I'll just strip out of each word anything that doesn't belong there (punctuation, digits, and so on), and assume that what's left is a word.
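To make that stripping rule concrete, here is a small sketch of it in isolation. The helper name clean_word and the sample tokens are made up for illustration; the character class keeps letters, apostrophes, and hyphens and throws everything else away.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the thread):
# delete every character that is not a letter, an apostrophe, or a hyphen.
sub clean_word {
    my ($token) = @_;
    ( my $word = $token ) =~ s/[^[:alpha:]'-]//g;
    return $word;
}

# Made-up sample tokens.
for my $token ( 'today?', "Here's", 'test3', 'well-known,', '?!' ) {
    my $word = clean_word($token);
    printf "%-12s => %s\n", $token,
        length($word) ? "'$word'" : '(nothing left)';
}
```

Note that a token made up entirely of punctuation ends up as an empty string, so any counting code should check for that after stripping rather than before.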
I decided to interpret your question as saying that you have a set of comma-delimited strings, that each substring might contain multiple words, and that you want a total word-count. I realize that you might want phrase-counts instead of word-counts, but this is my spoiler, so I'll pick word-counts because doing so adds an extra level of fun.
I took the additional liberty of lower-casing all words, so that "ApPleS", "apples", and "APPLES" (but not "oranges") all count as the same word.
In this example, I also made sure that lexical variables all fall out of scope as early as possible. That's the sole reason for the outermost { ... } block. It's really not necessary, but I was just fiddling and it came out this way.
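For anyone who hasn't used a bare block this way, here is a minimal sketch of the idea, with made-up sample data. The hash is declared outside the block because it is needed afterwards; the working variable is declared inside and disappears at the closing brace.

```perl
use strict;
use warnings;

my %count;    # declared outside the block: still needed for the report

{
    my @tokens = qw(apple Apple APPLE pear);    # made-up sample data
    $count{ lc $_ }++ for @tokens;              # lc folds the case variants together
}   # @tokens goes out of scope here

# Uncommenting the next line would be a compile-time error under
# "use strict", because @tokens no longer exists at this point:
# print scalar @tokens;

print "$_: $count{$_}\n" for sort keys %count;
```

This should report apple as 3 and pear as 1. The block buys nothing functionally in a script this small; it only keeps the temporary variable from lingering.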
If you're ready for the spoiler, read on. If you're not ready for it, don't:
use strict;
use warnings;
use Text::CSV;

my %wordlist;

{
    my $csv = Text::CSV->new();
    while ( my $line = <DATA> ) {
        $csv->parse( $line )
            or die "Improperly formatted CSV string: $line";
        foreach my $field ( $csv->fields() ) {
            foreach my $word ( split /\s+/, $field ) {
                # Strip everything except letters, apostrophes and hyphens...
                $word =~ s/[^[:alpha:]'-]//g;
                # ...then skip tokens that are empty once stripped.
                next unless length $word;
                $wordlist{lc $word}++;
            }
        }
    }
}

printf "%-16s: %d\n", $_, $wordlist{$_} for sort keys %wordlist;

__DATA__
hi, there, world, how, are you, today?
What are you up to?
Here's a word with an apostrophe.
test3
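If phrase-counts turn out to be what you wanted after all, the change is small. The following is only a sketch under my own assumptions: the input lines are made up, each comma-separated field is treated as one phrase, and the words inside a field are cleaned the same way as above and rejoined with single spaces so that case, spacing, and punctuation don't split the counts.

```perl
use strict;
use warnings;
use Text::CSV;

# Made-up sample input, held in an array instead of a DATA section.
my @lines = (
    'are you, today?, Are You',
    'up to, are you',
);

my %phrases;

{
    my $csv = Text::CSV->new();
    for my $line (@lines) {
        $csv->parse($line)
            or die "Improperly formatted CSV string: $line";
        for my $field ( $csv->fields() ) {
            my @words;
            for my $token ( split ' ', $field ) {
                # Lower-case, then keep letters, apostrophes and hyphens only.
                ( my $word = lc $token ) =~ s/[^[:alpha:]'-]//g;
                push @words, $word if length $word;
            }
            next unless @words;      # the field held nothing word-like
            $phrases{"@words"}++;    # "@words" joins with single spaces
        }
    }
}

printf "%-16s: %d\n", $_, $phrases{$_} for sort keys %phrases;
```

With this sample input, "are you" should be counted three times, and "today" and "up to" once each.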
Enjoy! Thanks for the fun question. Finally I found a reason to install Text::CSV.
Dave
In reply to Re: Comma separated list into a hash (SPOILER)
by davido
in thread Comma separated list into a hash
by Anonymous Monk