And use the titles (defined as the word on any line beginning with >, while any other words on that line, i.e. Word1, are ignored) as a wordlist to search another .txt file, and ultimately print the frequency at which each title occurs in this second .txt file. To clarify with an example, given a titles file in this format:

    >Title1 Word1
    >Title2 Word2
    >Title3 Word3

If the titles are:

    >Apple
    >Banana
    >Grape

And the file I'm searching within is:

    Apple Banana Avocado Orange Grape Apple Apple Banana Banana

The output desired would be:

    Apple occurs: 3
    Banana occurs: 3
    Grape occurs: 1

I've seen various snippets of code around which take wordlists and then search files for them, printing the frequency, such as:

    sub by_count {
        $count{$b} <=> $count{$a};
    }

    open(INPUT, "<Input.txt");
    open(OUTPUT, ">WordFreqs.txt");

    $bucket = 'red|blue|green';

    while (<INPUT>) {
        @words = split(/\s+/);
        foreach $word (@words) {
            if ($word =~ /($bucket)/io) {
                $count{$1}++;
            }
        }
    }

    foreach $word (sort by_count keys %count) {
        print OUTPUT "$word occurs $count{$word} times\n";
    }

    close INPUT;
    close OUTPUT;

But I am unsure how I can populate the wordlist with the titles or contents of another file rather than specifying them within the code. If anybody could provide some useful suggestions or snippets of code which I could work on modifying, that would be great. Thank you!

- TJC
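One possible way to fill the wordlist from a file instead of hard-coding it is sketched below. This is only a minimal sketch under some assumptions: the sub names load_titles and count_titles are made up for illustration, the two file names are taken from the command line, a "word" is any whitespace-separated token, and matching is exact and case-sensitive (so "Apple," with trailing punctuation or "apple" would not be counted). A hash keyed on the titles replaces the 'red|blue|green' alternation, which also avoids partial matches such as "Grape" inside "Grapefruit".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read the titles file and return the word following '>' on each title line.
# Hypothetical helper name; duplicate titles are kept only once.
sub load_titles {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my (@titles, %seen);
    while (my $line = <$fh>) {
        # ">Title1 Word1" captures "Title1"; the rest of the line is ignored.
        next unless $line =~ /^>\s*(\S+)/;
        push @titles, $1 unless $seen{$1}++;
    }
    close $fh;
    return @titles;
}

# Count exact, case-sensitive matches of each title among the
# whitespace-separated words of the second file.
sub count_titles {
    my ($path, @titles) = @_;
    my %count = map { $_ => 0 } @titles;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        for my $word (split ' ', $line) {
            $count{$word}++ if exists $count{$word};
        }
    }
    close $fh;
    return %count;
}

# Run only when both file names are supplied on the command line.
if (@ARGV == 2) {
    my ($titles_file, $search_file) = @ARGV;
    my @titles = load_titles($titles_file);
    my %count  = count_titles($search_file, @titles);
    # Highest count first, ties broken alphabetically.
    for my $title (sort { $count{$b} <=> $count{$a} || $a cmp $b } @titles) {
        print "$title occurs: $count{$title}\n";
    }
}
```

Saved as, say, titlefreq.pl (a made-up name), it could be run as "perl titlefreq.pl Titles.txt Input.txt > WordFreqs.txt". With the Apple/Banana/Grape example above, this should print "Apple occurs: 3", "Banana occurs: 3" and "Grape occurs: 1", one per line. Titles that never appear are reported with a count of 0, since every title is seeded into the hash before counting.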
In reply to Loading words from one file, searching another for the frequencies of these words and outputting the wordcounts to another file by TJCooper