
Good morning monks,
Hope the regex experts out there can help out on this one: I have a large file of international news stories and want to count how many stories there are from each country (doesn't have to be exact). The file is formatted like so:

Headline of story is here

Text of story is here 
Text of story is here 
Text of story is here 

Headline of story is here

Text of story is here 
Text of story is here 
Text of story is here

Eyeballing the file shows that most headlines do in fact have the country name in them, so it seems like OWTDI would be just to count the occurrences of country names in the headlines only, ignoring the text. How can I modify the regex in the following loop to do that?

while (<NEWS>) {
    foreach my $country (@countries) {
        $story_count{$country}++ if m/$country/gi;
    }
}

Thanks in advance, mooseboy
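
One possible way to restrict the count to headlines, offered as a sketch rather than a definitive answer: read the file in paragraph mode and treat every other paragraph as a headline. This assumes the layout is exactly as shown above (headline, blank line, story text, blank line) and that story text never contains a blank line of its own. The helper name count_headline_countries and the sample data are made up for illustration. The /g modifier is dropped here because a scalar-context //g match resumes from the previous match position in the same string, which can hide a country that appears earlier in the line than one already matched.

```perl
use strict;
use warnings;

# Hypothetical helper: count stories per country using only the headlines.
# Assumes the layout from the post: headline, blank line, story text,
# blank line, and that story text itself contains no blank lines.
sub count_headline_countries {
    my ($fh, @countries) = @_;
    my %story_count;

    local $/ = '';          # paragraph mode: each read returns one blank-line-delimited block
    my $is_headline = 1;    # the first paragraph in the file is a headline
    while (my $para = <$fh>) {
        if ($is_headline) {
            for my $country (@countries) {
                # \Q...\E quotes regex metacharacters in the name;
                # no /g, so every match attempt starts at the beginning
                $story_count{$country}++ if $para =~ /\b\Q$country\E\b/i;
            }
        }
        $is_headline = !$is_headline;    # headlines and story text alternate
    }
    return %story_count;
}

# Usage with an in-memory sample (invented data, same layout as the post)
my $sample = <<'END';
Floods hit northern France

Text of story is here
France is mentioned in the text too

Election results announced in Brazil

Text of story is here

France and Brazil sign trade deal

Text of story is here
END

open my $news, '<', \$sample or die "Cannot open sample: $!";
my %count = count_headline_countries($news, 'France', 'Brazil', 'Peru');
close $news;

printf "%s: %d\n", $_, $count{$_} for sort keys %count;
```

If some stories do contain blank lines, the alternation will drift out of step; in that case a different test for "this is a headline" (for example, a paragraph that is a single line) may be a better fit.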

In reply to Counting words in headlines by mooseboy
