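If tag1 and tag3 are always equal, this is more complicated than it needs to be.
Is this line:
conferences NN conference
perhaps wrong? Its first column differs from its third, so the code below counts it under 'conference'.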
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper qw(Dumper);
my %hash;
while (my $line = <DATA>)
{
next if $line =~ /^\s*$/; # skip blank lines
my ($tag1, $tag2, $tag3) = split(/\s+/, $line); # three whitespace-separated columns
next unless $tag2 eq 'NN'; # only count rows tagged NN
$hash{$tag3}++; # tally by the third column
}
print Dumper \%hash;
=prints
$VAR1 = {
'well' => 1,
'conference' => 3,
'International' => 1,
'preparation' => 2
};
=cut
__DATA__
The DT the
International NN International
for IN for
well NN well
preparation NN preparation
preparation NN preparation
in IN in
conference NN conference
conference NN conference
conferences NN conference
good VVG good
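To see whether that line is a one-off, it may help to list every row whose first and third columns differ. What follows is only a sketch: mismatched_rows is a name I made up, the sample rows are copied from the data above, and it assumes each row has three whitespace-separated columns.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return [col1, col2, col3] for every row whose first and third columns differ.
# mismatched_rows is a hypothetical helper, not part of the script above.
sub mismatched_rows {
    my @lines = @_;
    my @found;
    for my $line (@lines) {
        next if $line =~ /^\s*$/;                  # skip blank lines
        my ($tag1, $tag2, $tag3) = split ' ', $line;
        next unless defined $tag3;                 # skip rows with fewer than three columns
        push @found, [$tag1, $tag2, $tag3] if $tag1 ne $tag3;
    }
    return @found;
}

# Sample rows copied from the data above.
my @sample = (
    'conference NN conference',
    'conferences NN conference',
    'good VVG good',
);

for my $row (mismatched_rows(@sample)) {
    my ($tag1, $tag2, $tag3) = @$row;
    print "column 1 '$tag1' differs from column 3 '$tag3' (tag $tag2)\n";
}
```

With these three sample rows, only the 'conferences' row is reported. If the full data shows more such rows, the difference is probably intentional and the count should stay keyed on the third column.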