parsing a sentence?

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,
I am taking a sentence from a tagged text and splitting it into word/tag tokens (e.g. It/PRP). Then I take each sequence of two tokens, separating the words from the tags, so that I can compare the tag pair against some patterns.
My problem is that, the way I coded it (clueless), the last word of the sentence is repeated because there is no following word to combine it with, e.g.

This is my input: It/PRP really/RB does/VBZ seem/VB to/TO violate/VB a/DT lot/NN of/IN boundaries/NNS
This is my wrong output:

PRP/RB
It really
RB/VBZ
really does
...
IN/NNS
of boundaries
NNS/NNS
boundaries boundaries

1. Please, can somebody help me solve this?

2. Can somebody tell me a better way to evaluate pairs of consecutive words, and also non-consecutive pairs of words, given the number N of words between the X and Y words of the pair?

Thanks

my @wordarray = split /\s+/, $sentence; 
 chomp foreach(@wordarray);
 @wordarray = grep {length} @wordarray; 
        
 for(my $wordindex=0; $wordindex <= $#wordarray; $wordindex ++){
        
    my $word1 = $wordarray[$wordindex]; 
    while ($word1 =~ /(.*?)\/([A-Z]+)/g){
        $valueword1 = $1;
        $keyword1 = $2;
    }
    my $word2 = $wordarray[$wordindex+1]; 
    while ($word2 =~ /(.*?)\/([A-Z]+)/g){
        $valueword2 = $1;
        $keyword2 = $2;
     }

    $patternKey = join("/",$keyword1, $keyword2);
    $patternValue = join(" ",$valueword1, $valueword2);

    print "key $patternKey \n";
        print "value $patternValue \n";
        my $matchingKey = grep($patternKey, @PatternArray);
        ...

Replies are listed 'Best First'.
Re: parsing a sentence? by BrowserUk (Patriarch) on Mar 01, 2004 at 23:40 UTC
This might work for you

    #! perl -slw
    use strict;

    sub mapPairsN (&@) {
        my( $code, $gap ) = (shift, shift);
        map{
            local( $a, $b ) = @_[ $_, $_ + $gap ];
            $code->()
        } 0.. $#_ - $gap;
    };

    my @words = qw[
        The quick brown fox jumps over the lazy dog
        and then fell head long onto the pond
    ];

    print "\nPairs with gap 1\n", mapPairsN{ "$a/$b\n" } 1, @words;
    print "\nPairs with gap 3\n", mapPairsN{ "$a/$b\n" } 3, @words;
    print "\nPairs with gap 5\n", mapPairsN{ "$a/$b\n" } 5, @words;

    __END__
    P:\test>333094

    Pairs with gap 1
    The/quick
    quick/brown
    brown/fox
    fox/jumps
    jumps/over
    over/the
    the/lazy
    lazy/dog
    dog/and
    and/then
    then/fell
    fell/head
    head/long
    long/onto
    onto/the
    the/pond

    Pairs with gap 3
    The/fox
    quick/jumps
    brown/over
    fox/the
    jumps/lazy
    over/dog
    the/and
    lazy/then
    dog/fell
    and/head
    then/long
    fell/onto
    head/the
    long/pond

    Pairs with gap 5
    The/over
    quick/the
    brown/lazy
    fox/dog
    jumps/and
    over/then
    the/fell
    lazy/head
    dog/long
    and/onto
    then/the
    fell/pond

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
Re: Re: parsing a sentence? by Anonymous Monk on Mar 02, 2004 at 00:13 UTC
Thanks, Monk, for answering my second question! Now I only need to combine both approaches, using tagged words and word gaps. Back to work ;) Thanks
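One way the two approaches might be combined is sketched below. It is only an illustration, not code from either poster: the helper name `tagged_pairs_with_gap` is hypothetical, and the token regex is borrowed from the question. Each token is parsed once into a word and a tag, then every token is paired with the one `$gap` positions later.

```perl
use strict;
use warnings;

# Hypothetical helper: pair every token with the one $gap positions later.
# Returns a list of [ "TAG1/TAG2", "word1 word2" ] array refs.
sub tagged_pairs_with_gap {
    my ($gap, @tokens) = @_;

    # Parse each "word/TAG" token once. A failed match yields an empty
    # array ref, which the grep drops. Note that dropping tokens shortens
    # the distance between their neighbours.
    my @parsed = grep { @$_ == 2 } map { [ m{^(.*?)/([A-Z]+)} ] } @tokens;

    # Stop $gap short of the end so both indexes always exist.
    return map {
        my ($left, $right) = @parsed[ $_, $_ + $gap ];
        [ "$left->[1]/$right->[1]", "$left->[0] $right->[0]" ]
    } 0 .. $#parsed - $gap;
}

my $sentence = "It/PRP really/RB does/VBZ seem/VB to/TO violate/VB";
for my $pair (tagged_pairs_with_gap(2, split /\s+/, $sentence)) {
    print "key $pair->[0]\n";
    print "value $pair->[1]\n";
}
```

A gap of 1 gives consecutive pairs; a gap of N + 1 gives pairs with N words in between.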
Re: parsing a sentence? by Roger (Parson) on Mar 01, 2004 at 23:24 UTC
    my $str = "It/PRP really/RB does/VBZ seem/VB to/TO violate/VB a/DT lot/NN of/IN boundaries/NNS";
    my @pairs = map { [ split /\// ] } split /\s+/, $str;

    for (0..$#pairs-1) {
        print $pairs[$_][1], "/", $pairs[$_+1][1], "\n";
        print $pairs[$_][0], " ", $pairs[$_+1][0], "\n";
    }
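For context, the repeated last word in the question's loop seems to come from the loop bound. With `$wordindex <= $#wordarray`, the final iteration reads `$wordarray[$wordindex+1]`, which is undefined, so the second regex never matches and `$valueword2` and `$keyword2` keep their values from the previous iteration. The sketch below keeps the question's regex and only changes the bound; `split_token` and `adjacent_pairs` are hypothetical names introduced for illustration.

```perl
use strict;
use warnings;

# Split one "word/TAG" token into (word, tag); empty list if it does not fit.
sub split_token {
    my ($token) = @_;
    return $token =~ m{^(.*?)/([A-Z]+)} ? ($1, $2) : ();
}

# Build [ "TAG1/TAG2", "word1 word2" ] entries for adjacent tokens.
sub adjacent_pairs {
    my ($sentence) = @_;
    my @tokens = grep { length } split /\s+/, $sentence;
    my @out;

    # Stop one short of the last index so $tokens[$i + 1] always exists.
    for my $i (0 .. $#tokens - 1) {
        # Skip the pair if either token is not in word/TAG form.
        my ($w1, $t1) = split_token($tokens[$i])     or next;
        my ($w2, $t2) = split_token($tokens[$i + 1]) or next;
        push @out, [ "$t1/$t2", "$w1 $w2" ];
    }
    return @out;
}

my $sentence = "It/PRP really/RB does/VBZ seem/VB to/TO violate/VB a/DT lot/NN of/IN boundaries/NNS";
for my $pair (adjacent_pairs($sentence)) {
    print "key $pair->[0]\n";
    print "value $pair->[1]\n";
}
```

Declaring the word and tag variables with `my` inside the loop also means a failed match can no longer silently reuse stale values.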
Re: Re: parsing a sentence? by Anonymous Monk on Mar 01, 2004 at 23:39 UTC
Thank you, Roger. I tried it and it works: it no longer repeats the last word pair. But I need to keep the format I have. I cannot take just anything that is before or after the /; it has to be words or numbers, and right now it is also picking up punctuation marks :(. Let me adapt it.
Re: Re: Re: parsing a sentence? by Roger (Parson) on Mar 01, 2004 at 23:48 UTC
You mean something like this?

    my $str = "It/PRP really/RB does/VBZ seem/VB to/TO violate/VB a/DT lot/NN of/IN boundaries!/NNS";
    my @pairs = map { [ m!(\w+).*?/(\w+)! ] } split /\s+/, $str;

    for (0..$#pairs-1) {
        print $pairs[$_][0], " ", $pairs[$_+1][0], "\n";
    }

To handle non-consecutive pairs of words:

    my $nth = 2;    # pair each word with the 2nd word after it
    for (0..$#pairs-$nth) {
        print $pairs[$_][0], " ", $pairs[$_+$nth][0], "\n";
    }
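A side note on the last line of the question's code: `grep($patternKey, @PatternArray)` evaluates the string `$patternKey` as a truth value for every element, so it likely returns the size of `@PatternArray` rather than the number of matching patterns. A minimal sketch of an exact-match test follows. The contents of `@PatternArray` are not shown in the thread, so the two patterns here are made up, and `matches_pattern` is a hypothetical name.

```perl
use strict;
use warnings;

# Made-up pattern list; the real @PatternArray is not shown in the thread.
my @PatternArray = ('RB/VBZ', 'DT/NN');

# Build a lookup hash once so each test is a single hash access.
my %is_pattern = map { $_ => 1 } @PatternArray;

# Return 1 if the tag pair is one of the known patterns, 0 otherwise.
sub matches_pattern {
    my ($key) = @_;
    return exists $is_pattern{$key} ? 1 : 0;
}

for my $key ('RB/VBZ', 'NNS/NNS') {
    print "$key ", (matches_pattern($key) ? "matches" : "does not match"), "\n";
}

# The same exact-match test written with grep and a block:
my $count = grep { $_ eq 'DT/NN' } @PatternArray;
print "DT/NN found $count time(s)\n";
```

The hash lookup suits a fixed list of exact tag pairs; if the patterns were regular expressions instead, `grep { $key =~ $_ }` over a list of `qr//` objects would be the closer fit.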
Re: Re: Re: Re: parsing a sentence? by Anonymous Monk on Mar 02, 2004 at 00:07 UTC
Re: Re: Re: Re: parsing a sentence? by Anonymous Monk on Mar 02, 2004 at 01:05 UTC