Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks
I am taking a sentence from a tagged text and splitting it into word/tag pairs (e.g. It/PRP). Then I take each sequence of two consecutive pairs, with the words and tags separated, to compare the tags against some pattern.
My problem is that, the way I coded it (clueless), the last word of the sentence is repeated because there is no following word to combine it with, e.g.

This is my input: It/PRP really/RB does/VBZ seem/VB to/TO violate/VB a/DT lot/NN of/IN boundaries/NNS
This is my wrong output:

PRP/RB
It really
RB/VBZ
really does
...
IN/NNS
of boundaries
NNS/NNS
boundaries boundaries

1. Please, can somebody help me solve this?

2. Can somebody tell me a better way to evaluate pairs of consecutive words, and also non-consecutive pairs of words X and Y, given the number N of words in between?

Thanks
my @wordarray = split /\s+/, $sentence;
chomp foreach(@wordarray);
@wordarray = grep {length} @wordarray;
for(my $wordindex=0; $wordindex <= $#wordarray; $wordindex ++){
    my $word1 = $wordarray[$wordindex];
    while ($word1 =~ /(.*?)\/([A-Z]+)/g){
        $valueword1 = $1;
        $keyword1 = $2;
    }
    my $word2 = $wordarray[$wordindex+1];
    while ($word2 =~ /(.*?)\/([A-Z]+)/g){
        $valueword2 = $1;
        $keyword2 = $2;
    }
    $patternKey = join("/",$keyword1, $keyword2);
    $patternValue = join(" ",$valueword1, $valueword2);
    print "key $patternKey \n";
    print "value $patternValue \n";
    my $matchingKey = grep($patternKey, @PatternArray);
    ...

Replies are listed 'Best First'.
Re: parsing a sentence?
by BrowserUk (Patriarch) on Mar 01, 2004 at 23:40 UTC

    This might work for you

    #! perl -slw
    use strict;

    sub mapPairsN (&@) {
        my( $code, $gap ) = (shift, shift);
        map{
            local( $a, $b ) = @_[ $_, $_ + $gap ];
            $code->()
        } 0.. $#_ - $gap;
    };

    my @words = qw[
        The quick brown fox jumps over the lazy dog
        and then fell head long onto the pond
    ];

    print "\nPairs with gap 1\n", mapPairsN{ "$a/$b\n" } 1, @words;
    print "\nPairs with gap 3\n", mapPairsN{ "$a/$b\n" } 3, @words;
    print "\nPairs with gap 5\n", mapPairsN{ "$a/$b\n" } 5, @words;

    __END__
    P:\test>333094

    Pairs with gap 1
    The/quick
    quick/brown
    brown/fox
    fox/jumps
    jumps/over
    over/the
    the/lazy
    lazy/dog
    dog/and
    and/then
    then/fell
    fell/head
    head/long
    long/onto
    onto/the
    the/pond

    Pairs with gap 3
    The/fox
    quick/jumps
    brown/over
    fox/the
    jumps/lazy
    over/dog
    the/and
    lazy/then
    dog/fell
    and/head
    then/long
    fell/onto
    head/the
    long/pond

    Pairs with gap 5
    The/over
    quick/the
    brown/lazy
    fox/dog
    jumps/and
    over/then
    the/fell
    lazy/head
    dog/long
    and/onto
    then/the
    fell/pond

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
      Thanks Monk for answering my second question!
      Now I only need to combine both approaches: tagged words and word gaps.
      Let's get back to work ;)
      Thanks
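
One way the two approaches might be combined is sketched below. It is only an illustration, not code from the thread: the helper name tagged_pairs and its hash keys (key, value) are hypothetical, and it assumes every token has the form word/TAG and that the gap is 1 or more. The loop stops $gap items before the end, so the last word is never paired with a leftover value.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): pair each word/TAG token
# with the token $gap positions later.
sub tagged_pairs {
    my ($sentence, $gap) = @_;

    # Split each token on its last slash into [word, tag]; tokens
    # without a slash are skipped (an assumption about the input).
    my @tokens;
    for my $token (split ' ', $sentence) {
        push @tokens, [ $1, $2 ] if $token =~ m{^(.+)/([^/]+)$};
    }

    # Stop $gap items before the end so every left item has a partner.
    my @pairs;
    for my $i (0 .. $#tokens - $gap) {
        my ($left, $right) = @tokens[ $i, $i + $gap ];
        push @pairs, {
            key   => "$left->[1]/$right->[1]",   # tag pair, e.g. PRP/RB
            value => "$left->[0] $right->[0]",   # word pair, e.g. It really
        };
    }
    return @pairs;
}

my $sentence = 'It/PRP really/RB does/VBZ seem/VB to/TO violate/VB '
             . 'a/DT lot/NN of/IN boundaries/NNS';

for my $pair (tagged_pairs($sentence, 1)) {
    print "$pair->{key}\n$pair->{value}\n";
}
```

Calling tagged_pairs($sentence, 3) instead would pair each word with the one three positions later, in the same way as the gap argument of mapPairsN above.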
Re: parsing a sentence?
by Roger (Parson) on Mar 01, 2004 at 23:24 UTC
    my $str = "It/PRP really/RB does/VBZ seem/VB to/TO violate/VB a/DT lot/NN of/IN boundaries/NNS";
    my @pairs = map { [ split /\// ] } split /\s+/, $str;
    for (0..$#pairs-1) {
        print $pairs[$_][1], "/", $pairs[$_+1][1], "\n";
        print $pairs[$_][0], " ", $pairs[$_+1][0], "\n";
    }

      Thank you Roger

      I tried it and it works: it no longer repeats the last word pair. But I need to keep the format I have. I cannot take just anything that appears before or after the /; it has to be words or numbers, and right now it is also picking up punctuation marks :(. Let me adapt it.
        You mean something like this?
        my $str = "It/PRP really/RB does/VBZ seem/VB to/TO violate/VB a/DT lot/NN of/IN boundaries!/NNS";
        my @pairs = map { [ m!(\w+).*?/(\w+)! ] } split /\s+/, $str;
        for (0..$#pairs-1) {
            print $pairs[$_][0], " ", $pairs[$_+1][0], "\n";
        }

        To handle non-consecutive pairs of words:
        my $nth = 2;  # pair each word with the one 2 positions later
        for (0..$#pairs-$nth) {
            print $pairs[$_][0], " ", $pairs[$_+$nth][0], "\n";
        }
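
A side note on the pattern test in the original question. In grep($patternKey, @PatternArray), the first argument is evaluated as a true/false expression for each element. A non-empty string is always true, so in scalar context the call returns the number of elements in @PatternArray rather than the number of matches. The sketch below shows two possible alternatives. The contents of @PatternArray and the helper names count_matches and build_lookup are assumptions made for illustration, since the thread never shows the real pattern list.

```perl
use strict;
use warnings;

# Hypothetical pattern list: the thread never shows what @PatternArray holds.
my @PatternArray = ('PRP/RB', 'DT/NN', 'IN/NNS');

# Count exact matches. Comparing against $_ inside the block is what makes
# grep test each element instead of a constant string.
sub count_matches {
    my ($key, @patterns) = @_;
    return scalar grep { $_ eq $key } @patterns;
}

# For many lookups, a hash built once answers the same yes/no question.
sub build_lookup {
    my %seen = map { $_ => 1 } @_;
    return \%seen;
}

my $lookup = build_lookup(@PatternArray);
for my $key ('PRP/RB', 'NNS/NNS') {
    my $hits  = count_matches($key, @PatternArray);
    my $known = exists $lookup->{$key} ? 'yes' : 'no';
    print "$key: $hits match(es), in lookup: $known\n";
}
```

If the patterns are meant to be regular expressions rather than exact tag pairs, the comparison inside the grep block would need to be a match (for example $key =~ /$_/) instead of eq.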