Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Oh venerable monks,

being merely a self-taught and clueless linguist trying to get perl to work with my text corpus, I have come across the following problem: My corpus consists of annotated text, where each line of text is followed by a line of part-of-speech terms, each word in the text line corresponding to exactly one term in the part-of-speech line.

Eventually, I would like to be able to search for a given item in a text line, followed by an item which is glossed as, say "n" in the pos-line.

So far, I have created two two-dimensional arrays: each text line is an array of words and the totality of the text lines is a superordinate array; thus, each word is assigned the number of its line and its number within the line as in $word[linenumber][wordnumber]. And the same goes for the pos elements.

The trouble is that I don't know how to articulate a condition like "If $word of an arbitrary $linenumber and $wordnumber matches "x" and $pos of THE SAME $linenumber and $wordnumber matches "y" then print $line of $linenumber."

My own humble attempts have gone no further than this:
open(my $in, "<", "Texts.txt") or die "impossible: $!";
#$morph=0;
LINE: while (<$in>){
    if(/\\tx/){
        push (@tx, $_);
        push @txtoken, [split];
    }
    if(/\\ps/){
        push (@ps, $_);
        push @pstoken, [split];
    }
}
foreach($txtoken[$j][$i] = "temeli" and $pstoken[$j][$i]= "n"){
    print "$tx[$j]\n" ;
}
}

Replies are listed 'Best First'.
Re: Matching arbitrary keys across arrays
by markhh (Novice) on Dec 09, 2010 at 20:53 UTC
    Sounds like you could benefit from building some hashes to act as indexes into your two arrays. For the specific example you give, the key of the hash would be the $word and the value would be an array of arrays holding [$linenum, $wordnum] pairs. Or, if you don't want to do that, you can loop over both arrays directly:
    foreach my $i (0 .. $#txtoken) {
        my $txtline = $txtoken[$i];
        foreach my $j (0 .. $#$txtline) {
            if ($txtline->[$j] eq "temeli" and $pstoken[$i][$j] eq "n") {
                do_something();
            }
        }
    }
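    The hash-index idea mentioned above might look something like the following. This is only a rough sketch: the sample data and the names build_index and find_lines are made up for illustration, and it assumes the two token arrays line up index for index, as described in the question.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the tokenised corpus:
# $txtoken[$line][$word] is a word, $pstoken[$line][$word] its POS tag.
my @txtoken = ( [qw(temeli ev)], [qw(ev temeli)] );
my @pstoken = ( [qw(n n)],       [qw(n v)] );

# Build an index: word => list of [line number, word number] pairs.
sub build_index {
    my ($tokens) = @_;
    my %index;
    for my $i (0 .. $#$tokens) {
        for my $j (0 .. $#{ $tokens->[$i] }) {
            push @{ $index{ $tokens->[$i][$j] } }, [ $i, $j ];
        }
    }
    return \%index;
}

# Return the line numbers where $word occurs with POS tag $tag.
sub find_lines {
    my ($index, $pos, $word, $tag) = @_;
    my @hits;
    for my $pair (@{ $index->{$word} || [] }) {
        my ($i, $j) = @$pair;
        push @hits, $i
            if defined $pos->[$i][$j] and $pos->[$i][$j] eq $tag;
    }
    return @hits;
}

my $index = build_index(\@txtoken);
print "line $_\n" for find_lines($index, \@pstoken, "temeli", "n");
```

    The index is built once, so each later lookup only visits the positions where the word actually occurs instead of scanning the whole corpus again.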
Re: Matching arbitrary keys across arrays
by kennethk (Abbot) on Dec 09, 2010 at 20:43 UTC
    An input file is worth a thousand words - I think I understand your description of the problem, but would be a lot more confident if I could read Texts.txt. See How do I post a question effectively?.

    You potentially have some issues because you are relying on synchronized arrays - these can result in headaches when one array is updated but another is not. But proselytizing aside, if you'd like to scan through what I am assuming is your structure, you can do so with a pair of nested for loops, like

    foreach my $i (0 .. $#txtoken) {
        foreach my $j (0 .. $#{ $txtoken[$i] }) {
            if ($txtoken[$i][$j] eq "temeli" and $pstoken[$i][$j] eq "n") {
                print "$tx[$i]\n";
            }
        }
    }

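    Note that I've used the string comparison operator eq to test for string equality. eq, == and = mean different things: eq compares strings, == compares numbers, and = assigns, so the = in your condition overwrites the array elements instead of testing them.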
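    One way to sidestep the synchronized-array concern is to store each word together with its tag, so the two cannot drift apart. The following is a minimal sketch, not a drop-in replacement: the helper names pair_tokens and matching_lines and the two-line sample corpus are invented for illustration, and the lines are assumed to have already had any leading marker removed.

```perl
use strict;
use warnings;

# Pair each word with its tag. Takes one text line and its
# part-of-speech line and returns a reference to a list of
# { word => ..., tag => ... } hashes. Dies if the counts differ.
sub pair_tokens {
    my ($tx_line, $ps_line) = @_;
    my @words = split ' ', $tx_line;
    my @tags  = split ' ', $ps_line;
    die "token count mismatch in: $tx_line\n" unless @words == @tags;
    my @pairs;
    for my $k (0 .. $#words) {
        push @pairs, { word => $words[$k], tag => $tags[$k] };
    }
    return \@pairs;
}

# Return the text lines in which $word carries the tag $tag.
sub matching_lines {
    my ($word, $tag, @lines) = @_;
    my @hits;
    for my $entry (@lines) {
        my ($tx_line, $ps_line) = @$entry;
        my $pairs = pair_tokens($tx_line, $ps_line);
        push @hits, $tx_line
            if grep { $_->{word} eq $word and $_->{tag} eq $tag } @$pairs;
    }
    return @hits;
}

# Hypothetical two-line corpus: [ text line, part-of-speech line ].
my @corpus = (
    [ "temeli ev", "n n" ],
    [ "ev temeli", "n v" ],
);

print "$_\n" for matching_lines("temeli", "n", @corpus);
```

    Checking the token counts at pairing time also surfaces any annotation line whose tags do not match its words one to one.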

      Hehe. You type faster.