rael438 has asked for the wisdom of the Perl Monks concerning the following question:

I've been playing with the Lingua::LinkParser module. It looks like the author of the Perl module has included overloading, as suggested by this article: http://www.foo.be/docs/tpj/issues/vol5_3/tpj0503-0010.html I pulled some code from here: http://improvist.org/Projects/Technology/Papers/LinkParser/tpj0503-0010a.html and I'm having trouble getting it to work. My OO knowledge is iffy; I feel like more of a procedural guy. What am I doing wrong? I can't seem to get any output from $linkage.
#!/usr/bin/perl -w

# For this to work, the overload parameter in ::Linkage and
# ::Sublinkage must point to "new_as_string".

use Lingua::LinkParser;
use strict;

my $parser = new Lingua::LinkParser;
$parser->opts('disjunct_cost' => 2);
$parser->opts('linkage_limit' => 101);

while (1) {
    print "Enter a sentence> ";
    my $input = <STDIN>;
    my $sentence = $parser->create_sentence($input);
    my $linkage = $sentence->linkage(1);

    # computing the union and then using the last sublinkage
    # permits conjunctions.
    $linkage->compute_union;
    my $sublinkage = $linkage->sublinkage($linkage->num_sublinkages);

    my $what_rocks  = 'S[s|p]' .              # match the link label
                      '(?:[\w\*]{1,2})*'.     # match any optional subscripts
                      '\:(\d+)\:' .           # match number of the word
                      '(\w+(?:\.\w)*)';       # match and save the word itself
    my $other_stuff = '[^\)]+';               # match other stuff within parenthesis
    my $rocks       = '\"(rock[s|ed]*).v\"';  # match and store verb
    my $no_objects  = '[^(?:O.{1,2}\:' .      # don't match objects
                      '\d+\:\w+(?:\.\w)*)]*\)';
    my $pattern = "$what_rocks $other_stuff $rocks $no_objects";

    if ( $sublinkage =~ /$pattern/mx ) {
        my $wordobj  = $sublinkage->word($1);
        my $wordtxt  = $2;
        my $verb     = $3;
        my @wordlist = ();

        # we could put all of the below functionality in the regex above.
        foreach my $link ($wordobj->links) {

            # proper nouns, noun modifiers, pre-noun adjectives
            if ($link->linklabel =~ /^G|AN|A/) {
                $wordlist[$link->linkposition] = $link->linkword;
            }

            # possessive pronouns, via a noun determiner
            if ($link->linklabel =~ /^D[s|m]/) {
                my $wword = $sublinkage->word($link->linkposition);
                foreach my $llink ($wword->links) {
                    if ($llink->linklabel =~ /^YS/) {
                        $wordlist[$llink->linkposition] = $llink->linkword;
                        $wordlist[$link->linkposition]  = $link->linkword;
                        my $wwword = $sublinkage->word($llink->linkposition);
                        foreach my $lllink ($wwword->links) {
                            if ($lllink->linklabel =~ /^G|AN/) {
                                $wordlist[$lllink->linkposition] = $lllink->linkword;
                            }
                        }
                    }
                }
            }
        }
        print " -> ", join (" ", @wordlist, $wordtxt);
    }
}

Re: LinkParser new_as_string
by Tanktalus (Canon) on Jun 28, 2006 at 04:16 UTC

    I'm not following the entire piece of code, but I'd like to point out that the big built-up regex probably isn't what you want. You may want to think about building it up with the qr operator; that way you can embed your comments much more cleanly, e.g.:

    my $pattern = qr/
        S[s|p]               # match the link label
        (?:[\w\*]{1,2})*     # match any optional subscripts
        :(\d+):              # match number of the word
        (\w+(?:\.\w)*)       # match and save the word itself
        [^\)]+               # match other stuff within parenthesis
        "(rock[s|ed]*).v"    # match and store verb
        [^(?:O.{1,2}:        # don't match objects
        \d+\:\w+(?:\.\w)*)]*\)
    /xm;
    And then I suggest running the resultant object through something like YAPE::Regex::Explain. I don't think this is what you really want.

    For example, [s|p] matches s, |, or p, not "s or p". Likewise, what you have for "don't match objects" shows a common misconception about how square brackets work: a bracketed class matches one character from a set, so it can't negate a whole sub-pattern. The Y::R::E tool will help you understand what you have there.
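
    As a rough sketch of the difference (not a drop-in fix for the script above): the snippet below compares the bracketed class with real alternation, and shows one common way to say "no object link here", a negative lookahead. The sample strings such as 'Ss:1:dog.n' are made up for illustration and only assume the label:number:word shape that the original pattern looks for; the variable names are hypothetical too.

```perl
use strict;
use warnings;

# [s|p] is a character class: exactly one of 's', '|' or 'p'.
my $class = qr/^S[s|p]$/;

# (?:s|p) is alternation; for single characters [sp] does the same job.
my $alt = qr/^S(?:s|p)$/;

for my $label ('Ss', 'Sp', 'S|') {
    printf "%-3s class:%-3s alternation:%s\n",
        $label,
        ($label =~ $class ? 'yes' : 'no'),
        ($label =~ $alt   ? 'yes' : 'no');
}

# A negated class can only exclude single characters. To reject a whole
# sub-pattern, one option is a negative lookahead checked at every position.
# Assumption: an "object" looks like O, up to two subscript characters,
# then :number: (e.g. "Os:3:"), mirroring the original pattern.
my $no_object = qr/^(?:(?!O\w{0,2}:\d+:).)*$/s;

for my $text ('Ss:1:dog.n', 'Ss:1:dog.n Os:3:ball.n') {   # illustrative strings
    printf "%-24s %s\n", $text,
        ($text =~ $no_object ? 'no object link' : 'has object link');
}
```

    Only 'S|' is treated differently by the two forms: the class accepts it and the alternation rejects it. Whether a lookahead like this fits the real sublinkage string depends on what the overloaded stringification actually prints, so check it against real output first.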

    Hope this helps - good luck!