Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I have the code below, which does not print any value for $value:
    use WordNet::QueryData;
    use WordNet::Similarity::path;
    use strict;

    my $wn = WordNet::QueryData->new;
    my $measure = WordNet::Similarity::path->new ($wn);
    my $Infile1 = shift;
    my $Infile2 = shift;

    open (INPUT, "$Infile1") || die "can't open the input file1";
    chomp (my @words1 = <INPUT>);
    close (INPUT) ;

    open (INPUT, "$Infile2") || die "can't open the input file2";
    chomp (my @words2 = <INPUT>);
    close (INPUT) ;

    for my $i (0 .. $#words1) {
        if ($words1[$i] =~ /\#n$/){
            for my $j ( 0 .. $#words2) {
                if ($words2[$j] =~ /\#n$/){
                    my $value = $measure->getRelatedness("$words1[$i]#1", "$words2[$j]#1");
                    print "similarity of $words1[$i] and $words2[$j] = $value\n";
                }
            }
        }
    }
whereas when I hard-code the words themselves, as below:
$measure->getRelatedness("cat#n#1", "dog#n#1");
I get the expected output. My input is as follows:
    July#n
    Macdonough#n
    patrol#v
    water#n
    Guam#n
    protect#v
    assault#n
    craft#n
    enemy#n
    submarine#n
    continue#v
    role#n
    depart#v
    Hawaii#n
    August#n
    day#n
    Pearl#n
    Harbor#n
    depart#v
    Admiralty#n
    Islands#n
How should I modify my code to get the value when using the $words1[$i] variable? Note that my code reads the input and gives this as output:
    similarity of July#n and Macdonough#n =
    similarity of July#n and Hawaii#n =
    similarity of July#n and August#n =
    similarity of Macdonough#n and Macdonough#n =
    . . .
Thanks in advance.

Replies are listed 'Best First'.
Re: String Variables ....
by toolic (Bishop) on Jan 23, 2009 at 15:29 UTC
    I have never used these modules, and I don't have them installed, but the documentation for WordNet::Similarity indicates that you might be able to get information about potential errors that have occurred:
        my ($error, $errorString) = $measure->getError();
        die $errorString if $error;
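    To show where such a check would sit in the loop, here is a minimal sketch. It assumes only what the reply above quotes, namely that getError() returns an error code and a message. FakeMeasure, report() and the value 0.2 are hypothetical stand-ins so the sketch runs without WordNet installed; they are not part of WordNet::Similarity.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a WordNet::Similarity measure object, so that the
# sketch runs without WordNet. It "knows" a single pair of senses and returns
# an arbitrary, made-up score for it.
package FakeMeasure;

sub new { return bless { error => 0, msg => '' }, shift }

sub getRelatedness {
    my ($self, $w1, $w2) = @_;
    if ($w1 eq 'cat#n#1' && $w2 eq 'dog#n#1') {
        @$self{qw(error msg)} = (0, '');
        return 0.2;    # arbitrary value, for illustration only
    }
    @$self{qw(error msg)} = (1, "no result for $w1 and $w2");
    return undef;
}

sub getError { my $self = shift; return @$self{qw(error msg)} }

package main;

# Ask for a score, then ask the measure whether anything went wrong before
# printing, instead of printing an undefined $value silently.
sub report {
    my ($m, $w1, $w2) = @_;
    my $value = $m->getRelatedness($w1, $w2);
    my ($error, $errorString) = $m->getError();
    return $error
        ? "error for $w1, $w2: $errorString"
        : "similarity of $w1 and $w2 = $value";
}

my $measure = FakeMeasure->new;
print report($measure, 'cat#n#1',  'dog#n#1'),        "\n";
print report($measure, 'July#n#1', 'Macdonough#n#1'), "\n";
```

    With the real modules, $measure would be the WordNet::Similarity::path object from the original post, and the message would come from the library rather than from the stand-in.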
Re: String Variables ....
by jethro (Monsignor) on Jan 23, 2009 at 15:31 UTC

    Strange. I changed your code slightly to allow a test run without the WordNet modules, and it works (see below). It might be a problem with WordNet.

    #!/usr/bin/perl
    use warnings;
    use strict;

    my $Infile1 = shift;
    my $Infile2 = shift;

    open (INPUT, "$Infile1") || die "can't open the input file1";
    chomp (my @words1 = <INPUT>);
    close (INPUT) ;

    open (INPUT, "$Infile2") || die "can't open the input file2";
    chomp (my @words2 = <INPUT>);
    close (INPUT) ;

    for my $i (0 .. $#words1) {
        if ($words1[$i] =~ /\#n$/){
            for my $j ( 0 .. $#words2) {
                if ($words2[$j] =~ /\#n$/){
                    my $value = getRelatedness("$words1[$i]#1", "$words2[$j]#1");
                    print "similarity of $words1[$i] and $words2[$j] = $value\n";
                }
            }
        }
    }

    sub getRelatedness {
        print join(':',@_),"\n";
        return 4;
    }

    OUTPUT:
    July#n#1:July#n#1
    similarity of July#n and July#n = 4
    July#n#1:Macdonough#n#1
    similarity of July#n and Macdonough#n = 4
    Macdonough#n#1:July#n#1
    similarity of Macdonough#n and July#n = 4
    Macdonough#n#1:Macdonough#n#1
    similarity of Macdonough#n and Macdonough#n = 4
      It cannot be because of WordNet, since when I put the literal strings in getRelatedness it gives me the correct answer.

        So... if:

        my $value = $measure->getRelatedness("cat#n#1", "dog#n#1");
        works, but:
        my $value = $measure->getRelatedness("$words1[$i]#1", "$words2[$j]#1");
        does not, then whatever "$words1[$i]#1" and "$words2[$j]#1" yield, it isn't what you think it is. The trick is to check, carefully, looking out for odd things like stray CR or even LF characters... I suggest:
        sub show {
            my ($s) = @_ ;
            $s =~ s/([\x00-\x1F\x7F-\xA0])/sprintf("\\x%02X", ord($1))/eg ;
            return $s ;
        } ;
        and then replace your print statement with:
        print "similarity of '", show("$words1[$i]#1"), "' and '", show("$words2[$j]#1"), "' = $value\n";
        and check whether anything "odd" shows up.
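        As an illustration of what "odd" could look like, here is a small sketch of the show helper applied to hypothetical lines that carry Windows (CRLF) line endings. The sample strings are invented for the demonstration; the thread does not establish whether the original poster's files actually contain such characters.

```perl
use strict;
use warnings;

# Hypothetical input: two lines as they would arrive from a file saved with
# CRLF line endings and read on a system where "\n" is the line terminator.
my @lines = ("cat#n\r\n", "dog#n\r\n");

# The helper suggested above: render control characters as visible \xNN escapes.
sub show {
    my ($s) = @_;
    $s =~ s/([\x00-\x1F\x7F-\xA0])/sprintf("\\x%02X", ord($1))/eg;
    return $s;
}

# chomp removes only the trailing "\n" (the default $/), so the "\r" survives.
chomp(my @chomped = @lines);

# Stripping an optional CR together with the newline handles both conventions.
my @cleaned = map { (my $w = $_) =~ s/\r?\n\z//; $w } @lines;

print "chomped: '", show("$chomped[0]#1"), "'\n";
print "cleaned: '", show("$cleaned[0]#1"), "'\n";
```

        The first line prints the string with a visible \x0D in the middle, and the second prints the plain cat#n#1 form that the hard-coded call uses.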

Re: String Variables ....
by Bloodnok (Vicar) on Jan 23, 2009 at 14:57 UTC
    Only guessing, but have you tried ... my $value = $measure->getRelatedness($words1[$i] . q/#1/, $words2[$j] . q/#1/);?

    A user level that continues to overstate my experience :-))
      Yes, I tried that before, but it does not work. I even tried putting my input file in word#n#1 format, but using $words1[$i] still does not work.