remluvr has asked for the wisdom of the Perl Monks concerning the following question:
Hi everyone. Here I am with a new problem I can't solve. I have two input files. One contains a list of semantic relations structured like the following (let's call it INPUT1):
alligator-n amphibian_reptile attri long-j
alligator-n amphibian_reptile attri old-j
alligator-n amphibian_reptile coord crocodile-n
alligator-n amphibian_reptile coord frog-n
alligator-n amphibian_reptile event walk-v
alligator-n amphibian_reptile hyper animal-n
The other one (INPUT2) looks like the following (obviously this is just a very reduced version):
frog-n about adage-n 8.8016
frog-n appearance-1 broad-j 11.9640
frog-n coord albino-n 6.7667
frog-n be jumper-n 6.0272
frog-n be key-n 3.8779
frog-n of body-n 8.3063
frog-n of bone-n 20.7982
frog-n of book-n 0.4229
crocodile-n be key-n 3.2572
crocodile-n of chorus-n 24.9515
crocodile-n of book-n 2.3460
crocodile-n obj sit-v 3.1857
crocodile-n obj size-v 57.3257
crocodile-n obj skewer-v 6.1105
animal-n coord-1 investigation-n 0.9666
animal-n coord-1 irrigation-n 2.6058
animal-n coord-1 isolation-n 1.4074
animal-n coord-1 isotope-n 2.7420
I need to find the rows of INPUT1 whose relation (the third field) is eq "coord", take the fourth field of each of those rows, and search INPUT2 for rows whose first field is that element. In this case the elements are crocodile-n and frog-n. I then have to build another file that looks like INPUT2 but contains only the rows whose first field is crocodile-n or frog-n, with that first field replaced by not_alligator-n. If a relation and third-field pair has already been found, I must not repeat the row; instead, its score should be added to the score I already have. I understand this explanation is not really clear, so here is an example of the desired output:
not_alligator-n about adage-n 8.8016
not_alligator-n appearance-1 broad-j 11.9640
not_alligator-n coord albino-n 6.7667
not_alligator-n be jumper-n 6.0272
not_alligator-n be key-n 7.1351(3.8779+3.2572)
not_alligator-n of body-n 8.3063
not_alligator-n of chorus-n 24.9515
not_alligator-n of bone-n 20.7982
not_alligator-n of book-n 2.7689(0.4229+2.3460)
not_alligator-n obj sit-v 3.1857
not_alligator-n obj size-v 57.3257
not_alligator-n obj skewer-v 6.1105
I have no idea where to start. It has been less than a month since I went back to using Perl, and I still have a lot to learn. Every suggestion, tip, or pointer on what to do would be really appreciated. I need this because I'm analyzing some statistical measures to be used on semantic relations for my PhD thesis. Thanks to all, Giulia
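One possible way to approach the task described above, offered as a minimal sketch and not a finished solution. It assumes both files hold one row per line with whitespace-separated fields, that the output label is "not_" followed by the INPUT1 concept (as the desired output suggests), and that INPUT2 fits in memory. The names `coord_partners` and `merged_rows` are hypothetical helpers invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map each INPUT1 concept (field 1) to the set of its "coord" partners (field 4).
sub coord_partners {
    my ($fh) = @_;
    my %partners;
    while (my $line = <$fh>) {
        my ($concept, $class, $rel, $partner) = split ' ', $line;
        next unless defined $partner && $rel eq 'coord';
        $partners{$concept}{$partner} = 1;
    }
    return \%partners;
}

# For one concept, sum INPUT2 scores over the rows whose first field is a
# partner, keyed by "relation third-field". Returns output lines in the
# order each key was first seen.
sub merged_rows {
    my ($rows, $label, $partner_set) = @_;
    my (%sum, @order);
    for my $row (@$rows) {
        my ($first, $rel, $third, $score) = @$row;
        next unless $partner_set->{$first};
        my $key = "$rel $third";
        push @order, $key unless exists $sum{$key};
        $sum{$key} += $score;
    }
    return map { sprintf '%s %s %.4f', $label, $_, $sum{$_} } @order;
}

# Runs only when two file names are given: perl script.pl INPUT1 INPUT2
if (@ARGV == 2) {
    my ($file1, $file2) = @ARGV;

    open my $in1, '<', $file1 or die "Cannot open $file1: $!";
    my $partners = coord_partners($in1);
    close $in1;

    # Keep only well-formed four-field rows (assumption: malformed rows are skipped).
    open my $in2, '<', $file2 or die "Cannot open $file2: $!";
    my @rows = grep { @$_ == 4 } map { [ split ' ' ] } <$in2>;
    close $in2;

    for my $concept (sort keys %$partners) {
        print "$_\n" for merged_rows(\@rows, "not_$concept", $partners->{$concept});
    }
}
```

The two hashes do the real work: the first records which words to look for, the second accumulates scores so that a repeated relation and third-field pair is summed instead of printed twice. Rows come out in first-seen order, which may differ slightly from the ordering in the example output, and the summed scores are printed without the "(a+b)" annotation.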
Replies are listed 'Best First'.

Re: Select only desired features from a text
by JavaFan (Canon) on Mar 19, 2012 at 15:44 UTC

Re: Select only desired features from a text
by moritz (Cardinal) on Mar 19, 2012 at 13:10 UTC
by remluvr (Sexton) on Mar 19, 2012 at 15:15 UTC
by moritz (Cardinal) on Mar 19, 2012 at 18:03 UTC
by bitingduck (Deacon) on Mar 19, 2012 at 15:29 UTC

Re: Select only desired features from a text
by aaron_baugher (Curate) on Mar 19, 2012 at 17:01 UTC
by Anonymous Monk on Mar 19, 2012 at 22:19 UTC

Re: Select only desired features from a text
by RichardK (Parson) on Mar 19, 2012 at 12:24 UTC
by remluvr (Sexton) on Mar 19, 2012 at 14:12 UTC