Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
Hi, and thanks for taking the time. I am new to Perl and am trying to write a search engine. Now that I have the output printing, I would like to sort it a little better. The output currently looks like this:
NSUBJ
_nsubj_ of move is: ***** it *****
MATCH #1 Sent. 60 The type of body cavity an animal has strongly influences how --it-- can **move** .
_nsubj_ of move is: ***** animals *****
MATCH #1 Sent. 88
These --animals-- **move** slowly or not at all .
MATCH #2 Sent. 89
Bilateral symmetry is a common characteristic of --animals-- that **move** freely through their environments .
I want the heading "_nsubj_ of move is: ***** animals *****" to be printed first because it has more matches. How would you go about doing this?
Here is the print part of the code:
The way I thought to go about it is shown in the code comment starting with #5#. Thanks again! This is my first time using this site, so if this format isn't good, please let me know!

    ## EDIT: Now the EVEN number is gramfunc, ODD number is sentence
    my @allgramfunc;         ## list of unique grammar func
    my @allmatches;          ##use for headings and all matches (sentences) under that heading as one scalar
    my @sortedallgramfunc;   ## What order for capital heading? => alphabetical
    my @sortedheadmatches;   ## What order for dependency heading? => frequency Depend on firstword?
    my @sortedfirstmatches;  ## To keep order of sentences with headmatches
    my @sortedsecondmatches; ## To keep order of sentences with headmatches
    my %seenmatches = ();
    my %seens = ();
    #my @pluralfirstmatches = @firstmatches;
    #my @pluralsecondmatches = @secondmatches;

    #CREATE an array of all the grammar functions:
    for (my $j=0; $j <= @grammatches; $j++) { ## Could be normal $j++ if use another variable instead of @matches for both
        if ( defined( $grammatches[$j] )) { ## Just to avoid error message
            push (@allgramfunc, "$grammatches[$j]") unless ($seengramfunc{ $grammatches[$j] }++);
        }
    }

    #SORT overheadings by alphabetical
    @sortedallgramfunc = sort { lc($a) cmp lc($b) } @allgramfunc;

    #PRINT all the sentences which are related to searchkey by the same gramfunc
    foreach my $sortedallgramfunc (@sortedallgramfunc) {
        print ("\n",uc $sortedallgramfunc,"\n\n");# Which gramfunc is being shown?
        #@sortedheadmatches = sort @headmatches;
        for (my $l=0; $l <= @headmatches; $l++) {
            if (defined( $headmatches[$l] ) and $headmatches[$l] =~ /$sortedallgramfunc/) {
                #2#$pluralfirstmatches[$l] =~ s/$firstmatches[$l]$pluralsuffix/$firstmatches[$l]/ig;
                unless ($seenmatches{ $headmatches[$l] }++) {
                    print $headmatches[$l];
                    my $count = 1;
                    for (my $m=0; $m <= @sentmatches; $m++) {
                        if ( defined( $sentmatches[$m]) and $sentmatches[$m] =~ /\s\S\S$firstmatches[$l]\S\S\s/ and $sentmatches[$m] =~ /\s\S\S$secondmatches[$l]\S\S\s/) { ##We know $l and $m are matching
                            #5# Try sorting by creating array that includes header and all sentences as a scalar, then by size, maybe join until hit _dobj sort length $a cmp length $b maybe "length first then alphabetical" => or $a cmp $b
                            print "MATCH #$count $sentmatches[$m]"; # unless $seens{ $sentmatches[$m] }++);
                            $count++;
                        }
                    }
                }
            }
        }
    }
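The idea in the #5# comment amounts to collecting each heading's sentences before printing anything, then ordering the headings by how many sentences each one collected. The following is only a minimal sketch of that approach, not the poster's code: the `%matches_for` hash, the `sort_headings_by_count` helper, and the hard-coded sample data are hypothetical stand-ins for whatever the nested loops over `@headmatches` and `@sentmatches` would gather.

```perl
use strict;
use warnings;

# Hypothetical sample data: heading => list of matching sentences.
# In the real script these lists would be filled inside the loops
# (push instead of print) rather than written out by hand.
my %matches_for = (
    '_nsubj_ of move is: ***** it *****' => [
        'Sent. 60 The type of body cavity an animal has strongly influences how --it-- can **move** .',
    ],
    '_nsubj_ of move is: ***** animals *****' => [
        'Sent. 88 These --animals-- **move** slowly or not at all .',
        'Sent. 89 Bilateral symmetry is a common characteristic of --animals-- that **move** freely through their environments .',
    ],
);

# Return the headings ordered by number of matches (most first).
# Ties are broken alphabetically, ignoring case.
sub sort_headings_by_count {
    my ($matches) = @_;
    return sort {
        scalar @{ $matches->{$b} } <=> scalar @{ $matches->{$a} }
            or lc($a) cmp lc($b)
    } keys %$matches;
}

# Print each heading followed by its numbered matches.
for my $heading ( sort_headings_by_count( \%matches_for ) ) {
    print "$heading\n";
    my $count = 1;
    for my $sentence ( @{ $matches_for{$heading} } ) {
        print "MATCH #$count $sentence\n";
        $count++;
    }
}
```

With this sample data the "animals" heading, which has two matches, is printed before the "it" heading, which has one. Keeping the sentences in a hash of arrays also removes the need to join heading and sentences into one scalar and sort by string length, which would order by text size rather than by number of matches.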
Replies are listed 'Best First'.

Re: Search Engine Output needs sorting
by jwkrahn (Abbot) on May 28, 2011 at 06:30 UTC
by Anonymous Monk on May 28, 2011 at 13:04 UTC
by johngg (Canon) on May 28, 2011 at 13:17 UTC
by Anonymous Monk on May 28, 2011 at 14:04 UTC
by Anonymous Monk on May 28, 2011 at 14:12 UTC
by johngg (Canon) on May 28, 2011 at 14:44 UTC
by Anonymous Monk on May 28, 2011 at 18:09 UTC
by MidLifeXis (Monsignor) on May 31, 2011 at 13:28 UTC

Re: Search Engine Output needs sorting
by Anonymous Monk on May 28, 2011 at 13:31 UTC
by Anonymous Monk on May 29, 2011 at 08:09 UTC