G'day shamat,
Firstly, my comments (some of which have already been mentioned in earlier responses):
- Do you really want to compare all fragments with each other? I can envisage a situation where you're attempting to decide whether "... est ..." matches "... in ...". Perhaps you'd want to filter badly damaged fragments from any sort of matching whatsoever.
- I think you'd be better off comparing the fragments with a single reference string. You wrote "... some of them being partly damaged.", so presumably some of them are complete.
- You wrote "... only the last string should not match ..." (that would be "quattuor"). If that's the case, "Gallia" should probably be "Gallia ..."
- The output you show does not match the code that creates it. From the code you posted, I'd be expecting output like:
    N-M: [string1] and [string2] DO NOT MATCH!
Here's a solution that takes all of the above into account:
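#!/usr/bin/env perl
use strict;
use warnings;

# The first DATA line is the complete reference; the rest are fragments.
my @exemplars = <DATA>;
my $reference = shift @exemplars;
print "Reference string: $reference";

for (@exemplars) {
    my $exemplar = $_;    # keep the original text for reporting
    s/[.]{3}/.+?/g;       # turn each "..." gap into a non-greedy wildcard
    # Lines are not chomped, so the trailing newline anchors the end of the pattern.
    if ($reference !~ /$_/) {
        print "NO MATCH: $exemplar";
    }
}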
__DATA__
Gallia est omnis divisa in partes tres
Gallia est omnis divisa in ...
Gallia est omnis ...
Gallia
... omnis divisa in ...
Gallia est ... tres
Gallia ... partes tres
Gallia est ... partes tres
Gallia ... divisa ... tres
... tres
quattuor
Gallia ...
Output:
$ pm_latin_fragments.pl
Reference string: Gallia est omnis divisa in partes tres
NO MATCH: Gallia
NO MATCH: quattuor
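If your real fragments might contain regex metacharacters, or you want to act on my first comment about badly damaged fragments, here's a rough sketch of a stricter variant. It's only an illustration, not a drop-in replacement: the helper names (fragment_to_regex, literal_words, classify) and the $MIN_WORDS threshold are my own inventions, and I'm assuming that every missing piece of text, including at the start of a fragment, is marked with "...". Literal text is escaped with quotemeta and the pattern is anchored at both ends instead of relying on the trailing newline.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical threshold: fragments with fewer surviving words are skipped.
my $MIN_WORDS = 2;

# Build an anchored regex from a fragment: literal pieces are escaped with
# quotemeta, and each "..." gap becomes a non-greedy wildcard.
sub fragment_to_regex {
    my ($fragment) = @_;
    # Limit -1 keeps a trailing empty field, so "Gallia ..." ends in a wildcard.
    my $pattern = join '.+?', map { quotemeta($_) } split /[.]{3}/, $fragment, -1;
    return qr/\A$pattern\z/;
}

# Count the words that survive in a fragment (everything except the gaps).
sub literal_words {
    my ($fragment) = @_;
    return scalar grep { $_ ne '...' } split ' ', $fragment;
}

# Classify a fragment against the reference: 'skipped', 'match' or 'no match'.
sub classify {
    my ($reference, $fragment) = @_;
    return 'skipped' if literal_words($fragment) < $MIN_WORDS;
    return $reference =~ fragment_to_regex($fragment) ? 'match' : 'no match';
}

my $reference = 'Gallia est omnis divisa in partes tres';
for my $fragment ('Gallia est ... tres', 'Gallia', '... tres',
                  'est omnis', 'quattuor ... tres') {
    printf "%-10s %s\n", classify($reference, $fragment) . ':', $fragment;
}
```

With the threshold set to 2, "Gallia" and "... tres" are skipped rather than reported, "Gallia est ... tres" matches, and the last two fragments are reported as not matching. Whether skipping is better than reporting is a judgement call that depends on your data, so adjust or remove the threshold as you see fit.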