Note that hippo's addition of the g modifier brings up the issue that I raised earlier about having more than one character difference between two words. Here's what happens when I add a couple more examples to hippo's version:
    #!/usr/bin/perl
    use strict;
    use warnings;

    my @words = <DATA>;
    chomp @words;
    while ( @words >= 2 ) {
        my $model = my $regex = shift @words;
        if ( $regex =~ s/(.*?)[ab](.*?)/$1\[ab\]$2/g ) {
            my @hits = grep /^$regex$/, @words;
            if ( @hits ) {
                print join( " ", $model, "matches", @hits, "using", $regex, "\n" );
            }
        }
    }
    __DATA__
    lama
    lamb
    able
    bale
Output:
    lama matches lamb using l[ab]m[ab]
    able matches bale using [ab][ab]le
The output shows how the g modifier affects the creation of the regex to be used for searching the array; without it, the first regex would be l[ab]ma (which would not match "lamb"), and the next would be l[ab]mb (which would not match "lama" if it were to show up later in the list).
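To make the difference concrete, here's a minimal sketch that builds the pattern both ways for the same word. The helper name widen and its $global flag are my own inventions for illustration, and I've used the simpler s/[ab]/[ab]/ form rather than hippo's exact substitution; it should produce the same patterns for these words.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn "a" or "b" into the class [ab],
# either for the first occurrence only or (with $global) for all of them.
sub widen {
    my ( $word, $global ) = @_;
    if ($global) {
        $word =~ s/[ab]/[ab]/g;    # every a/b becomes [ab]
    }
    else {
        $word =~ s/[ab]/[ab]/;     # only the first a/b becomes [ab]
    }
    return $word;
}

for my $word (qw(lama lamb)) {
    printf "%s: without g => %s, with g => %s\n",
        $word, widen( $word, 0 ), widen( $word, 1 );
}
```

Without g the two words get different patterns (l[ab]ma and l[ab]mb), and neither pattern matches the other word; with g both become l[ab]m[ab].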

But with the g modifier, the search patterns for "able" and "bale" come out the same, and the two words match each other even though they differ in two positions, because the regex [ab][ab]le lets both of the first two characters vary.

To solve that, you could compare the current "model" word against each of the matches from the array, using the tr/// operator as described in previous replies, to see how many characters differ in each pair of words, and keep only those matches that differ by a single character.
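Here's a rough sketch of that filtering step. I'm assuming the tr/// approach from the earlier replies was the usual string-XOR idiom (those replies aren't reproduced here), and the name char_diff is just something I made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the positions at which two equal-length strings differ.
# XOR-ing the strings yields "\0" wherever the characters agree,
# so tr/\0//c counts the non-null (differing) positions.
# Returns -1 for strings of unequal length.
sub char_diff {
    my ( $x, $y ) = @_;
    return -1 if length($x) != length($y);
    my $xor = $x ^ $y;
    return ( $xor =~ tr/\0//c );
}

my @words = qw(lama lamb able bale);
while ( @words >= 2 ) {
    my $model = my $regex = shift @words;
    next unless $regex =~ s/[ab]/[ab]/g;

    # Keep only regex hits that differ from the model in exactly one position.
    my @hits = grep { /^$regex$/ && char_diff( $model, $_ ) == 1 } @words;
    print "$model matches @hits using $regex\n" if @hits;
}
```

With the same four words, this should report only the lama/lamb pair; "able" and "bale" still match the regex but get dropped because they differ in two positions.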

(UPDATE: It's also worth noting that using g this way is effectively equivalent to using "split", "map" and "join" to build the multi-match regex, as I showed in this previous reply - which just goes to show that "there's more than one way to do it.")
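For anyone who hasn't seen that reply, the split/map/join version would look something like the sketch below. This is a reconstruction of the general idea, not a copy of the earlier code, and build_regex is a name I picked for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split the word into characters, replace each
# "a" or "b" with the class [ab], and join the pieces back together.
sub build_regex {
    my ($word) = @_;
    return join '', map { /[ab]/ ? '[ab]' : $_ } split //, $word;
}

for my $word (qw(lama able)) {
    ( my $with_g = $word ) =~ s/[ab]/[ab]/g;    # the s///g way, for comparison
    print "$word: split/map/join => ", build_regex($word),
        ", s///g => $with_g\n";
}
```

Both approaches should give l[ab]m[ab] for "lama" and [ab][ab]le for "able".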


In reply to Re^5: Comparing Lines within a Word List by graff
in thread Comparing Lines within a Word List by dominick_t
