Without looking into it any deeper, a simple approach is to stop recalculating the "cleaned" version of a string for each comparison. To do that, move the cleaning out of the function compare and up into the loop that calls compare:

    for my $string1 (@strings_left) {
        # clean the left-hand string once per outer iteration, not once per comparison
        my $string1_clean = clean( $string1 );
        for my $string2 (@strings_right) {
            if( compare( $string1_clean, clean( $string2 ) ) ) {
                ...
            }
        }
    }

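If you have some memory to spare, you can use Memoize to cache the results of the cleanup, so that each distinct string is cleaned only once. In the loop above, clean( $string2 ) is otherwise still recomputed for every $string1, so this would speed up the cleaning a bit more.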
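As a rough sketch of the Memoize idea: the clean() below is only a hypothetical stand-in (lower-casing and stripping non-alphanumerics), the sample strings are made up, and a plain eq stands in for your compare(). The $calls counter is there only to show how often the real cleanup runs.

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;    # counts how often the real cleanup actually runs

# Hypothetical stand-in for the real clean(): lower-case the string
# and strip everything that is not a letter or digit.
sub clean {
    my ($s) = @_;
    $calls++;
    ( my $c = lc $s ) =~ s/[^a-z0-9]+//g;
    return $c;
}

# Replace clean() with a caching wrapper; repeated calls with the
# same argument return the cached result instead of re-cleaning.
memoize('clean');

# Made-up sample data
my @strings_left  = ( 'Foo-Bar', 'baz' );
my @strings_right = ( 'foo bar', 'BAZ!', 'qux' );

my @matches;
for my $string1 (@strings_left) {
    my $string1_clean = clean($string1);
    for my $string2 (@strings_right) {
        # eq is an assumption here; substitute your own compare()
        push @matches, [ $string1, $string2 ]
            if $string1_clean eq clean($string2);
    }
}

printf "%d matches, clean() ran %d times\n", scalar @matches, $calls;
```

Without memoize('clean'), the cleanup would run once per left string plus once per pair of strings; with it, each distinct string is cleaned only once, at the cost of keeping every cleaned string in memory.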

But you may save even more comparison time by first sorting all your (cleaned) strings into buckets based on the first character, or first few characters, of each string. A string starting with "A" can never be equal to a string starting with "B", so only strings that land in the same bucket need to be compared. That could cut down considerably on the total number of comparisons made.
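The bucket idea could look roughly like this. Again, clean() is a hypothetical stand-in, the sample strings are made up, and eq stands in for your compare(). The buckets are keyed on the first character of the cleaned string, on the assumption that cleaning may change how a string starts.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real clean()
sub clean {
    my ($s) = @_;
    ( my $c = lc $s ) =~ s/[^a-z0-9]+//g;
    return $c;
}

# Made-up sample data
my @strings_left  = ( 'Foo-Bar', 'baz' );
my @strings_right = ( 'foo bar', 'BAZ!', 'qux' );

# Sort the right-hand strings into buckets, keyed by the first
# character of the cleaned string. Store the cleaned form alongside
# the original so it is not recomputed later.
my %bucket;
for my $string2 (@strings_right) {
    my $string2_clean = clean($string2);
    push @{ $bucket{ substr( $string2_clean, 0, 1 ) } },
        [ $string2, $string2_clean ];
}

my @matches;
my $comparisons = 0;
for my $string1 (@strings_left) {
    my $string1_clean = clean($string1);

    # Only strings in the same bucket can possibly be equal
    my $candidates = $bucket{ substr( $string1_clean, 0, 1 ) } or next;

    for my $pair (@$candidates) {
        my ( $string2, $string2_clean ) = @$pair;
        $comparisons++;
        # eq is an assumption here; substitute your own compare()
        push @matches, [ $string1, $string2 ]
            if $string1_clean eq $string2_clean;
    }
}

printf "%d matches in %d comparisons\n", scalar @matches, $comparisons;
```

If your compare() really boils down to exact equality of the cleaned strings, you could take this one step further and key the hash on the whole cleaned string instead of its first character, which leaves no inner loop at all. If compare() does something fuzzier, the first-character buckets are only safe when two matching strings always start with the same character.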


In reply to Re: Ignoring patterns when comparing strings by Corion
in thread Ignoring patterns when comparing strings by TravelAddict
