Let's see here. Your small string has 140 characters and your reference string has 3114. If you were comparing only the 10-character substrings, that's 131 x 3105 = 406,755 comparisons. Allowing substrings longer than 10 characters makes it a lot heavier: e.g. for length 11 it's 130 x 3104 = 403,520, to be added to your original 406,755. Build that all the way up to length 140 and it comes out as 26,471,170 substring comparisons. Then any one of the characters in a substring could be the non-match, so comparing a substring of length n takes up to n character comparisons. That gives a total, by my calculations, of 1,403,490,770, and this is just the comparisons, never mind all the slicing and dicing to achieve them.
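
For anyone who wants to check the arithmetic above, here is a minimal sketch that reproduces it. The sub name brute_force_cost is made up for illustration, and it assumes the naive scheme described: every substring of length 10 to 140 from the small string is compared against every window of the same length in the reference.

```perl
use strict;
use warnings;

# Count the work for the brute-force approach: every substring of length
# $min .. $small_len from the small string is compared against every
# window of the same length in the reference string.
# Returns (number of substring comparisons, worst-case character comparisons).
sub brute_force_cost {
    my ($small_len, $ref_len, $min) = @_;
    my ($pairs, $chars) = (0, 0);
    for my $n ($min .. $small_len) {
        my $windows = ($small_len - $n + 1) * ($ref_len - $n + 1);
        $pairs += $windows;         # one substring-vs-window comparison each
        $chars += $windows * $n;    # worst case: all $n characters inspected
    }
    return ($pairs, $chars);
}

my ($pairs, $chars) = brute_force_cost(140, 3114, 10);
print "$pairs substring comparisons, $chars character comparisons\n";
# prints "26471170 substring comparisons, 1403490770 character comparisons"
```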

Does your task have significant time constraints? If so, you'll have to start looking into some algorithm references to be a lot smarter about this, I fear.
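
As one possible illustration of "being smarter", and not necessarily what the algorithm references would recommend for your data, here is a rough sketch of a seed-and-extend approach. The sub name longest_common_stretch and the sample strings are invented for the example. It assumes you only need the longest shared stretch of at least $min characters: it indexes every $min-length window of the reference in a hash, then looks up each $min-length window of the small string and extends any hit to the right.

```perl
use strict;
use warnings;

# Hypothetical helper: return the longest stretch of at least $min
# characters that occurs in both $small and $ref ('' if there is none).
sub longest_common_stretch {
    my ($small, $ref, $min) = @_;

    # Index every $min-length window of the reference by its text.
    my %seed_pos;
    for my $i (0 .. length($ref) - $min) {
        push @{ $seed_pos{ substr($ref, $i, $min) } }, $i;
    }

    my $best = '';
    for my $j (0 .. length($small) - $min) {
        my $hits = $seed_pos{ substr($small, $j, $min) } or next;
        for my $i (@$hits) {
            # Extend the seed to the right while the characters agree.
            my $len = $min;
            $len++ while $j + $len < length($small)
                      && $i + $len < length($ref)
                      && substr($small, $j + $len, 1) eq substr($ref, $i + $len, 1);
            $best = substr($small, $j, $len) if $len > length($best);
        }
    }
    return $best;
}

# Made-up sample data: the shared stretch is 12 characters long.
print longest_common_stretch('xxABCDEFGHIJKLyy', 'qqqABCDEFGHIJKLzz', 10), "\n";
# prints "ABCDEFGHIJKL"
```

Extending only to the right is enough here because every starting position in the small string is tried in turn, so the leftmost seed of any shared stretch is always visited.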

Would I, for instance, start like this? if ($reference_str =~ /$small_str{10,}/)

No. The {10,} quantifier applies only to the last character of the interpolated $small_str, so that pattern matches all of $small_str except its final character, followed by 10 or more instances of that final character. (Thanks to haukex for pointing out my initial error.)
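
A quick way to see this for yourself is the small sketch below. The sub name matches_quantified and the string 'ACGT' are made-up stand-ins, not your real data. With $small_str set to 'ACGT', the pattern behaves like /ACGT{10,}/.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the pattern from the question to $text.
# The {10,} quantifier binds only to the last character of $small_str.
sub matches_quantified {
    my ($small_str, $text) = @_;
    return $text =~ /$small_str{10,}/ ? 1 : 0;
}

my $small_str = 'ACGT';    # made-up stand-in for the short string
my %candidate = (
    'one copy'     => 'ACGT',
    'ten copies'   => 'ACGT' x 10,
    'ACG + ten Ts' => 'ACG' . ('T' x 10),
);

for my $label ('one copy', 'ten copies', 'ACG + ten Ts') {
    my $verdict = matches_quantified($small_str, $candidate{$label})
                ? 'match' : 'no match';
    print "$label: $verdict\n";
}
# prints:
#   one copy: no match
#   ten copies: no match
#   ACG + ten Ts: match
```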


In reply to Re: How do you match a stretch of at least N characters by hippo
in thread How do you match a stretch of at least N characters by Anonymous Monk
