Why do people always suppose you'll commonly get positive results near the start of the loop count?

My assumption was that disk access is more expensive than anything else, so the best algorithm would be the one that minimizes it. I also assumed that if the files differ, the first difference will nearly always come before the end of the files, so stopping there will almost always avoid some expensive reading.
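To make that idea concrete, here is a minimal sketch of a line-by-line comparison that stops at the first difference. It is not the loop from the earlier post; the name `files_same_by_line` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not the loop from the earlier post):
# returns 1 if both files contain the same lines, 0 otherwise.
sub files_same_by_line {
    my ($path_a, $path_b) = @_;

    open my $fh_a, '<', $path_a or die "Cannot open $path_a: $!";
    open my $fh_b, '<', $path_b or die "Cannot open $path_b: $!";

    while (1) {
        my $line_a = <$fh_a>;
        my $line_b = <$fh_b>;

        # Both files ended together: no difference was found.
        return 1 if !defined $line_a && !defined $line_b;

        # One file ended first, or the lines differ: stop reading here.
        return 0 if !defined $line_a || !defined $line_b;
        return 0 if $line_a ne $line_b;
    }
}
```

When the files are identical, this reads both of them completely, so the early exit saves nothing in that case.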

Well, I went off and did some testing, and it turns out that a quick Digest::MD5 of each file was about three times faster than reading line-by-line with the loop I posted earlier. That's for identical files.
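For reference, a digest comparison along these lines might look like the sketch below. The helper names `md5_of_file` and `files_same_by_md5` are made up here, and this is not necessarily the exact code that was timed.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: hex MD5 digest of a file's raw bytes.
sub md5_of_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;    # hash bytes, not decoded text
    return Digest::MD5->new->addfile($fh)->hexdigest;
}

# Hypothetical helper: returns 1 if the two digests match, 0 otherwise.
sub files_same_by_md5 {
    my ($path_a, $path_b) = @_;
    return md5_of_file($path_a) eq md5_of_file($path_b) ? 1 : 0;
}
```

Note that this always reads both files in full, and that matching digests make equality extremely likely rather than strictly proven.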

(I tested with a file of 10_000_000 lines, each with random alphanumeric data 10–1_000 characters long.)
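A test file of that shape could be generated with something like the following sketch. The name `write_random_file` and its parameters are assumptions for illustration, not the script actually used.

```perl
use strict;
use warnings;

# Hypothetical generator: writes $line_count lines of random
# alphanumeric data, each 10 to 1_000 characters long.
sub write_random_file {
    my ($path, $line_count) = @_;
    my @chars = ('a' .. 'z', 'A' .. 'Z', '0' .. '9');

    open my $fh, '>', $path or die "Cannot write $path: $!";
    for (1 .. $line_count) {
        my $length = 10 + int rand 991;    # 10 .. 1_000 inclusive
        print {$fh} join('', map { $chars[rand @chars] } 1 .. $length), "\n";
    }
    close $fh or die "Cannot close $path: $!";
    return;
}
```

Calling `write_random_file('big.txt', 10_000_000)` would produce a file of the size described, with `big.txt` being a placeholder name.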

Of course, if there really is an early give-away, spotting it and aborting all that reading does give a big advantage. In this case, though, I wouldn't expect that to happen often enough to be worth it.


In reply to Re^4: quick and safe way to deal with this? by kyle
in thread quick and safe way to deal with this? by Anonymous Monk
