Note that when I just loop through the lines without doing the RE, the disk and in-memory files take about the same amount of time, so it seems it is the RE that is causing the problem.

That suggests to me that the data seen in the first and second loops are not the same. If you put a counter in the loops, do they both see the same number of lines?
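A minimal sketch of that check, assuming the in-memory copy is built by appending line by line as in the original post. The temporary file, its sample content, and the helper name line_stats are all hypothetical stand-ins for the real data and code:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Count the lines a filehandle yields and record the longest line seen.
# (line_stats is a hypothetical helper, not from the original post.)
sub line_stats {
    my ($fh) = @_;
    my ($count, $longest) = (0, 0);
    while (my $line = <$fh>) {
        $count++;
        $longest = length $line if length $line > $longest;
    }
    return ($count, $longest);
}

# Stand-in for the real data file (hypothetical sample content).
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "line $_\n" for 1 .. 5;
close $tmp or die "close: $!";

open my $disk, '<', $path or die "open $path: $!";

# Build the in-memory copy piecewise.
my $s = '';
while (my $line = <$disk>) {
    $s .= $line;
}
seek $disk, 0, 0 or die "seek: $!";

open my $mem, '<', \$s or die "open in-memory file: $!";

my ($disk_n, $disk_max) = line_stats($disk);
my ($mem_n,  $mem_max)  = line_stats($mem);

printf "disk:      %d lines, longest %d chars\n", $disk_n, $disk_max;
printf "in-memory: %d lines, longest %d chars\n", $mem_n,  $mem_max;
```

If the two counts differ, or the in-memory handle reports one very long line, the two loops are not matching against the same data.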

In particular, if you are running this on Windows, I believe Perl may automatically convert CRLF line endings on read, in which case building up $s this way may produce something that is read back from the in-memory file as a single line.

You might try the following approach to set up the in-memory file, which would be more efficient and is more likely to give the same content:

seek $fh, 0, 0; my $s = do { local $/; <$fh>; };

This reads the full content of the file in one go, rather than reading it line by line and then appending. (See also "what's faster than .=" for more detail on why building up a string piecewise can be really inefficient.)
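For completeness, here is a self-contained sketch of the slurp approach feeding an in-memory filehandle. The helper name slurp_fh, the temporary file and its three sample lines are assumptions made for illustration; only the seek and local $/ idiom comes from the suggestion above:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Rewind the handle and return its entire content as one string.
# (slurp_fh is a hypothetical helper name.)
sub slurp_fh {
    my ($fh) = @_;
    seek $fh, 0, 0 or die "seek: $!";
    local $/;               # undefined record separator: read to end of file
    return scalar <$fh>;
}

# Stand-in for the real data file (hypothetical sample content).
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "alpha\n", "beta\n", "gamma\n";
close $tmp or die "close: $!";

open my $fh, '<', $path or die "open $path: $!";
my $s = slurp_fh($fh);

# Open the scalar as an in-memory file and read it line by line.
open my $mem, '<', \$s or die "open in-memory file: $!";
my @lines = <$mem>;

printf "slurped %d chars, in-memory file has %d lines\n",
    length $s, scalar @lines;
```

Because local $/ is scoped to the helper, the record separator is back to its default by the time the in-memory handle is read, so that read still splits on newlines.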


In reply to Re: RE on lines read from in-memory scalar is very slow by hv
in thread RE on lines read from in-memory scalar is very slow by Danny
