I ran the code with use re 'debug' enabled and there is no difference in the regex processing.

I then instrumented the code with some Devel::Peek Dump calls. The lines read from the in-memory file have rapidly increasing amounts of memory allocated (the LEN field), plateauing at close to the size of the input string. The pattern is the same under both Strawberry Perl and MSYS2 Perl, which makes me wonder whether the delay is related to memory management. Others are better qualified than me to comment on that front, though.
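For anyone who wants the CUR and LEN numbers without reading full Dump output, here is a minimal sketch using the core B module. The helper name cur_len and the small generated input are my own assumptions for illustration and are not part of the test script; the values printed will depend on the perl build and the input size, so I have not listed any expected output.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use B ();

# Hypothetical helper: return (CUR, LEN) for the string scalar behind $ref.
# CUR is the number of bytes in use, LEN the number of bytes allocated.
sub cur_len {
    my ($ref) = @_;
    my $sv = B::svref_2object($ref);
    return ($sv->CUR, $sv->LEN);
}

# Small made-up input; the file in the post is about 35 MB.
my $data = join '', map { " Query line $_\n" } 1 .. 1000;

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $n = 0;
while (<$fh>) {
    my ($cur, $len) = cur_len(\$_);
    printf "line %d: CUR = %d, LEN = %d\n", ++$n, $cur, $len;
    last if $n >= 5;
}
close $fh;
```

Swapping the in-memory open for a normal file open gives the on-disk numbers for comparison.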

Edit: Just for completeness I also tested using Perl 5.36.0 on Ubuntu via WSL and the memory usage is the same.

Updated code is below, inside the readmore tags.

tempfile.txt is size 35 Mb
SV = PV(0x177f89db120) at 0x177f88bcca0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x177fa280250 " Querysome random text 0.271320203145251\n"\0
  CUR = 41
  LEN = 408
  COW_REFCNT = 2
SV = PV(0x177f89db120) at 0x177f88bcca0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x177fa34f4d0 " Querysome random text 0.775348369818055some random text 0.775348369818055\n"\0
  CUR = 75
  LEN = 201
  COW_REFCNT = 2
SV = PV(0x177f89db120) at 0x177f88bcca0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x177fa3a28b0 " Querysome random text 0.785001144808529\n"\0
  CUR = 41
  LEN = 43
  COW_REFCNT = 2
SV = PV(0x177f89db120) at 0x177f88bcca0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x177fa297c60 " Querysome random text 0.894431999356865\n"\0
  CUR = 41
  LEN = 309
  COW_REFCNT = 2
SV = PV(0x177f89db120) at 0x177f88bcca0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x177f88a13b0 " Querysome random text 0.412049736815259some random text 0.412049736815259some random text 0.412049736815259some random text 0.412049736815259some random text 0.412049736815259some random text 0.412049736815259some random text 0.412049736815259some random text 0.412049736815259some random text 0.412049736815259\n"\0
  CUR = 313
  LEN = 392
  COW_REFCNT = 2
SV = PV(0x177f89db120) at 0x177f88bcca0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x177fa3750d0 " Querysome random text 0.809515115277865\n"\0
  CUR = 41
  LEN = 275
  COW_REFCNT = 2
0.005142 read lines from disk and do RE (6 matches).

SV = PV(0x177f89db120) at 0x177f88bcca0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x177f88b3ed0 " Querysome random text 0.271320203145251\n"\0
  CUR = 41
  LEN = 656
  COW_REFCNT = 2
SV = PV(0x177f89db120) at 0x177f88bcca0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x1778000b070 " Querysome random text 0.775348369818055some random text 0.775348369818055\n"\0
  CUR = 75
  LEN = 37784908
  COW_REFCNT = 2
SV = PV(0x177f89db120) at 0x177f88bcca0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x1778242b070 " Querysome random text 0.785001144808529\n"\0
  CUR = 41
  LEN = 37784634
  COW_REFCNT = 2
SV = PV(0x177f89db120) at 0x177f88bcca0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x1778000d070 " Querysome random text 0.894431999356865\n"\0
  CUR = 41
  LEN = 37784593
  COW_REFCNT = 2
SV = PV(0x177f89db120) at 0x177f88bcca0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x1778242b070 " Querysome random text 0.412049736815259some random text 0.412049736815259some random text 0.412049736815259some random text 0.412049736815259some random text 0.412049736815259some random text 0.412049736815259some random text 0.412049736815259some random text 0.412049736815259some random text 0.412049736815259\n"\0
  CUR = 313
  LEN = 37784176
  COW_REFCNT = 2
SV = PV(0x177f89db120) at 0x177f88bcca0
  REFCNT = 1
  FLAGS = (POK,IsCOW,pPOK)
  PV = 0x1778000f070 " Querysome random text 0.809515115277865\n"\0
  CUR = 41
  LEN = 37783182
  COW_REFCNT = 2
0.004073 read lines from in-memory file and do RE (6 matches).

#!/usr/bin/env perl
use warnings;
use strict;

use Time::HiRes qw( time );
use Devel::Peek;

my $file = shift @ARGV;
my ($fh, $time);

# No file given on the command line: generate a test file.
if (!$file) {
    # should use Path::Tiny::tempfile
    $file = 'tempfile.txt';
    open my $ofh, '>', $file or die "Cannot open $file for writing, $!";
    srand(1234567);
    for my $i (0..200000) {
        my $string = 'some random text ' . rand();
        $string = $string x (1 + int (rand() * 10));
        if (rand() < 0.163) {
            $string = " Query${string}";
        }
        say {$ofh} $string;
    }
    $ofh->close or die "Cannot close $file, $!";
    printf "%s is size %i Mb\n", $file, (-s $file) / (1028**2);
}

# Pass 1: read lines from disk, count matches, Dump the first six " Query" lines.
open $fh, "<", $file;
$time = time;
my $match_count1;
my $i1 = 0;
while(<$fh>) {
    /^ ?Query/ && $match_count1 ++;
    if (/^ Query/) {
        Dump $_;
        $i1 ++;
        last if $i1 > 5;
    }
}
printf "%f read lines from disk and do RE ($match_count1 matches).\n", time - $time;

# Slurp the whole file into a scalar.
seek $fh, 0, 0;
my $s = "";
while(<$fh>) {
    $s .= $_;
}
$fh->close;
#Dump $s;
print "\n\n";

# Pass 2: same loop, reading from an in-memory filehandle opened on $s.
open $fh, "<", \$s;
$time = time;
my $match_count2;
my $i2 = 0;
while(<$fh>) {
    /^ ?Query/ && $match_count2++;
    if (/^ Query/) {
        Dump $_;
        $i2++;
        last if $i2 > 5;
    }
}
printf "%f read lines from in-memory file and do RE ($match_count2 matches).\n", time - $time;

In reply to Re^4: RE on lines read from in-memory scalar is very slow by swl
in thread RE on lines read from in-memory scalar is very slow by Danny
