in reply to RE on lines read from in-memory scalar is very slow

I ran the code on a file that contains my e-mail Trash folder from 2018. It is 998_621_862 bytes and 14_484_494 lines long.

2.474057 read lines from disk and do RE.
2.446446 read lines from in-memory file and do RE.

The numbers are quite similar for both 5.26.1 and 5.39.6 (on Linux).

How large is your machine's memory?

map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

Re^2: RE on lines read from in-memory scalar is very slow
by Danny (Chaplain) on Jan 22, 2024 at 22:22 UTC
    32 gigs of RAM. The file on disk is only 34M. My perl is 5.36.3 on cygwin.

    I just ran it on another system (Linux) on the same file and got:
    0.457707 read lines from disk and do RE.
    0.675398 read lines from in-memory file and do RE.
    So it seems to be specific to my system. It doesn't seem to be a memory problem, as the program only uses around 34M of RAM when running on that file and my memory is nowhere near exhausted.

      > My perl is 5.36.3 on cygwin ... I just ran it on another system (Linux) on the same file ...
      > and got 0.675398 read lines from in-memory file and do RE (vs 48.796606 on cygwin)

      Wow, that is such a massive difference that I suspect something's seriously wrong with your cygwin perl v5.36.3 environment, so I suggest you run your test program on cygwin with various tools to try to understand why on earth it's taking so long. Some tools you might try:

      Other suggestions welcome. I suspect a system-level tool (such as VTune) is the best bet in this case; hopefully it will reveal something glaringly wrong (as it did for me here).

      For future reference, posting a test program using Benchmark, that anyone can run with zero effort, is the ideal way to get help on these sorts of performance problems (see "Fastest way to lookup a point in a set" (update: and "Re: Confused by RegEx count" by choroba) for examples).
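
As a rough illustration of the kind of self-contained Benchmark program suggested above, here is a minimal sketch. It is not the original poster's code: the generated test data, the pattern in `$re`, and the sub names `count_matches`, `from_disk` and `from_memory` are all hypothetical stand-ins, so swap in your own file and regex.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

# Hypothetical test data: 200_000 short lines written to a temporary file.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "line $_ of some sample text\n" for 1 .. 200_000;
close $tmp_fh or die "close $path: $!";

# Slurp the same file once, so the in-memory variant times only the reads.
my $data = do {
    open my $in, '<', $path or die "open $path: $!";
    local $/;
    <$in>;
};

# Placeholder pattern (an assumption); substitute the regex from your program.
my $re = qr/(\d+) of/;

# Read every line from a filehandle and count the lines that match.
sub count_matches {
    my ($fh) = @_;
    my $count = 0;
    while (my $line = <$fh>) {
        $count++ if $line =~ $re;
    }
    return $count;
}

# Variant 1: read lines from the file on disk.
sub from_disk {
    open my $fh, '<', $path or die "open $path: $!";
    my $n = count_matches($fh);
    close $fh;
    return $n;
}

# Variant 2: read lines through an in-memory filehandle on the scalar.
sub from_memory {
    open my $fh, '<', \$data or die "open in-memory: $!";
    my $n = count_matches($fh);
    close $fh;
    return $n;
}

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese(-1, { disk => \&from_disk, memory => \&from_memory });
```

Running the same script under both the cygwin perl and the Linux perl should show whether the slowdown is tied to the in-memory filehandle itself rather than to the particular input file.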

      See also: Code Optimization and Performance References (and Memory Tools References)
