in reply to Re: What is the most efficient way to split a long string (see body for details/constraints)?
in thread What is the most efficient way to split a long string (see body for details/constraints)?

I wonder why the file handle approach (fh) is slower here, when it was faster in this test:

Re: Is foreach split Optimized?

Benchmark from post linked above:

Test Code

    filehandle => sub {
        my @lines;
        open my $str_fh, "<", \$str or die "cannot open fh $!";
        while (<$str_fh>) {
            chomp;
            s/o/i/g;
            push @lines, $_;
        }
    },
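For context, here is a minimal sketch of how a case like this could be compared against a split-based one with the core Benchmark module. The test data and the body of the split case are assumptions made for illustration; they are not taken from the linked post.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data; the input used in the original benchmark is not shown.
my $str = join "\n", map "foo line $_", 1 .. 1000;

my %tests = (
    # Assumed shape of the "split" case: iterate over the list split returns.
    split => sub {
        my @lines;
        for my $line ( split /\n/, $str ) {
            $line =~ s/o/i/g;
            push @lines, $line;
        }
        return \@lines;
    },
    # Same logic as the filehandle case, using a lexical loop variable.
    filehandle => sub {
        my @lines;
        open my $str_fh, '<', \$str or die "cannot open fh $!";
        while ( my $line = <$str_fh> ) {
            chomp $line;
            $line =~ s/o/i/g;
            push @lines, $line;
        }
        return \@lines;
    },
);

# A negative count means "run each case for at least that many CPU seconds".
cmpthese( -1, \%tests );
```

Passing -10 instead of -1 gives each case at least 10 CPU seconds, which should smooth out some run-to-run noise.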

Results (Perl 5.26, {Windows,Linux,MacOS,???}?)

    perlbrew exec bench_script.pl
    # Other versions of Perl, omitted for brevity - see original link
    # for the "gories..."
    ...
    perl-5.26.0
    ==========
                 Rate      index      regex      split filehandle
    index      3.00/s         --       -25%       -49%       -53%
    regex      3.98/s        33%         --       -33%       -37%
    split      5.91/s        97%        49%         --        -6%
    filehandle 6.31/s       111%        59%         7%         --

Test using Perl 5.30 / 10 secs per test

Here is what I get using Perl 5.30 with 10 seconds per iteration in order to facilitate more accuracy (Linux Kubuntu-VM 5.1.10-050110-generic #201906151034 SMP Sat Jun 15 10:36:59 UTC 2019 x86_64 GNU/Linux):

Note #1: Keep in mind that this was run on a Linux VM on a Windows 10 host

Note #2: As you can see from the rates, it is a really Really REALLY fast host, where "really fast" implies "World Record Holder" of sorts fast

    perl-5.30.0
    ===========
                 Rate      regex      index      split filehandle
    regex      9.98/s         --       -11%       -32%       -33%
    index      11.2/s        12%         --       -23%       -25%
    split      14.6/s        46%        30%         --        -2%
    filehandle 14.9/s        49%        33%         2%         --

Note #3: I decided to run it on the Windows 10 host itself, but unfortunately I only have Perl 5.28.1 installed there, so it's not an apples-to-apples comparison with the above:

    perl-5.28.1 (Windows 10 Pro)
    ============================
                 Rate      regex      index filehandle      split
    regex      8.09/s         --       -14%       -17%       -33%
    index      9.46/s        17%         --        -3%       -22%
    filehandle 9.73/s        20%         3%         --       -20%
    split      12.2/s        50%        29%        25%         --

Note #4: Surprisingly, the Linux version in a VM ran faster than the native Windows version. I attribute this to either a difference between the Perl versions or a bad build on the Windows side

Note #5: Found the culprit on why it's slower on the Windows side (gcc was used instead of Visual C++):

    perl -V
    =======
    ==> cc='gcc'
        ccflags =' -s -O2 -DWIN32 -DWIN64 -DCONSERVATIVE -D__USE_MINGW_ANSI_STDIO -DPERL_TEXTMODE_SCRIPTS -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -fwrapv -fno-strict-aliasing -mms-bitfields'
        optimize='-s -O2'
        cppflags='-DWIN32'
        ccversion=''
        gccversion='7.1.0'
    ...
    Built under MSWin32
    Compiled at Dec 2 2018 14:30:03
    @INC:
        C:/Strawberry/perl/site/lib
        C:/Strawberry/perl/vendor/lib
        C:/Strawberry/perl/lib

Re^3: What is the most efficient way to split a long string (see body for details/constraints)?
by haukex (Archbishop) on Jun 21, 2019 at 22:31 UTC
    I wonder why the file handle approach (fh) is slower here, when it was faster in this test: Re: Is foreach split Optimized?

    If I had to wager a guess, it might be because in this thread, the benchmark is setting up a new filehandle on every line of input, while in the other thread, it's only a single filehandle.
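To illustrate the difference being guessed at, here is a minimal sketch of the two patterns. The record data and the sub names `per_record_fh` and `single_fh` are hypothetical and are not taken from either thread's benchmark.

```perl
use strict;
use warnings;

# Hypothetical input: three records, each holding several newline-separated fields.
my @records = ( "a\nb\nc", "d\ne\nf", "g\nh\ni" );

# Pattern A: open a fresh in-memory filehandle for every record,
# so the open/close overhead is paid once per record.
sub per_record_fh {
    my @out;
    for my $rec (@records) {
        open my $fh, '<', \$rec or die "cannot open fh: $!";
        while ( my $line = <$fh> ) {
            chomp $line;
            push @out, $line;
        }
        close $fh;
    }
    return \@out;
}

# Pattern B: open a single in-memory filehandle over all of the data,
# so the open/close overhead is paid only once.
sub single_fh {
    my @out;
    my $all = join "\n", @records;
    open my $fh, '<', \$all or die "cannot open fh: $!";
    while ( my $line = <$fh> ) {
        chomp $line;
        push @out, $line;
    }
    close $fh;
    return \@out;
}
```

Both subs return the same list of fields, so any timing difference between them would come from how often the filehandle is set up rather than from the line reading itself.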

    By the way, with regard to speeding things up by compiling them, this might be a case where a script written for Will_the_Chill's RPerl could give performance benefits.