Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to benchmark File::Slurp against other ways of reading a file in Perl.

I get the error "Substitution loop at c:/strawberry/perl/vendor/lib/File/Slurp.pm line 254".

The following code works for files up to 600MB, but above that size the error occurs in step 3.

#!/usr/bin/perl
use strict;
use File::Slurp qw( :all);
use Time::HiRes qw(gettimeofday tv_interval);

# get arguments for process
my $tmv0 = [gettimeofday];
my $fl_in = $ARGV[0];

# reading records... step 1
open my $in_fh, '<', $fl_in or die $!;
my $s1=<$in_fh>;
print "$s1 from $fl_in \n";
my $numin=0;
while (<$in_fh>) { $numin++; }
close $in_fh;
my $tmv1 = tv_interval($tmv0,[gettimeofday]);
print " Test Step 1 Found $numin records at $tmv1\n";

###
### now using whole string... step 2
###
open $in_fh, '<', $fl_in or die $!;
my $text_file = do { local $/; <$in_fh> };
close $in_fh;
my $tmv2 = tv_interval($tmv0,[gettimeofday])-$tmv1;
my $lenstr=length($text_file);
print " Test Step 2 Found $lenstr bytes in $tmv2 seconds\n";

##
## Now slurp... step 3
##
$numin=0;
$text_file = read_file($fl_in);
$lenstr=length($text_file);
my $tmv3 = tv_interval($tmv0,[gettimeofday])-$tmv1-$tmv2;
print " Test Step 3 Found $lenstr bytes in $tmv3 seconds\n";

Re: SLURP Error
by marto (Cardinal) on Oct 10, 2014 at 12:54 UTC
Re: SLURP Error
by McA (Priest) on Oct 10, 2014 at 12:12 UTC

    Hi,

    when you look at the source of File::Slurp you see the following line:

    ${$buf_ref} =~ s/\015\012/\n/g if $is_win32 && !$opts->{'binmode'};

    This is done after reading the whole file, which in your case is very big. So you can do a little test: run the same substitution on the variable $text_file and see whether some memory or implementation limit is hit.
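
    A minimal sketch of such a test, assuming a made-up in-memory string in place of the real file (the record text and the $records count are hypothetical; raise $records to approach the size at which the error appears):

    ```perl
    use strict;
    use warnings;
    use Time::HiRes qw(gettimeofday tv_interval);

    # Hypothetical stand-in for the slurped file: CRLF-terminated records.
    my $records   = 1_000;
    my $text_file = "some record data\015\012" x $records;
    my $before    = length $text_file;

    my $t0 = [gettimeofday];
    # Same substitution that File::Slurp applies on Windows when 'binmode' is not set.
    # s///g returns the number of replacements, or false if there were none.
    my $replaced = ($text_file =~ s/\015\012/\n/g) || 0;
    my $elapsed  = tv_interval($t0);

    printf "Replaced %d CRLF pairs, %d -> %d bytes in %.3f seconds\n",
        $replaced, $before, length($text_file), $elapsed;
    ```

    If the substitution alone fails on a string of the problem size, the limit is probably in the regex engine rather than in the file reading itself.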

    You can also test the other way around: set the option 'binmode' to a true value so that the substitution is skipped.
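
    As a rough sketch, assuming File::Slurp is installed and using a small temporary file as a hypothetical stand-in for the real input, passing binmode => ':raw' to read_file makes the condition in the quoted line false:

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Hypothetical sample file with CRLF line endings, written in binary mode.
    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode $fh;
    print {$fh} "line one\015\012line two\015\012";
    close $fh or die "close failed: $!";

    my $size = -s $path;
    my $text;
    if (eval { require File::Slurp; 1 }) {
        # A true 'binmode' value means the CRLF substitution is not run.
        $text = File::Slurp::read_file($path, binmode => ':raw');
        printf "Read %d bytes (file size %d)\n", length($text), $size;
    }
    else {
        print "File::Slurp is not installed; skipping\n";
    }
    ```

    Note that with 'binmode' set the carriage returns stay in the data, so any later line handling has to cope with \015\012 itself.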

    How many lines do you have in the text file?

    UPDATE: Have a look at this thread: Substitution Loop.

    Regards
    McA

      Thanks, McA.

      The typical file I want to handle is about 2.8GB with 100 million records.

      I am using File::Slurp 9999.19. I will look at using binmode as an alternative.

Re: SLURP Error
by toolic (Bishop) on Oct 10, 2014 at 12:13 UTC
    • What version of File::Slurp are you using?
    • Did you try updating to the latest version on CPAN (9999.19)?
    • What is line 254?