in reply to regex or split
It should also be noted that it *really* depends upon the size of your data sets. Pike++'s regex solution is efficient for small sets like the one in the example, but with a string containing only a thousand newlines it quickly falls behind your split/map solution.
If you are using Perl 5.8.0, it may also be worth looking at PerlIO's :scalar layer. For larger data sets it is very efficient to simply use ye ol' file slurp trick, though due to the overhead of the necessary open() call it won't be the most efficient for smaller data sets.
#!/usr/bin/perl
use warnings;
use strict;
$|++;

use Benchmark qw( cmpthese );

my $str;

# short example string
$str = "
this is a
string example";

# longer string ( 1000 lines )
# my @chars = ( 'a' .. 'z', 'A' .. 'Z' );
# for ( 1..1000 ) {
#     $str .= $chars[ rand @chars ] for 0 .. rand @chars;
#     $str .= "\n";
# }

cmpthese( 5000, {
    perl_io => sub {
        open( my $fh, "<:scalar", \$str ) or die "$!\n";
        my @data = <$fh>;
    },
    split_map => sub {
        my @data = map { $_ .= "\n" } split( /\n/, $str );
    },
    regex_pike => sub {
        my @data = split /(?<=\n)/, $str;
    },
} );
Results for the short string:

                Rate    perl_io  split_map regex_pike
    perl_io    14085/s        --       -46%       -57%
    split_map  25907/s       84%         --       -20%
    regex_pike 32468/s      131%        25%         --
Results for the 1000-line string:

                 Rate regex_pike  split_map    perl_io
    regex_pike  79.4/s        --       -40%       -56%
    split_map    131/s       65%         --       -27%
    perl_io      181/s      128%        38%         --