Re: opposite of index+rindex? How-to? Needed?

You can use tr///c to replace every character other than space and tab with a marker character, and then use index to search for that marker.

#!/usr/bin/perl
use warnings;
use strict;

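# Position of the first character that is neither space nor tab, via a regex.
sub regex_pos {
    my ($string) = @_;
    $string =~ /[^ \t]/g or return -1;  # no match: return -1, as index does
    return pos($string) - 1;            # pos() is just past the matched character
}

# Same result via tr///c: map every character other than space and tab
# to '!', then let index find the first '!'.
sub tr_pos {
    my ($string) = @_;
    (my $tr = $string) =~ tr/ \t/!!/c;
    return index $tr, '!';
}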

for my $s ("  \t x  ", "\t\t\t\t\t\t                  \xff...\t ") {
    regex_pos($s) == tr_pos($s) or die;
}

use Benchmark qw{ cmpthese };

my $s = ' ' x 200 . "\t" x 200 . "\x01" . " " x 200;
cmpthese(-3, {
    regex => sub { regex_pos($s) },
    tr    => sub { tr_pos($s)    },
});

On my machine, the tr version is about 3 times faster than the regex in 5.26.1:

          Rate regex    tr
regex 316530/s    --  -67%
tr    954355/s  202%    --

but only marginally faster in blead perl:

          Rate regex    tr
regex 896519/s    --   -6%
tr    954359/s    6%    --
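
Since the thread also asks about the rindex direction, here is a minimal sketch of the same trick run from the right. The names tr_rpos and regex_rpos are hypothetical, not from the code above, and the regex version is included only as a cross-check. It is untimed, so no speed claim is made for it.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: position of the last character that is neither
# space nor tab, or -1 if the string has no such character.
sub tr_rpos {
    my ($string) = @_;
    # /c complements the search list, so every character other than
    # space and tab is mapped to '!'.
    (my $tr = $string) =~ tr/ \t/!/c;
    return rindex $tr, '!';
}

# Regex counterpart, used here only to cross-check tr_rpos.
sub regex_rpos {
    my ($string) = @_;
    # $-[0] is the offset at which the last successful match started.
    return $string =~ /[^ \t][ \t]*\z/ ? $-[0] : -1;
}

for my $s ("  \t x  ", "abc", " \t ", "") {
    tr_rpos($s) == regex_rpos($s) or die "mismatch for '$s'";
}

print tr_rpos("  \t x  "), "\n";   # prints 4
```

As with tr_pos, the string is copied before tr/// runs, so the caller's string is left untouched.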

map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

Re^2: opposite of index+rindex? How-to? Needed?
by dave_the_m (Monsignor) on Aug 30, 2019 at 07:43 UTC

but only marginally faster in blead perl

From 5.30.0 perldelta:

Regular expression pattern matching of things like C<qr/[^I<a>]/> is
significantly sped up, where I<a> is any ASCII character.  Other classes
can get this speed up, but which ones is complicated and depends on the
underlying bit patterns of those characters, so differs between ASCII
and EBCDIC platforms, but all case pairs, like C<qr/[Gg]/> are included,
as is C<[^01]>.

Dave.
