in reply to Re^2: Way to grep binary scalar without unpacking
in thread Way to grep binary scalar without unpacking

I never said it was obvious. I never said I was guessing. I believe it's faster because I did some benchmarks to test this, but that was some time ago. I could redo them, but so can the OP, and he has the benefit of having representative data.

Re^4: Way to grep binary scalar without unpacking
by mwah (Hermit) on Oct 04, 2007 at 22:35 UTC
    I ran another test, this time including the wrapped memchr I
    mentioned in another reply. With array handling included,
    memchr came out about 3 times faster than index or the regex,
    even on my system, which is less sensitive to memory alignment
    than a Core2:
        1769416 in
                    Rate by_regex by_index by_memchr
        by_regex  51.2/s       --      -3%      -68%
        by_index  52.9/s       3%       --      -67%
        by_memchr  162/s     217%     207%        --
    The code used here was:
        use strict;
        use warnings;
        use Benchmark qw( cmpthese );

        my $fn = '/boot/vmlinux-2.6.18.8-96-default.gz';
        open my $fh, '<', $fn or die $!;
        read $fh, my $buffer, 2_000_000 or die $!;
        print length $buffer, " in\n";
        close $fh;

        my $subs = {
            by_index => sub {
                my ($p0, @offs) = (-1, ());
                push @offs, $p0 while ($p0 = index $buffer, "\xaa", $p0+1) != -1;
                push @offs, $p0 while ($p0 = index $buffer, "\xbb", $p0+1) != -1;
                push @offs, $p0 while ($p0 = index $buffer, "\xcc", $p0+1) != -1;
                return 0 + @offs
            },
            by_regex => sub {
                my @offs = ();
                push @offs, pos($buffer) while $buffer =~ /\xaa/g;
                push @offs, pos($buffer) while $buffer =~ /\xbb/g;
                push @offs, pos($buffer) while $buffer =~ /\xcc/g;
                return 0 + @offs
            },
            by_memchr => sub {
                my @offs = ();
                my_memchr( \@offs, $buffer, "\xaa" );
                my_memchr( \@offs, $buffer, "\xbb" );
                my_memchr( \@offs, $buffer, "\xcc" );
                return 0 + @offs
            }
        };

        cmpthese -3, $subs;

        use Inline C => qq[
        /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
        void my_memchr(SV* rvav, SV* sv, SV *ch)
        {
            STRLEN srclen;
            char byte = *SvPV(ch, PL_na) ;
            char *svc = SvPV(sv, srclen);
            char *p = svc, *end = svc + srclen;
            AV *av = (AV*)SvRV(rvav);
            // if(SvTYPE(SvRV(rvav)) == SVt_PVAV)
            while((p=memchr(p, (int)byte, end-p)) !=0 && p<end) {
                av_push(av, newSViv(p-svc));
                ++p;
            }
        }
        /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
        ];
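
    For anyone who wants to look at the scan loop without the Perl API
    around it, here is a minimal standalone sketch in plain C. It is not
    the code that was benchmarked: the name find_byte_offsets and the
    caller-supplied output array are assumptions made for illustration,
    standing in for the av_push calls in my_memchr.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the benchmark above): store up to
 * max_out offsets of `byte` within buf[0..len) into out[], and return
 * the total number of occurrences found. */
size_t find_byte_offsets(const char *buf, size_t len, unsigned char byte,
                         size_t *out, size_t max_out)
{
    assert(buf != NULL && (out != NULL || max_out == 0));

    const char *p = buf;
    const char *end = buf + len;
    size_t count = 0;

    /* memchr returns NULL once the byte no longer occurs in [p, end) */
    while (p < end && (p = memchr(p, byte, (size_t)(end - p))) != NULL) {
        if (count < max_out)
            out[count] = (size_t)(p - buf);  /* offset from buffer start */
        ++count;
        ++p;                                 /* resume just past the hit */
    }
    return count;
}
```

    Passing the byte as unsigned char sidesteps any sign-extension
    question for values such as 0xaa, and the return value keeps
    counting even when out[] is full, so a caller can size the array
    on a second pass.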
    Regards
    mwa
Re^4: Way to grep binary scalar without unpacking
by mwah (Hermit) on Oct 04, 2007 at 21:17 UTC
    ikegami: "I could redo them, but so can the OP, and he has the benefit of having representative data."

    I ran a short test on Linux 2.6.18 in a VM hosted on XP (Athlon/64 3400+).
    (I just searched for a few hex codes within the kernel image.)
        use strict;
        use warnings;
        use Benchmark qw( cmpthese );

        my $fn = '/boot/vmlinux-2.6.18.8-96-default.gz';
        open my $fh, '<', $fn or die $!;
        read $fh, my $buffer, 2_000_000 or die $!;
        print length $buffer, " in\n";
        close $fh;

        my $subs = {
            by_index => sub {
                my ($p0, @offs) = (-1, ());
                push @offs, $p0 while ($p0 = index $buffer, "\xaa", $p0+1) != -1;
                push @offs, $p0 while ($p0 = index $buffer, "\xbb", $p0+1) != -1;
                push @offs, $p0 while ($p0 = index $buffer, "\xcc", $p0+1) != -1;
                return 0 + @offs
            },
            by_regex => sub {
                my @offs = ();
                push @offs, pos($buffer) while $buffer =~ /\xaa/g;
                push @offs, pos($buffer) while $buffer =~ /\xbb/g;
                push @offs, pos($buffer) while $buffer =~ /\xcc/g;
                return 0 + @offs
            }
        };

        cmpthese( -3, $subs );
    The result was rather interesting (corrected, machine under no load):
        1769416 in
                   Rate by_regex by_index
        by_regex 51.0/s       --      -5%
        by_index 53.4/s       5%       --
    This was news to me. Thanks to all involved ;-)

    Regards
    mwa