Re: Faster Luhn Check Digit Calculation?

I agree Inline::C will be fastest, but lookup tables can help a pure-Perl implementation a lot.

LUHN.pm:

package LUHN;

use warnings;
use strict;

use List::Util qw(sum);

use Data::Dump qw(dd);

sub cd_kschwab {  # pm#1226545
  use integer;
  my $total = 0;
  my $flip = 1;
  foreach my $c (reverse split //, shift) {
    $c *= 2 unless $flip = !$flip;
    while ($c) {
      $total += $c % 10;
      $c = $c / 10;
    }
  }
  return (10 - $total % 10) % 10;
}

my @x2_cast_out_9 = (
  # 0  1  2  3  4  5  6  7  8  9  # input number n
    0, 2, 4, 6, 8, 1, 3, 5, 7, 9  # output: n * 2, cast out 9
    );

my @check_digit = map { (10 - $_ % 10) % 10 } 0 .. 9 * 15;

sub cd_AnomalousMonk {

    my $ccn = shift;

    my $total = 0;

    for (1, 3, 5, 7, 9, 11, 13) {
        $total += substr($ccn, $_, 1);
        }

    for (0, 2, 4, 6, 8, 10, 12, 14) {
        $total += $x2_cast_out_9[ substr($ccn, $_, 1) ];
        }

    return $check_digit[ $total ];

    }

sub cd_BrowserUk {  # pm#1226582
    use integer;
    my $s = $_[ 0 ];
    my $total = 0;
    for my $i ( 0 .. 14 ) {
        my $d = substr( $s, $i, 1 );
        unless( $i & 1 ) {
            $d *= 2;
            $d -= 9 if $d > 9;
        }
        $total += $d;
    }
    $total *= 9;
    return chop $total;
}

1;
[download]

test_LUHN_1.pl:

# test_LUHN_1.pl                                            01dec18waw

use warnings;
use strict;

use Benchmark qw(cmpthese);

use LUHN;

for ('000' .. '999') {
    my $ccn = '401135000000' . $_;
    my $kschwab_cd       = LUHN::cd_kschwab       ($ccn);
    my $AnomalousMonk_cd = LUHN::cd_AnomalousMonk ($ccn);
    die "$ccn: kschwab $kschwab_cd != $AnomalousMonk_cd"
        if $kschwab_cd != $AnomalousMonk_cd;
    }
print "tested cc number range kschwab vs. AnomalousMonk ok \n";

for ('000' .. '999') {
    my $ccn = '401135000000' . $_;
    my $kschwab_cd   = LUHN::cd_kschwab   ($ccn);
    my $BrowserUk_cd = LUHN::cd_BrowserUk ($ccn);
    die "$ccn: kschwab $kschwab_cd != $BrowserUk_cd"
        if $kschwab_cd != $BrowserUk_cd;
    }
print "tested cc number range kschwab vs. BrowserUk ok \n";

cmpthese (-1, {
  cd_kschwab   => sub { LUHN::cd_kschwab       ('401135000000000') },
  cd_AM        => sub { LUHN::cd_AnomalousMonk ('401135000000000') },
  cd_BrowserUk => sub { LUHN::cd_BrowserUk     ('401135000000000') },
  });
[download]

Output:

c:\@Work\Perl\monks\kschwab>perl test_LUHN_1.pl
tested cc number range kschwab vs. AnomalousMonk ok
tested cc number range kschwab vs. BrowserUk ok
                 Rate   cd_kschwab cd_BrowserUk        cd_AM
cd_kschwab    52764/s           --         -40%         -62%
cd_BrowserUk  88033/s          67%           --         -37%
cd_AM        139999/s         165%          59%           --
[download]

Update: My solution and some of the others are hard-wired for generating a checksum for a 15-digit string. Here's a generalized variation. Caution: this is not well tested. Also, I don't know that I like the @check_digit look-up array all that much; it might chew up a lot of space in certain circumstances. BrowserUk's $total *= 9; return chop $total; eats less space and is only very slightly slower than the lookup.

sub cd_AM_HOP {  # iterator solution

    my ($n_digits,  # n of digits for which to generate luhn checksum
        ) = @_;

    my $order = $n_digits & 1;
    my @straight = grep +($_ & $order), 0 .. $n_digits-1;  # dd '---',
+ \@straight;
    my @adjusted = grep !($_ & $order), 0 .. $n_digits-1;  # dd '===',
+ \@adjusted;

    my @check_digit = map { (10 - $_ % 10) % 10 } 0 .. 9 * $n_digits;

    return sub {

        my $ccn = shift;  # digits for which to generate checksum

        my $total = 0;

        for (@straight) {
            $total += substr($ccn, $_, 1);
            }

        for (@adjusted) {
            $total += $x2_cast_out_9[ substr($ccn, $_, 1) ];
            }

      # return $check_digit[ $total ];  # return check digit
        $total *= 9;
        return chop $total;  # return check digit

        };  # end sub iterator

    }
[download]

Give a man a fish: <%-{-{-{-<

Comment on Re: Faster Luhn Check Digit Calculation? Select or Download Code

Replies are listed 'Best First'.
Re^2: Faster Luhn Check Digit Calculation? by kschwab (Vicar) on Dec 01, 2018 at 16:20 UTC
Converted a hardcoded version of yours to Inline::C, and it's pretty much the same performance as BrowserUK's Inline::C: $ perl ./script Benchmark: timing 500 iterations of Inline::C (AnomalousMonk), Inline::C (BrowserUK)... Inline::C (AnomalousMonk): 6 wallclock secs ( 6.21 usr + 0.00 sys = 6.21 CPU) @ 80.52/s (n=500) Inline::C (BrowserUK): 6 wallclock secs ( 6.14 usr + 0.00 sys = 6.14 CPU) @ 81.43/s (n=500) Appreciate it! Conversion below: `int cd4( char ccn ) { int total = 0; int i; int co9[]={0,2,4,6,8,1,3,5,7,9}; int straight[]={1,3,5,7,9,11,13}; int adjusted[]={0,2,4,6,8,10,12,14}; for (i=0;i<7;i++) { total += ccn[straight[i]]-48; } for (i=0;i<8;i++) { total += co9[ccn[adjusted[i]]-48]; } total =9; return total %10; }` [download]	[reply] [d/l]
Re^3: Faster Luhn Check Digit Calculation? by AnomalousMonk (Archbishop) on Dec 02, 2018 at 02:25 UTC
Here's another twist that might squeeze out a few more computrons: use array literals (if that's the correct terminology; introduced with C99 IIRC; I assume Inline::C supports C99) to compile the arrays rather than building local arrays on each subroutine call (if you don't want to just declare the arrays ~~`static`~~\| `static const`, which also ~~works~~ \| should work). I haven't looked at the assembler output, but the idea (the hope?) is that the compiler would make these statics. There are two versions of the function below; both work. One is more heavily macro-ized because I wanted to see how far I could push this approach; for reasons of maintainability, I'm not sure I'd want it in production code. Only minimal testing has been done; I leave that to you :). `t_array_literal_1.c`: Read more... (3 kB) Give a man a fish: `<%-{-{-{-<`	[reply] [d/l] [select]
Re^2: Faster Luhn Check Digit Calculation? by LanX (Saint) on Dec 01, 2018 at 21:40 UTC
Lookup tabled were my first idea too ... but why not a string of digits together at the same time? Am I missing something? 6 digits are 1 million entries, probably using substr is even faster than array lookup, for sure it's demanding less memory. Cheers Rolf _{(addicted to the Perl Programming Language :) Wikisyntax for the Monastery FootballPerl is like chess, only without the dice}	[reply]
Re^3: Faster Luhn Check Digit Calculation? by AnomalousMonk (Archbishop) on Dec 02, 2018 at 01:36 UTC
... why not a string of digits together at the same time? If I correctly understand what you're saying, something like `my $check_digits = join '', map { chr((10 - $_ % 10) % 10) } 0 .. 9 * $n_digits;` `...` `return substr $check_digits $total, 1;` (returning the check digit as a character) certainly would be a more memory-conservative approach than using an array of, e.g., 136 numbers. However, my suspicion is that when you get through with the substr call, you've burned about as much time as with the just-as-simple `$total = 9; return chop $total;` calls. 6 digits are 1 million entries ...* I don't understand this point: where do the 1 million entries come from? Would you be building a lookup string from every possible 6-digit combination? The OPed question posited 15 digits; that's a really long string. Give a man a fish: `<%-{-{-{-<`	[reply] [d/l] [select]
Re^4: Faster Luhn Check Digit Calculation? by LanX (Saint) on Dec 02, 2018 at 02:01 UTC
To be clearer, I suppose that `L("XXYY") = ( L("XX") + L("YY") ) % 10` for all even length substrings XX and YY => a divide and conquer could be applied => computation complexity replaced with memory complexity. > The OPed question posited 15 digits; that's a really long string. One would do 3 lookups, leading zeros of the left most chunk shouldn't influence the checksum. `L("XX") = L("0XX") = L("00XX") ) = ...` > where do the 1 million entries come from? 6 digits sound like a good compromise, you'd need even amount and 100 Meg for 8 digits sound exaggerated. (A Perl array has an factor 10 overhead IIRC) Anyway a C algorithm which fits into the line cache of the processor might still be faster. Not sure if a 10k string for a 4 digit partition would always fit into the L1 cache. update the best approach might be a C program which dynamically benchmarks different chunk sizes to test the effect of line caches before building the lookup table. Cheers Rolf _{(addicted to the Perl Programming Language :) Wikisyntax for the Monastery FootballPerl is like chess, only without the dice}	[reply] [d/l] [select]
Re^5: Faster Luhn Check Digit Calculation? by AnomalousMonk (Archbishop) on Dec 02, 2018 at 02:47 UTC
Re^6: Faster Luhn Check Digit Calculation? by LanX (Saint) on Dec 02, 2018 at 02:59 UTC
Re^6: Faster Luhn Check Digit Calculation? by LanX (Saint) on Dec 02, 2018 at 03:09 UTC

update