in reply to Re^3: (german) region code detection - request for thoughts
in thread (german) region code detection - request for thoughts

krambambuli wrote:
should work rather very effectively.

I was unsure about that and so I benchmarked.

I used the full list of 5132 region codes. Have no fear! What follows is not 32K of region codes, just the roughly 9K regular expression that I use to generate both the region code list and the test data.
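Since the full pattern is hard to read, here is a minimal sketch of the two approaches being compared, on a toy pattern. The three codes (030, 0201, 02041), the variable $toy_rc and the sub names split_by_hash and split_by_regex are made up for illustration and are not the benchmarked code.

```perl
use strict;
use warnings;

# Toy stand-in for the big pattern: matches only 030, 0201 and 02041.
# (Assumption: a hand-picked subset, not the real list.)
my $toy_rc = qr/(0(?:30|20(?:1|41)))/;

# Enumerate candidate strings and keep each distinct matched prefix once.
my %seen;
my @codes = grep { !$seen{$_}++ }
            map  { /^$toy_rc/ ? $1 : () } '02000' .. '03999';

# Approach 1: hash lookup, trying the longest prefix first.
my %prefixes = map { $_ => 1 } @codes;

sub split_by_hash {
    my ($phone) = @_;
    for my $len (5, 4, 3, 2) {
        next if length $phone < $len;    # number too short for this prefix length
        my $prefix = substr $phone, 0, $len;
        return "$prefix " . substr($phone, $len) if exists $prefixes{$prefix};
    }
    return $phone;                       # no known region code
}

# Approach 2: let the regex find the prefix and insert the blank.
sub split_by_regex {
    my ($phone) = @_;
    $phone =~ s/^$toy_rc/$1 /;
    return $phone;
}

print split_by_hash('0204112345'),  "\n";
print split_by_regex('0204112345'), "\n";
```

Both calls print "02041 12345". The hash version does at most four lookups per number, while the regex version has to walk the alternation tree.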

#!/usr/bin/perl use strict; use warnings; use Benchmark qw(cmpthese); # All region codes as a big regular expression my $RC=qr/(0(?:2(?:0(?:[12389]|4[135]|5[123468]|6[456])|1(?:[14]|0[234 +]|29?|3[1237]|5[012346789]|6[123456]|7[1345]|8[123]|9[12356])|2(?:[18 +]|0[2345678]|2[2345678]|3[2345678]|4[12345678]|5[1234567]|6[123456789 +]|7[12345]|9[1234567])|3(?:[14]|0[123456789]|2[3457]|3[0123456789]|5[ +12345789]|6[0123456789]|7[12345789]|8[12345789]|9[12345])|4(?:1|0[123 +456789]|2[123456789]|3[123456]|4[013456789]|5[123456]|6[12345]|7[1234 +]|8[2456])|5(?:1|0[12456789]|2[0123456789]|3[234568]|4[1235678]|5[123 +45678]|6[12345678]|7[12345]|8[12345678]|9[0123456789])|6(?:1|0[123456 +78]|2[012345678]|3[0123456789]|4[1234567]|5[1234567]|6[123467]|7[1234 +5678]|8[0123456789]|9[1234567])|7(?:1|2[12345]|3[23456789]|4[123457]| +5[01234589]|6[1234]|7[0123456789])|8(?:1|0[1234]|2[12345678]|3[123456 +789]|4[12345]|5[012356789]|6[1234567]|7[1234])|9(?:1|0[2345]|2[123457 +8]|3[1234578]|4[1234578]|5[1234578]|6[1234]|7[123457]|8[12345]|9[1234 +]))|3(?:0|3(?:[15]|0(?:[123467]|5[13456]|8[023456789]|9[34])|2(?:[127 +89]|0[0123456789]|3[012345789])|3(?:[124578]|3[12345678]|6[123456789] +|9[345678])|4(?:[1246]|3[23456789]|5[124678]|7[023456789]|84)|6(?:[12 +46]|0[123456789]|3[12345678]|5[234567]|7[123456789])|7(?:[125789]|0[1 +2348]|3[1234]|4[12345678]|6[023456789])|8(?:[1256]|3[0123456789]|4[13 +45679]|7[02345678])|9(?:[145]|2[012345689]|3[123]|6[23456789]|7[01234 +56789]|8[123469]))|4(?:[015]|2(?:[135]|0[2345678]|2[1234]|4[1234]|6[1 +23]|9[123456789])|3(?:[1357]|2[124578]|4[12345678]|6[1234]|8[123456]) +|4(?:[13578]|2[23456]|4[13456]|6[1234567]|9[12345678])|6(?:[1246]|0[0 +12345679]|3[2356789]|5[1234689]|7[123]|9[12])|7(?:[1356]|2[12]|4[1235 +6]|7[1234569]|8[1235])|9(?:[1346]|0[1345679]|2[0123456789]|5[3456]|7[ +356789]))|5(?:[15]|0(?:[14]|2[012345678]|3[23]|5[2345678])|2(?:[12358 +9]|0[0123456789]|4[0123456789]|6[345678])|3(?:[1357]|2[2345679]|4[123 
+]|6[12345]|8[3456789])|4(?:[1246]|3[34569]|5[123456]|7[12345678])|6(? +:[1234]|0[0123456789]|9[12345678])|7(?:[13468]|2[2345678]|5[123456]|7 +[12345]|9[23567])|8(?:[13568]|2[02356789]|4[1234]|7[234567]|9[12345]) +|9(?:[1246]|3[0123456789]|5[12345]|7[1345]))|6(?:[15]|0(?:[1356]|2[01 +23456789]|4[123]|7[124567]|8[123457])|2(?:[123489]|0[0123456789]|5[23 +456789])|3(?:[12456]|3[012345678]|7[0123456789])|4(?:[1347]|2[1234567 +8]|5[0123489]|6[12345]|8[1234])|6(?:[13]|0[12345678]|2[1234568]|4[023 +456789]|5[123]|9[12345])|7(?:[12579]|0[12345]|3[0123456789]|4[1234]|6 +[1246]|8[12345])|8(?:[12356]|4[0123456789]|7[013458])|9(?:[135]|2[012 +3456789]|4[013456789]|6[123456789]))|7(?:[15]|2(?:[1234567]|0[0234678 +9]|9[12345678])|3(?:[1357]|2[0123456789]|4[12346789]|6[0123456789]|8[ +1234])|4(?:[145]|2[123]|3[0123456789]|6[234578])|6(?:[12345]|0[012345 +6789])|7(?:[1234]|5[24567]))|8(?:[15]|2(?:1|0[123456789]|2[0123456789 +]|3[1234]|9[234567])|3(?:[1468]|0[0123456789]|2[012345678]|3[1234]|5[ +123456]|7[0123456789]|9[123])|4(?:[1347]|2[23456789]|5[0123456789]|6[ +1246]|8[1234568])|6[01356789]|7(?:[1467]|2[0123456789]|3[1235678]|5[0 +123456789]|8[012345789]|9[123467])|8(?:[136]|2[12345678]|4[1234578]|5 +[012345689]|7[123456]))|9(?:[15]|0(?:[12479]|0[0123456789]|3[01234567 +89]|5[0123456789]|6[12]|8[0123456789])|2(?:[1358]|0[0123456789]|2[123 +456]|4[12345678]|6[2345678]|9[12345678])|3(?:[1357]|2[012345789]|4[12 +3456789]|6[123456]|8[2346789]|9[0123456789])|4(?:[134679]|0[012345678 +9]|2[12345678]|5[123456789]|8[12345789])|6(?:[123456789]|0[012345678] +)|7(?:[136]|2[1234678]|4[0123456789]|5[1234]|7[123456789])|8(?:[147]| +2[0123456789]|3[123]|5[123456789]|6[123]|8[123456789])|9(?:[1468]|2[1 +23456789]|3[1234]|5[12345679]|7[1235678]|9[123456789])))|4(?:0|1(?:0[ +123456789]|2[0123456789]|3[123456789]|4[01234689]|5[12345689]|6[12345 +6789]|7[123456789]|8[0123456789]|9[12345])|2(?:1|0[23456789]|2[1234]| +3[0123456789]|4[0123456789]|5[12345678]|6[0123456789]|7[1234567]|8[12 
+3456789]|9[2345678])|3(?:1|0[23578]|2[012346789]|3[0123456789]|4[0234 +6789]|5[12345678]|6[1234567]|7[12]|8[12345]|9[234])|4(?:1|0[123456789 +]|2[12356]|3[12345]|4[1234567]|5[1234568]|6[123456789]|7[12345789]|8[ +0123456789]|9[123456789])|5(?:1|0[12345689]|2[123456789]|3[12345679]| +4[1234567]|5[0123456789]|6[1234])|6(?:1|0[23456789]|2[1234567]|3[0123 +456789]|4[12346]|51|6[12345678]|7[1234]|8[1234])|7(?:1|0[2345678]|2[1 +2345]|3[1234567]|4[0123456789]|5[12345678]|6[123456789]|7[0123456789] +|9[123456])|8(?:1|0[23456]|2[123456789]|3[023456789]|4[123456789]|5[1 +23456789]|6[12345]|7[1234567]|8[12345]|9[23])|9(?:1|0[23]|2[012345678 +9]|3[12345689]|4[12345678]|5[0123456789]|6[12345678]|7[1234567]))|5(? +:0(?:2[12345678]|3[1234567]|4[12345]|5[123456]|6[023456789]|7[1234]|8 +[23456])|1(?:1|0[123589]|2[136789]|3[01256789]|4[123456789]|5[1234567 +89]|6[12345678]|7[1234567]|8[1234567]|9[0123456789])|2(?:1|0[12345678 +9]|2[1234568]|3[12345678]|4[1245678]|5[012345789]|6[123456]|7[1234567 +8]|8[123456]|9[2345])|3(?:1|0[0123456789]|2[0123456789]|3[12345679]|4 +[14567]|5[12345678]|6[12345678]|7[123456789]|8[1234])|4(?:1|0[1234567 +9]|2[123456789]|3[123456789]|4[12345678]|5[123456789]|6[1245678]|7[12 +3456]|8[12345]|9[12345])|5(?:1|0[23456789]|2[012345789]|3[123456]|4[1 +23456]|5[123456]|6[12345]|7[1234]|8[23456]|9[234])|6(?:1|0[123456789] +|2[123456]|3[123456]|4[12345678]|5[0123456789]|6[12345]|7[1234567]|8[ +123456]|9[123456])|7(?:1|0[234567]|2[123456]|3[1234]|4[123456]|5[1234 +5]|6[13456789]|7[1234567])|8(?:1|0[2345678]|2[0123456789]|3[123456789 +]|4[012345689]|5[012345789]|6[12345]|7[2345]|8[23])|9(?:1|0[123456789 +]|2[123456]|3[12345679]|4[12345678]|5[1234567]|6[123456]|7[135678]))| +6(?:9|0(?:0[23478]|2[012346789]|3[1234569]|4[123456789]|5[0123456789] +|6[12368]|7[1348]|8[1234567]|9[23456])|1(?:1|0[123456789]|2[02346789] +|3[012345689]|4[24567]|5[01245789]|6[1234567]|7[12345]|8[12345678]|9[ +02568])|2(?:1|0[12345679]|2[012346789]|3[123456789]|4[12345679]|5[123 
+45678]|6[123456789]|7[12456]|8[1234567]|9[12345678])|3(?:1|0[12345678 +]|2[123456789]|3[123456789]|4[0123456789]|5[12356789]|6[1234]|7[12345 +]|8[1234567]|9[12345678])|4(?:1|0[0123456789]|2[0123456789]|3[0123456 +89]|4[012345679]|5[12345678]|6[1245678]|7[123456789]|8[23456])|5(?:1| +0[0123456789]|2[234567]|3[123456]|4[12345]|5[0123456789]|6[123456789] +|7[123458]|8[0123456789]|9[12345679])|6(?:1|2[0123456789]|3[013456789 +]|4[12345678]|5[0123456789]|6[013456789]|7[02345678]|8[1234]|9[123456 +78])|7(?:1|0[1346789]|2[12345678]|3[1234567]|4[1234567]|5[12345678]|6 +[123456]|7[123456]|8[123456789])|8(?:1|0[234569]|2[14567]|3[12345678] +|4[123489]|5[12345678]|6[1456789]|7[123456]|8[178]|9[3478]))|7(?:0(?: +2[123456]|3[1234]|4[123456]|5[123456]|6[236]|7[123]|8[12345])|1(?:1|2 +[123456789]|3[012345689]|4[12345678]|5[012346789]|6[123456]|7[123456] +|8[1234]|9[12345])|2(?:1|0[234]|2[0123456789]|3[1234567]|4[023456789] +|5[0123456789]|6[0123456789]|7[1234567])|3(?:1|0[023456789]|2[1234567 +89]|3[1234567]|4[0345678]|5[12345678]|6[1234567]|7[13456]|8[123456789 +]|9[12345])|4(?:1|0[234]|2[023456789]|3[123456]|4[0123456789]|5[12345 +6789]|6[1234567]|7[12345678]|8[23456])|5(?:1|0[23456]|2[0245789]|3[12 +34]|4[123456]|5[12345678]|6[123456789]|7[0123456789]|8[1234567])|6(?: +1|02|2[0123456789]|3[123456]|4[123456]|5[1234567]|6[0123456789]|7[123 +456]|8[12345])|7(?:1|0[23456789]|2[0123456789]|3[12345689]|4[12345678 +]|5[1345]|6[12345]|7[13457])|8(?:1|0[2345678]|2[123456]|3[123456789]| +4[1234]|5[1234])|9(?:1|0[34567]|3[0123456789]|4[0123456789]|5[0123457 +89]|6[1234567]|7[1234567]))|8(?:9|0(?:2[0123456789]|3[12345689]|4[123 +56]|5[1234567]|6[1234567]|7[123456]|8[123456]|9[12345])|1(?:1|0[2456] +|2[1234]|3[13456789]|4[123456]|5[12378]|6[15678]|7[016789]|9[123456]) +|2(?:1|0[2345678]|2[123456]|3[012346789]|4[1356789]|5[01234789]|6[123 +56789]|7[12346]|8[12345]|9[123456])|3(?:1|0[2346]|2[0123456789]|3[012 +345678]|4[0123456789]|6[123456789]|7[023456789]|8[0123456789]|9[2345] 
+)|4(?:1|0[234567]|2[123467]|3[12345]|4[123456]|5[02346789]|6[01234567 +89])|5(?:1|0[12345679]|3[12345678]|4[123456789]|5[012345678]|6[12345] +|7[1234]|8[123456]|9[123])|6(?:1|2[123489]|3[013456789]|4[0129]|5[012 +467]|6[12345679]|7[01789]|8[1234567])|7(?:1|0[23456789]|2[12345678]|3 +[12345]|4[12345]|5[12346]|6[12456]|7[1234]|8[12345])|8(?:1|0[12356789 +]|2[12345]|4[1567]|5[1678]|6[012789]))|9(?:0(?:6|7[012345678]|8[01234 +56789]|9[0123479])|1(?:1|0[1234567]|2[0236789]|3[12345]|4[123456789]| +5[12345678]|6[1234567]|7[0123456789]|8[0123456789]|9[0123456789])|2(? +:1|0[123456789]|2[01235789]|3[1234568]|4[123456]|5[1234567]|6[0123456 +789]|7[0123456789]|8[0123456789]|9[2345])|3(?:1|0[23567]|2[13456]|3[1 +23456789]|4[0123456789]|5[0123456789]|6[0345679]|7[12345678]|8[123456 +]|9[12345678])|4(?:1|0[123456789]|2[012346789]|3[1345689]|4[12345678] +|5[1234]|6[123456789]|7[1234]|8[0124]|9[1235789])|5(?:1|0[2345]|2[123 +456789]|3[123456]|4[23456789]|5[123456]|6[0123456789]|7[123456])|6(?: +1|0[2345678]|2[1245678]|3[123456789]|4[12345678]|5[123456789]|6[12345 +6]|7[1234567]|8[123])|7(?:1|0[148]|2[0123456789]|3[2345678]|4[1245678 +9]|6[123456]|7[123456789])|8(?:1|0[2345]|2[023456789]|3[1234567]|4[12 +345678]|5[1234567]|6[15789]|7[123456])|9(?:1|0[1345678]|2[0123456789] +|3[1235678]|4[12345678]|5[123456]|6[123456]|7[12345678]))))/; # Generate all region codes from that. # So that we have some data to work with. 
my @region_code;
foreach ("02000".."99999") {
    push (@region_code, $1) if /^$RC/o;
}

# create a region code hash for krambambuli's algorithm
my %prefixes;
@prefixes{@region_code}= ();

# make the testdata a bit longer
$_.= "xxxxxx" foreach @region_code;

cmpthese (1000, {
    krambambuli => sub { krambambuli($_, \%prefixes) foreach @region_code; },
    Skeeve => sub { skeeve($_, \$RC) foreach @region_code; },
});

sub krambambuli {
    my ($phone, $prefixes)= @_;

    my ($pref5, $pref4, $pref3, $pref2) = map { substr( $phone, 0, $_ ) } (5, 4, 3, 2);

    my $prefix_length =   exists $prefixes->{$pref5} ? 5
                        : exists $prefixes->{$pref4} ? 4
                        : exists $prefixes->{$pref3} ? 3
                        : exists $prefixes->{$pref2} ? 2
                        : 0
                        ;
    substr($phone, $prefix_length,0)= ' '; # just a slight optimisation (I guess)
    return $phone;
}

sub skeeve {
    my ($phone, $reref)= @_;
    $phone=~ s/^$$reref/$1 /;
    return $phone;
}

This is the result:
              Rate      Skeeve krambambuli
Skeeve      10.7/s          --        -30%
krambambuli 15.4/s         43%          --

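At first I read this as "regular expressions are very efficient, my code is 30% faster". ;-) It isn't. The rates say that krambambuli's hash lookup runs 15.4 times per second against 10.7 for my regex, so my code is the one that is 30% slower. krambambuli is right.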
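For anyone else tripped up by the cmpthese table: each percentage compares the rate of the row against the rate of the column, so a negative number means the row is slower. A minimal sketch of that arithmetic, using the rounded rates printed above (the helper name relative_speed is mine, not part of Benchmark):

```perl
use strict;
use warnings;

# Rates as printed in the table (iterations per second, already rounded).
my %rate = (Skeeve => 10.7, krambambuli => 15.4);

# Percentage by which $row is faster (+) or slower (-) than $col,
# rounded to a whole percent.
sub relative_speed {
    my ($row, $col) = @_;
    return sprintf '%.0f', 100 * ($rate{$row} / $rate{$col} - 1);
}

printf "Skeeve vs krambambuli: %s%%\n", relative_speed('Skeeve', 'krambambuli');
printf "krambambuli vs Skeeve: %s%%\n", relative_speed('krambambuli', 'Skeeve');
```

With the rounded rates this gives about -31% and 44%, within a point of the -30% and 43% in the table, which was presumably computed from unrounded rates. Either way the sign is what matters: the Skeeve row is negative, so the regex is the slower one.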


s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
+.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e

Re^5: (german) region code detection - request for thoughts
by Krambambuli (Curate) on Aug 20, 2008 at 17:16 UTC
    This is the result:
                  Rate      Skeeve krambambuli
    Skeeve      10.7/s          --        -30%
    krambambuli 15.4/s         43%          --
    Read again... ;)

    The results say the opposite: your code runs about 10 times per second, mine about 15 times.

    Krambambuli
    ---