Re^3: (german) region code detection

Why would you refuse a hash?

There can't be so many regions that you couldn't easily keep them in memory, in a simple hash like

my %prefixes = ( '04025' => [ 'Region1', 'Region2', 'Region3' ],
                 ...
               );

And afterwards, a straightforward check like

my ($pref5, $pref4, $pref3, $pref2) = map { substr( $phone, 0, $_ ) }
                                          (5, 4, 3, 2);

my $prefix_length = exists $prefixes{$pref5} ? 5
                  : exists $prefixes{$pref4} ? 4
                  : exists $prefixes{$pref3} ? 3
                  : exists $prefixes{$pref2} ? 2
                  :                            0
                  ;

my $formatted_phone = join( ' ',
                            substr( $phone, 0, $prefix_length),
                            substr( $phone, $prefix_length),
                      );

should work rather very effectively. If you have already considered this, why do you think it would be expensive, ineffective, or inadequate?
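Put together, the hash lookup above could be sketched as a small self-contained script. The `split_phone` name and the two extra sample entries (`030`, `089`) are assumptions added for illustration; the real table would hold the full list of region codes.

```perl
use strict;
use warnings;

# Hypothetical sample data: the real table would hold all region codes.
my %prefixes = (
    '030'   => ['Berlin'],
    '089'   => ['Muenchen'],
    '04025' => [ 'Region1', 'Region2', 'Region3' ],
);

# Split a phone number into region code and remainder, trying the
# longest candidate prefix first. Returns an empty list if no known
# prefix matches.
sub split_phone {
    my ($phone) = @_;
    for my $len ( 5, 4, 3, 2 ) {
        next if length($phone) <= $len;    # need digits after the prefix
        my $prefix = substr( $phone, 0, $len );
        return ( $prefix, substr( $phone, $len ) )
            if exists $prefixes{$prefix};
    }
    return;
}

print join( ' ', split_phone('0402512345') ), "\n";    # 04025 12345
print join( ' ', split_phone('0301234567') ), "\n";    # 030 1234567
```

Returning an empty list for an unknown prefix avoids the leading blank that `join` would otherwise produce when the prefix length falls through to 0.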

Krambambuli
---


Replies are listed 'Best First'.
Re^4: (german) region code detection - request for thoughts by Skeeve (Parson) on Aug 20, 2008 at 11:13 UTC
I like one regexp match more than several substring comparisons. And I didn't want a "huge" array in my code. Just one "simple" regex. My module for matching region codes and international country codes is 9K, while the region codes alone are 32K. 32K is not a huge size nowadays, but I date back to the age of the C64 ;-)

Your solution is quite clever but would need some enhancements to provide for

- Different minimal lengths of region codes
- Different maximum lengths of region codes
- It should find those values on its own

`s$$([},&%#}/&/]+}%&{});#$&&s&&$^X.($'^"%]=\&(\|?{%` `+`.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e
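For comparison, the single-regex idea can be sketched as follows. This is not Skeeve's actual 9K expression, only a minimal illustration that builds an alternation from a hypothetical code list; `@codes`, `$region_re`, and `split_phone_re` are names made up for the example.

```perl
use strict;
use warnings;

# Hypothetical sample codes; a real list would have thousands of entries.
my @codes = qw( 030 089 04025 0402 );

# Longest codes first: Perl tries alternatives left to right, so this
# ordering makes the longest matching code win.
my $alternation = join '|',
    map  { quotemeta $_ }
    sort { length($b) <=> length($a) or $a cmp $b } @codes;

# Region code, then at least one further digit as the subscriber number.
my $region_re = qr/\A($alternation)(\d+)\z/;

# Returns (region code, remainder), or an empty list if nothing matches.
sub split_phone_re {
    my ($phone) = @_;
    return $phone =~ $region_re ? ( $1, $2 ) : ();
}

print join( ' ', split_phone_re('0402512345') ), "\n";
```

Because the alternation is generated from the list, minimum and maximum code lengths never have to be stated explicitly, which is the property the list above asks for.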
Re^5: (german) region code detection - request for thoughts by Krambambuli (Curate) on Aug 20, 2008 at 11:27 UTC
- Different minimal lengths of region codes
- Different maximum lengths of region codes
- It should find those values on its own

Update: But... it's all there already? There are no string comparisons, and the order of look-ups ensures that the longest existing key/prefix always wins.

Oh, it's not, I just misunderstood your points. But definitely not hard to add, if really needed:

use List::Util qw( min max );

# Derive the shortest and longest prefix lengths from the keys themselves
my @lengths           = map { length } keys %prefixes;
my $min_prefix_length = min @lengths;
my $max_prefix_length = max @lengths;

# Try the longest prefix first, so the longest existing key always wins
my $prefix_length = $max_prefix_length;
while ( $prefix_length >= $min_prefix_length ) {
    last if exists $prefixes{ substr( $phone, 0, $prefix_length ) };
    $prefix_length--;
}
# Error/nonexistent prefix if $prefix_length < $min_prefix_length

etc.

Krambambuli
---
Re^6: (german) region code detection - request for thoughts by Skeeve (Parson) on Aug 20, 2008 at 15:22 UTC
Don't get me wrong. I just wanted to point out what was still missing, compared to my approach. I didn't want you to program that for me. Nevertheless ++ for your effort!
Re^4: (german) region code detection - request for thoughts by Skeeve (Parson) on Aug 20, 2008 at 16:20 UTC
krambambuli wrote:
should work rather very effectively.

I was unsure about that and so I benchmarked. I used the full list of 5132 region codes. Have no fear! There are not 32K of region codes following, just the (about) 9K of my regular expression, which I use to generate the region code list and also the test data.

[code omitted: 24 kB]

This is the result:

               Rate      Skeeve krambambuli
Skeeve       10.7/s          --        -30%
krambambuli  15.4/s         43%          --

~~So regular expressions seem to be very efficient. The code is 30% faster.~~ ;-) It isn't. krambambuli is right.
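A table in this format is what `cmpthese` from the core Benchmark module prints. The following is only a sketch of how such a comparison might be set up, with a tiny hypothetical data set rather than the 5132 real codes or the code used for the numbers above; `by_hash` and `by_regex` are illustrative names, and the rates it prints will differ from those quoted.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

# Hypothetical sample data, far smaller than the real code list.
my @codes    = qw( 030 089 04025 0402 );
my %prefixes = map { $_ => 1 } @codes;
my $alt      = join '|', sort { length($b) <=> length($a) } @codes;
my $re       = qr/\A($alt)/;
my @phones   = qw( 0402512345 0301234567 0897654321 0991234567 );

# Hash approach: probe candidate prefixes from longest to shortest.
sub by_hash {
    my ($phone) = @_;
    for my $len ( 5, 4, 3, 2 ) {
        return $len if exists $prefixes{ substr( $phone, 0, $len ) };
    }
    return 0;
}

# Regex approach: one match against a longest-first alternation.
sub by_regex {
    my ($phone) = @_;
    return $phone =~ $re ? length($1) : 0;
}

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese( -1, {
    hash  => sub { by_hash($_)  for @phones },
    regex => sub { by_regex($_) for @phones },
} );
```

In the printed table the Rate column is iterations per second, so the row with the higher rate is the faster candidate.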
Re^5: (german) region code detection - request for thoughts by Krambambuli (Curate) on Aug 20, 2008 at 17:16 UTC
This is the result:

               Rate      Skeeve krambambuli
Skeeve       10.7/s          --        -30%
krambambuli  15.4/s         43%          --

Read again... ;) The results say the opposite: your code executes approx. 10 times per second, mine 15 times.

Krambambuli
---