in reply to How to Breakup Numbers

$phone_number =~ /(\d{3})(\d{3})(\d{4})/ is the first thing that comes to mind, and will put the three parts into $1, $2, and $3.
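As a rough sketch of that regex in use, assuming the number is stored as a bare 10-digit string with no punctuation (split_phone is just a name made up for this example, and the anchors are my addition so that strings of the wrong length fail instead of matching partially):

```perl
use strict;
use warnings;

# Hypothetical helper: split a bare 10-digit number into its three parts.
sub split_phone {
    my ($phone_number) = @_;

    # Anchored at both ends so anything other than exactly ten digits fails.
    if ( $phone_number =~ /\A(\d{3})(\d{3})(\d{4})\z/ ) {
        return ( $1, $2, $3 );
    }
    return;    # empty list when the input does not match
}

my @parts = split_phone('5551234567');
print join( '-', @parts ), "\n";    # prints 555-123-4567
```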

But your database really should be doing this reasonably quickly even without splitting the numbers up, provided the table is indexed on the column that you're searching. If it's not, then I expect that adding an index will help immensely - probably much more than changing from one field to three would.

If you have the memory to burn on it, another option would be to load the list of numbers into a hash and then do hash lookups for the numbers you want to check. If you're doing large numbers of lookups in a single run of the program, this would probably be faster than the database, since it wouldn't need to go to disk for each lookup. But it requires a lot of memory and has the startup overhead of reading in the list(s) of numbers, so it will be worse than a database on small batches of lookups where there aren't as many searches to spread the startup cost across.
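Something along these lines is what I have in mind for the hash approach. It is only a sketch: load_numbers is a made-up name, it assumes one number per line, and the in-memory string stands in for whatever file your list actually lives in.

```perl
use strict;
use warnings;

# Hypothetical helper: read one number per line into a hash for fast lookups.
sub load_numbers {
    my ($fh) = @_;
    my %seen;
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless length $line;    # skip blank lines
        $seen{$line} = 1;
    }
    return \%seen;
}

# Stand-in data; a real run would open the do-not-call file instead.
my $dnc_data = "5551234567\n5559876543\n";
open my $dnc_fh, '<', \$dnc_data or die "Cannot open in-memory file: $!";
my $dnc = load_numbers($dnc_fh);
close $dnc_fh;

# Each check is a single hash lookup, with no disk access.
for my $number (qw(5551234567 5550000000)) {
    print "$number is on the list\n" if exists $dnc->{$number};
}
```

The load is paid once per run, so this only wins when there are enough lookups to spread that cost across.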

If both the do-not-call list and the list of numbers you're checking are already sorted, you can optimize heavily by opening the two files at the same time and comparing the current line of each, then advancing in whichever file had the lower number (or, if the two are equal, flagging the match and advancing in both files). This has the strong advantage of reading each line of each file only once and avoiding any actual searching, so it should be faster than any other non-database method. It will probably be faster than using a database as well, unless your do-not-call list is orders of magnitude larger than the list of numbers to look up (and even then, this may still be faster). The primary drawback is that it requires the input files to be sorted, which may impose a heavy startup cost if that isn't already the case.
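Here is a minimal sketch of that two-file walk. The name sorted_matches is made up for the example, and it assumes one number per line, every number the same length (so string order agrees with numeric order), both inputs sorted ascending, and no duplicates within either file. The in-memory strings stand in for the real files.

```perl
use strict;
use warnings;

# Hypothetical helper: step through two sorted filehandles and return the
# numbers that appear in both. Assumes fixed-width numbers, ascending order,
# and no duplicates within either input.
sub sorted_matches {
    my ( $dnc_fh, $check_fh ) = @_;
    my @matches;
    my $dnc   = <$dnc_fh>;
    my $check = <$check_fh>;
    while ( defined $dnc and defined $check ) {
        chomp( $dnc, $check );
        my $cmp = $dnc cmp $check;
        if ( $cmp < 0 ) {
            $dnc = <$dnc_fh>;          # do-not-call entry is lower: advance it
        }
        elsif ( $cmp > 0 ) {
            $check = <$check_fh>;      # number to check is lower: advance it
        }
        else {
            push @matches, $check;     # equal: flag it and advance both
            $dnc   = <$dnc_fh>;
            $check = <$check_fh>;
        }
    }
    return @matches;
}

# Stand-in data; real code would open the two sorted files instead.
my $dnc_data   = "5550000001\n5550000003\n5550000005\n";
my $check_data = "5550000002\n5550000003\n5550000005\n5550000009\n";
open my $dnc_in,   '<', \$dnc_data   or die "Cannot open in-memory file: $!";
open my $check_in, '<', \$check_data or die "Cannot open in-memory file: $!";

print "$_\n" for sorted_matches( $dnc_in, $check_in );
# prints 5550000003 and 5550000005, one per line
```

Only one line from each file is held in memory at a time, so memory use stays flat no matter how large the lists are.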

Re^2: How to Breakup Numbers
by SteveS832001 (Sexton) on Mar 04, 2008 at 16:47 UTC
    I tried the hash, but there is so much data that it takes 5 minutes to load everything before I can search.
      How many numbers do you have, and how many do you search for? I tried with 2,000,000 numbers and they were all read in about 10 seconds. Searching for 1000 numbers took less than 1 second (Pentium D CPU, 3GHz).