PerlMonks  

Testing whether a value is in a range.

by kal (Hermit)
on Mar 10, 2003 at 20:27 UTC

kal has asked for the wisdom of the Perl Monks concerning the following question:

I have a problem: I'm trying to find out whether or not a certain value is within a range of values. Basically, I'm working with postal/region codes, but for a lot of countries. Some countries have purely numeric postal codes, some have alphanumeric ones (an example of a UK postcode might be something like EC1N 4RT), and technically I guess some might be all letters.

Now, I'm trying to match postcodes up with Airports, for delivery of parcels. I have a data file, which is essentially a whitespace separated file (it's actually fixed width, but that's irrelevant.... ;), looking something like:

GB SW1 SW24 LHR

So, the first column is the country code (GB == Great Britain, in this instance), the next two columns are the "range", and the last is the Airport (London Heathrow). This is saying that for any postcode in SW1 to SW24, parcels should be sent to Heathrow.

How to do this?! I'm looking for something fairly algorithmically fast, since I'm going to be checking a lot of data. Most countries are fairly sensible and have purely numeric postal codes, so I think I will probably make that a special case (although, really, it's probably the normal case..). But is there any way of testing along the lines of if ('SW1' < $code && $code < 'SW24') then ..? I can think of something like chopping the text part apart from the numeric part, and doing a number of tests, but it seems like there should be an easier way that I just can't see.

Branewave: while I'm typing this, I have thought of another way: making an array with three values (the code to test and the two end points of the range), then calling sort on it and seeing where my code to test ends up (i.e., if it's in the middle of the sorted list, it is in the range). I think Perl's default sort algorithm does exactly what I need. Right, I'm off to test ;) But, are there any better ideas? This might not be very fast... Cheers!
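A minimal sketch of the sort idea, assuming Perl's default string ordering is what is wanted. The function name in_range_by_sort is made up for illustration and is not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $code lands in the middle when the three
# values are sorted as strings (end points count as inside the range).
sub in_range_by_sort {
    my ($code, $low, $high) = @_;
    my @sorted = sort { $a cmp $b } $low, $code, $high;
    return $sorted[1] eq $code;
}

print in_range_by_sort('SW12', 'SW1', 'SW24') ? "in\n" : "out\n";
print in_range_by_sort('SW3',  'SW1', 'SW24') ? "in\n" : "out\n";
```

The second call reports "out", because as a string 'SW3' sorts after 'SW24'. A plain string sort therefore only gives the intended answer when the numeric parts have the same number of digits.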

Replies are listed 'Best First'.
Re: Testing whether a value is in a range.
by Limbic~Region (Chancellor) on Mar 10, 2003 at 20:41 UTC
    kal,
    You can most certainly test if ( 'SW1' le $code && $code le 'SW24' ). Use the string operators le and ge here; <= and >= compare numerically.

    But there are some things you want to keep in mind as the test is done ASCIIbetically.

  • Uppercase comes before lowercase
  • Numbers come before letters
  • Punctuation is mixed all over the place

    If you force all lower case and are sure that each character is the same type (either letter or number) in the corresponding value (low & high), then this method should work fine.

    Cheers - L~R

    Update: Clarified the final statement.
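One way around the caveats above is to split each code into a letter prefix and a numeric part, then compare the prefix as a string and the number numerically. This is only a sketch: split_code and in_range are hypothetical names, and it assumes both end points of a range share the same prefix.

```perl
use strict;
use warnings;

# Split a code such as "SW24" into (prefix, number); empty list on failure.
sub split_code {
    my ($code) = @_;
    my ($prefix, $num) = uc($code) =~ /^([A-Z]*)(\d+)/
        or return;
    return ($prefix, $num);
}

# True (1) if $code has the same prefix as the range and its number
# lies between the end points, inclusive. Assumes $low and $high share
# one prefix; anything else is treated as "not in range".
sub in_range {
    my ($code, $low, $high) = @_;
    my ($cp, $cn) = split_code($code) or return 0;
    my ($lp, $ln) = split_code($low)  or return 0;
    my ($hp, $hn) = split_code($high) or return 0;
    return 0 unless $cp eq $lp && $cp eq $hp;
    return ($ln <= $cn && $cn <= $hn) ? 1 : 0;
}
```

With this, in_range('SW3', 'SW1', 'SW24') is true, whereas a pure string comparison would put 'SW3' after 'SW24'. Purely numeric codes simply get an empty prefix.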

Re: Testing whether a value is in a range.
by Cody Pendant (Prior) on Mar 10, 2003 at 22:09 UTC
    I really think that you're going to have to come up with a regex for each country -- some of them could be re-used, but that weirdness of UK postcodes is going to force you to develop your own system for them.

    You could, for example, have a hash of regexes, and use the first column as the key. $regexes{'GB'} would be the one that dealt with your SW1 example, and it would be something like /[A-Z]+(\d+)([A-Z])?\s+[A-Z]+(\d+)/ and test the values in $1 and $3.

    But in fact, come to think of it, you'd need at least two for the UK because as you can see, London postcodes have a different format to ones outside London.
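A rough sketch of the hash-of-regexes idea, keyed by country code. The patterns and the names %code_re and parse_code are illustrative assumptions, not a complete description of any country's postcode format; the GB pattern looks only at the leading area letters and district number.

```perl
use strict;
use warnings;

# Hypothetical per-country patterns; each captures (prefix, number).
my %code_re = (
    GB => qr/^([A-Z]{1,2})(\d{1,2})/,  # area letters + district, e.g. SW 24
    US => qr/^()(\d{5})/,              # five-digit ZIP, empty prefix
);

# Returns (prefix, number) for a known country, or an empty list.
sub parse_code {
    my ($country, $code) = @_;
    my $re = $code_re{$country} or return;
    my ($prefix, $num) = uc($code) =~ $re or return;
    return ($prefix, $num + 0);
}

my ($prefix, $num) = parse_code('GB', 'EC1N 4RT');
print "$prefix $num\n";   # prints "EC 1"
```

Once every code is reduced to a prefix and a number, the range test itself can be the same for all countries.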

    I wonder whether the UK postal authorities have some kind of database or flat-text file you could grab and munge.
    --

    “Every bit of code is either naturally related to the problem at hand, or else it's an accidental side effect of the fact that you happened to solve the problem using a digital computer.”
    M-J D
      I wonder whether the UK postal authorities have some kind of database or flat-text file you could grab and munge.

      There are lots of details here but most of it costs $$$.

        What a pain.

        Anyway, the UK postal system, alphanumeric and variable as it is, is only one problem. I'm betting that it's the most problematic of postcode systems though. The US and Australia have numeric-only systems, and Canada, I believe, has a version of the UK system but without the London anomalies (there's SW, and NW, but no NE, just N, that kind of thing).

        It may be, however, that the problem is underestimating the pinpoint accuracy of the UK system, where a postcode can be as accurate as one half of one side of a street.

        In the example of "SW1 <=> SW24" I'm betting that there is no such thing as SW25 or over, so the problem really can be solved as easily as "SW\d+ maps to LHR", without worrying about the numbers.

        Other UK postcodes just simply relate to towns. If you have one beginning BN? That's the town of Brighton, and the nearby airport is Gatwick. You wouldn't have to sweat the BN1, BN2 .. BN35 stuff.
Re: Testing whether a value is in a range.
by adrianh (Chancellor) on Mar 10, 2003 at 22:36 UTC

    If speed is a real issue, you might consider spending memory/disk on a pre-generated (possibly partial) lookup table rather than doing lots of range comparisons.
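A minimal sketch of such a lookup table, built by expanding each row of the data file into one hash entry per code. The names expand_row and %airport_for are hypothetical, and the sketch assumes both end points share a letter prefix and that the numeric parts have no leading zeros.

```perl
use strict;
use warnings;

# Expand one row such as "GB SW1 SW24 LHR" into individual entries
# keyed by "COUNTRY:CODE". Rows with mismatched prefixes are skipped.
sub expand_row {
    my ($table, $line) = @_;
    my ($country, $low, $high, $airport) = split ' ', $line;
    my ($lp, $ln) = $low  =~ /^([A-Z]*)(\d+)$/ or return;
    my ($hp, $hn) = $high =~ /^([A-Z]*)(\d+)$/ or return;
    return unless $lp eq $hp;
    # "+ 0" forces a numeric range rather than string increment.
    $table->{"$country:$lp$_"} = $airport for $ln + 0 .. $hn + 0;
    return;
}

my %airport_for;
expand_row(\%airport_for, 'GB SW1 SW24 LHR');
print $airport_for{'GB:SW12'}, "\n";   # prints "LHR"
```

Each lookup is then a single hash access, at the cost of one entry per code in every range.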

Re: Testing whether a value is in a range.
by paulbort (Hermit) on Mar 11, 2003 at 18:32 UTC
    I'm assuming that for any lookup, you know what country you're working with, otherwise two countries that use the same number range will conflict. (The only example I can think of offhand is that GB and CA codes might collide.)

    Anyway, the one solution I don't see mentioned yet is SQL! I remember seeing somewhere around here a SQL module that was lightweight and could do this.

    SELECT airport FROM maptable WHERE postal_code_low <= $mycode AND postal_code_high >= $mycode AND country_code = 'GB'

    Index the table by country_code, postal_code_low, and postal_code_high, and you're in business. All you have to do is load the data into the database.

    Or, I've completely missed the point and wasted everyone's time.
    --
    Spring: Forces, Coiled Again!
