in reply to Re: Question about properly laying out a database
in thread Question about properly laying out a database

Given joelba's excellent suggestion, it may also help to create some indexes to speed up subsequent searches. Recall, in "Re: Question about properly laying out a database," above:

my %cars = ( 1002 => { make_id => 10, model_id => 27, year => 2001, price => 24000, }, ); my %makes = { 1 = "Chevrolet", 10 = "Acura", }; my %models = { 12 = "Camaro", 27 = "RSX", };

For example, indexing Makes might look like:

my %makes_idx = ( 1 => [1001, ... ], # cars that are Chevys 10 => [1002, ...], # cars that are Acuras ... );

Or, indexing Years might look like:

my %years_idx = ( 2001 => [ 1001, 1002, ... ], # cars that are 2001 models ... );

You get the idea.

Then searching by any single attribute that is indexed becomes nearly instantaneous. Doing a "JOIN" on two (or more) indexed attributes looks something like this:

# For example, finding all 2001 or newer Chevys my %temp_hash = reverse %makes; # gives the id for each make my $make_id = $temp_hash{'Chevrolet'); # get the id for Chevy my %make_set = map { $_=>1 } @{$makes_idx{$make_id}}; # set of cars t +hat are Chevys # search above for 2001 or newer models my @temp_years = grep { $_ >= 2001 } keys %years_idx; my @result_ids = grep { exists $make_set{$_} } @years_idx{@temp_years} +;
Update: Changed search example to look for a range rather than a specific year.

dmm

You can give a man a fish and feed him for a day ...
Or, you can
teach him to fish and feed him for a lifetime

Replies are listed 'Best First'.
Re: Re(2): Question about properly laying out a database
by joealba (Hermit) on Dec 13, 2001 at 02:44 UTC
    Hmm.. My gut reaction at first was "Well, it's a relatively small amount of data. And, you'll wind up recalculating the indexes for each search anyway. Great idea, but is it worth the effort here?"

    Then, I got a message from Stamp_Guy saying that he cannot use DB_File because complex data structures aren't supported. (C'mon.. how complex is a HoH?) So, now I am thoroughly convinced that Storable is a Good Idea, as long as the number of records doesn't get out of hand. And if you use Storable, the indexes can be stored objects too -- so the overhead of generating the indexes is gone from the search program. Sweet!

    So, here's the quick and dirty:
    %cars - Storable HoH as illustrated above, with any additional vehicle info added too

    %makes - Storable hash of make_id => make. This should probably be a complete list of makes, which you can find on cars.com.

    %models - Storable HoH of
    model_id => { model => 'modelname', make_id => make_id (key from %makes) }
    . You can build this make_id hash as your inventory grows. Be sure to leave off the model details/series. Ex. model => 330 should be in this hash to cover all BMW 330's. You don't necessarily need to have one entry for 330xi, 330ci, etc. Lexus is another pain in the rear when it comes to this model naming scheme.

    • Then, your indexes: %years_idx, %make_idx, %model_idx. Obviously you will need to recreate your indexes each time you make a change to the dealer's inventory. A minor race condition exists here if someone searches while you are updating the inventory, but I'll leave that for you to solve. :)