lamberms has asked for the wisdom of the Perl Monks concerning the following question:

Hello fellow monks. I have a problem I can't seem to wrap my head around (at least not efficiently). My company has a price history table that contains a list of part numbers, a pricelist name, a date and finally the cost that part was on that particular date. When we change a part's price it then gets added to this table.

My problem is, given a record of parts sold and the dates they are sold on, I want to tell my boss what the pricebook price was on that particular date. I could do this the horrible way and just loop through a hash set up like:

$partprice{$partnbr}{$pricelistname}{$date} = $price
for each part date combination. That seems a rather horrid way to go however. Is there any way of storing the date ranges in such a way as to make this more efficient. Thanks for your help.

Best Regards,
Mike

Replies are listed 'Best First'.
Re: Date Ranges Question
by Skeeve (Parson) on Oct 25, 2005 at 19:24 UTC

    If the table is in a SQL database, best would be to use some DBI module to access the database and get the date with a SELECT statement like

    SELECT FROM table WHERE partnbr=? AND date<? ORDER BY DATE DESCENDING LIMIT 1

    I'm no SQL expert, but maybe this will set you on the right track.


    s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
    +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e
Re: Date Ranges Question
by shemp (Deacon) on Oct 25, 2005 at 23:22 UTC
    To gain a bit of a speed increase, you could convert the dates to integer values, as an offset from some base date that all other dates will be after. Choose something like january 1, 1800, or whatever base date is appropriate to your application. The conversion to an offset can be handled by the Date::Calc function Delta_Days(), getting the delta days from the base date to the desired date. Then all of your comparisons can go a little quicker, being integer comparisons, vs date comparisons or whatever.

    Kind of an illegitimate cousin of the schwartzian transformation.

    I use the most powerful debugger available: print!
      This probably won't work as you expect. While it is definitly true interger indexes are always faster than strings, I imagine it would auctually slow things down as it applies to dates. This depends on his SQL engine though. If your using postgres your going to want to see ago for instance. SQL engines normally take dates in numerous string formats, output dates in numerous formats, and store them internally in what ever way is the most optimized - which is surely not string format.
      8.5.1.4. Intervals interval values can be written with the following syntax: @ quantity unit quantity unit... direction Where: quantity is a number (possibly signed); unit is second, minute, hour, day, week, month, year, decade, century, millennium, or abbreviations or plurals of these units; direction can be ago or empty. The at sign (@) is optional noise. The amounts of different units are implicitly added up with appropriate sign accounting. Quantities of days, hours, minutes, and seconds can be specified without explicit unit markings. For example, '1 12:59:10' is read the same as '1 day 12 hours 59 min 10 sec'. The optional precision p should be between 0 and 6, and defaults to the precision of the input literal.
      Which is all optimized by the engine.


      Evan Carroll
      www.EvanCarroll.com
Re: Date Ranges Question
by GrandFather (Saint) on Oct 25, 2005 at 19:52 UTC

    If the data is not currently in a database and it is not possible to put it in one, then you should create a second hash keyed on the date:

    $partprice{$date}{$partnbr}{$pricelistname} = $price

    It's not clear to me that you need to loop through the hash at all. What your line of code implies is that you have a HoHoH and that you are directly indexing to yet the value you want, which is about as efficient as you could want!

    If you have code you are having trouble with in this regard, post an example showing the issue.


    Perl is Huffman encoded by design.

      GrandFather, the problem seems to be this:

      %partprice = ( 1 => { '2005-01-01' => 1.99, # new year special. '2005-01-02' => 8.99, # back to normal. '2005-02-15' => 9.99, # price bump. '2005-04-01' => 9.49, # on sale. '2005-04-08' => 9.99, # sale's over. }, );
      The question from this is: what is the cost of item #1 for a purchase made on March 18th, 2005? For that, you have to scan through looking for the last date key that is less than or equal to the desired date.

      If, however, you had an ordered hash, you could do a binary search for it. Or if you had a tree or something where the date keys were stored in an order, you could look it up faster.

        That is exactly the problems. Thanks Tanktalus ;-)
Re: Date Ranges Question
by graff (Chancellor) on Oct 26, 2005 at 04:01 UTC
    I'm not sure how important the "pricelistname" is to solving the problem... do you really need that at all? Given the nice rephrasing of the problem by Tanktalus, what might work for you is something like this:
    my %partprice = ( PN01 => { 2005-01-01 => 29.95, 2005-01-05 => 20.00, 2005-01-19 => 29,95, 2005-03-01 => 33.50 }, PN02 => { ... }, ... ); # pick a part number and a date of sale: my $pn = 'PN01'; my $saledate = '2005-02-14'; my $price = getPrice( $saledate, $partprice{$pn} ); # do something with $price ... sub getPrice { my ( $sdate, $prices ) = @_; # 2nd param is a hashref my @keydates = sort keys %$prices; while ( @keydates and $sdate lt $keydates[$#keydates] ) { pop @keydates; } # at this point, either the sale predates the price history (@keyd +ates is empty) # or else the last element of @keydates points to the correct pric +e for that date return ( @keydates ) ? $$prices{pop @keydates} : undef; }
    I haven't tested that at all (except to confirm that using "pop @array" as the hash key index does work); still I hope the idea is clear enough.
Re: Date Ranges Question
by Skeeve (Parson) on Oct 25, 2005 at 21:26 UTC

    Okay... If you don't have it in a DB, store the dates ordered in an array and make a binary search for the correct date. That way you don't have O(n) but just O(log2(n))

    Use the date as a hash key for finding the other data.


    s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
    +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e
Re: Date Ranges Question
by GrandFather (Saint) on Oct 26, 2005 at 10:06 UTC

    Here's a modified binary search that may help with the problem if there are lots of price changes per part:

    use strict; use warnings; my %partprice = ( 1 => { '2005-01-01' => 1.99, # new year special. '2005-01-02' => 8.99, # back to normal. '2005-02-15' => 9.99, # price bump. '2005-04-01' => 9.49, # on sale. '2005-04-08' => 10.99, # sale's over. }, ); my %hPrices = %{$partprice{1}}; my @aPrices = map {"$_ $hPrices{$_}"} sort keys %hPrices; print ((join "\n", @aPrices) . "\n"); my $date = '2005-01-00'; print "The price on the $date was " . FetchPrice (1, $date) . "\n"; $date = '2005-01-01'; print "The price on the $date was " . FetchPrice (1, $date) . "\n"; $date = '2005-01-02'; print "The price on the $date was " . FetchPrice (1, $date) . "\n"; $date = '2005-01-03'; print "The price on the $date was " . FetchPrice (1, $date) . "\n"; $date = '2005-02-16'; print "The price on the $date was " . FetchPrice (1, $date) . "\n"; $date = '2005-04-07'; print "The price on the $date was " . FetchPrice (1, $date) . "\n"; $date = '2005-04-08'; print "The price on the $date was " . FetchPrice (1, $date) . "\n"; $date = '2005-04-09'; print "The price on the $date was " . FetchPrice (1, $date) . "\n"; sub FetchPrice { my ($partnbr, $date) = @_; my %hPrices = %{$partprice{$partnbr}}; my @aPrices = map {[$_, $hPrices{$_}]} sort keys %hPrices; return $aPrices[0]->[1] if $date le $aPrices[0]->[0]; return $aPrices[-1]->[1] if $date ge $aPrices[-1]->[0]; my $posmin = 0; my $posmax = $#aPrices; while (1) { my $mid = int (($posmin + $posmax + 1) / 2); if ($aPrices[$mid]->[0] lt $date) { return $aPrices[$mid - 1]->[1] if $mid == $posmin; $posmin = $mid; } elsif ($aPrices[$mid]->[0] gt $date) { return $aPrices[$mid - 1]->[1] if $mid == $posmax; $posmax = $mid; } else { return $aPrices[$mid]->[1]; } } }

    Perl is Huffman encoded by design.
Re: Date Ranges Question
by Moron (Curate) on Oct 27, 2005 at 10:02 UTC
    For data which doesn't change regularly enough to warrant implementation via a time-series DBMS, we use a start-date end-date approach in the data modelling, so that in the example case the price on a given date for a product is given by a query like:
    SELECT xxx_price FROM price_history WHERE product_id = <query-product-id> AND <query-date> BETWEEN start_date AND end_date
    Update: and the code for changing the price updates the entry with an end-date of 31st Dec 9999 (the existing current price record), to an end-date equal to the cessation date inserts a new record with the new price, a start date of the commencement date and the same 31/12/9999 end date as before. The cessation date is the day before the new commencement date unless the datetime is meant to be intraday.

    The logical primary key will include the product id, start date and end date, although in practice an artificial primary key of e.g. entry_id may be used for performance purposes.

    -M

    Free your mind