cosmicperl has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,
  I have a database where I need to update a load of tables that have a column named 'date' with values such as 11:11:2007, 12:11:2007, etc, to a universal integer value such as 20071111, 20071112, etc. I'm trying to figure out what the quickest and most efficient way to do this, as the database is BIG (in fact there are several versions of the database and some of them are very big).
  I've got code that increments the old date style, and have the first date for each table, so I can easily make a loop that starts at 01:01:2007 and increments each time until it gets to the current date.
  Something like:-
until ($date eq $today) { $dbh->do("update table set date=20070101 where date='01:01:2007'") +; $date = nextdate($date); }

Is going to take forever, and probably time out through CGI (to long a story as to why it would be called as CGI).
Help greatly appreciated

Lyle

Thanks to everyone, your replies have been most helpful

Replies are listed 'Best First'.
Re: Perl and MySQL update question
by saberworks (Curate) on Nov 19, 2007 at 04:35 UTC
    If it is going to take longer than a few minutes, you're going to have a really hard time making it work. Even if the server doesn't time out (which it probably will), browsers have their own timeouts as well. The other thing you have to consider is that even if the browser times out, it's possible that the command will continue to execute on the server even after the browser errors out.

    The most reliable way to do this would be to write a regular script and execute that.

    As for performance tips - add an index on all the date columns, that will severely speed up the lookups (but also slowdown the inserts -- but I bet it will be faster regardless). I remember there being an option in MySQL to temporarily turn off rebuilding indexes (so updates can happen faster), but I can't find it now.

    If you really, really, really need to do this over a cgi script, you can use apache::session or something to keep track of exactly which rows need to be updated, and then do them a few hundred at a time, returning to the browser with a meta-refresh in the <HEAD> section. The old UBB bulletin board we used years ago used to do that. It's not perfect but maybe it will work for you.
Re: Perl and MySQL update question
by sids (Acolyte) on Nov 19, 2007 at 04:42 UTC
    You didn't mention which database you are using. In MySQL, the following single SQL update should do the job:
    update `table` set `date`=CONCAT(SUBSTR(`date`,7,4), SUBSTR(`date`,4,2 +), SUBSTR(`date`,1,2));
    Something similar should be possible in other databases too.

    --
    Ignorance killed the cat, curiosity was framed.
      MySQL is implied by the title. Moreover, you're assuming that the date field is a textual one, which could not be the case, and the OP is asking for an int field. andreas1234567's approach seems more robust.

      Flavio
      perl -ple'$_=reverse' <<<ti.xittelop@oivalf

      Io ho capito... ma tu che hai detto?
        Missed the (obvious) MySQL reference in the title, sorry.
        The data in the date field is in the format dd:mm:yyyy, which indicates that it is _not_ a datetime field (so it must be a text field). In this case andreas1234567's approach would not work.

        --
        Ignorance killed the cat, curiosity was framed.
Re: Perl and MySQL update question
by andreas1234567 (Vicar) on Nov 19, 2007 at 06:40 UTC
    First approach (date -> int):
    mysql> select version(); +-----------+ | version() | +-----------+ | 4.1.20 | +-----------+ 1 row in set (0.00 sec) mysql> create database cosmicperl; Query OK, 1 row affected (0.00 sec) mysql> use cosmicperl; Database changed mysql> create table cosmicperl (id integer auto_increment primary key, + dt date not null); Query OK, 0 rows affected (0.01 sec) mysql> insert into cosmicperl values (null, now()),(null, now()-interv +al 1 day),(null, now()-interval 7 day); Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> alter table cosmicperl add column dt_int integer unsigned; Query OK, 3 rows affected (0.01 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> update cosmicperl set dt_int=DATE_FORMAT(dt,'%Y%m%d'); Query OK, 3 rows affected (0.00 sec) Rows matched: 3 Changed: 3 Warnings: 0 mysql> select * from cosmicperl; +----+------------+----------+ | id | dt | dt_int | +----+------------+----------+ | 1 | 2007-11-19 | 20071119 | | 2 | 2007-11-18 | 20071118 | | 3 | 2007-11-12 | 20071112 | +----+------------+----------+ 3 rows in set (0.00 sec) mysql>

    Second approach (varchar -> int):

    Update: Added second approach as pr CountZero's suggestion. I would strongly recommend to use date/timestamp datatypes when storing date and time values in order to benefit from consistency checks and built-in date arithmetic.

    --
    Andreas
      Nice, but the OP indicated that his "date" field is in the format "dd:mm:yyyy" so it seems not of type date and then your solution will not work.

      CountZero

      A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Perl and MySQL update question
by aquarium (Curate) on Nov 19, 2007 at 05:42 UTC
    "a load of tables that have a column named 'date'"....sounds a bit like a bagful of tables rather than a database. you should consider storing dates in provided database native type suitable. storing dates in text fields assures problematic records.
    the hardest line to type correctly is: stty erase ^H
Re: Perl and MySQL update question
by igelkott (Priest) on Nov 19, 2007 at 09:17 UTC
    I'll presume that you don't intend to use literal strings in your SQL updates; it would of course be terrible pain to change one date at a time like this.

    If you have good enough access to the databases (directly or through a DBA), it would probably be good to use a real date column. Assuming MySQL, your dates can be converted with something like: update table set realdate = STR_TO_DATE(date, '%d:%m:%Y'). This will make dates like "2007-11-21". You can have you application strip the "-" signs in Perl ($date =~ y/-//d) or SQL (DATE_FORMAT(realdate,'%Y%m%d')). Later, the "realdate" column could replace the old date column.

    If you can't create a new column or just don't want it, you could run the conversion internally like: update table set date = DATE_FORMAT(STR_TO_DATE(date,'%d:%m:%Y'),'%Y%m%d'). It seems a bit inefficient but it's easy to write and might be OK if your DB server is fast. The danger of going through the date functions is that the dates have to be valid (eg, not 31:02:2007) so I'd run a select (select to file if possible) before trusting an update.

    For speed, leaving off any WHERE statements should help. If you still can't get this done in time, use WHERE to split the updates by year or even month-year with something like where date like '%11:2007'.

    If you aren't using MySQL, the internal date format might be different but the SQL commands will probably be the same (adjusting for quoting style).

Re: Perl and MySQL update question
by scorpio17 (Canon) on Nov 19, 2007 at 19:34 UTC

    Another idea:

    Dump the database tables to a file, process the file with a script offline, then load the modified data back into the database. This just uses the same commands you would normally use to create/restore a database backup.