jdtoronto has asked for the wisdom of the Perl Monks concerning the following question:

Esteemed monks,

I have an application that imports a CSV file into a MySQL table where all fields are varchar type. One field has a date in the format 'MM/DD/CCYY hh:mm' where leading zeroes are typically not present. This needs to be converted to something that MySQL will swallow - such as: 'CCYY-MM-DD hh:mm:ss' where all zeroes are required.

I can get the various parts of the time using split and then rebuilding the string. BUT there are typically somewhere between 7000 and 500000 records in the table.

I am assuming that doing a SELECT and getting the key and the date as text field into a HASH - then convert the date format - the do an UPDATE will be the best way of doing the change?

App is in Perl 5.8.0 using DBI/DBD-MySQL.

Your advice is sought on

jdtoronto

Replies are listed 'Best First'.
Re: Convert date format in database table.
by runrig (Abbot) on Oct 07, 2003 at 03:47 UTC
    Why wait to fix the date? Munging 500000 text records before inserting should be quicker than doing 500000 updates after inserting. And 500000 really isn't that much. One possible fix would be something like (untested):
    s#,"(\d+)/(\d+)/(\d+)\s+(\d+):(\d+)",# sprintf("%04d-%02d-%02d %02d:%02d:00", $3, $1, $2, $4, $5)#ex;
      Why use a regex when split will do? :)
      my $date = '2/3/1906'; my $new_date = sprintf("%04d-%02d-%02d 00:00:00",(split /\//, $date)[2 +,1,0]); # $new_date is "1906-03-02 00:00:00"

      cLive ;-)

      Update: As tachyon points out, I misread the question - the phrase "must be all zeroes" threw me :)

      But again, I still say split :)

      my $new_date = sprintf("%04d-%02d-%02d %02d:%02d:00",(split /\D+/, $da +te)[2,1,0,3,4]);

      :)

        So you deal with the HH:MM (which would require a split/[:/ ]/, and don't have to parse the CSV of which this date string was stated to be a field.....

        cheers

        tachyon

        s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

        Why use a regex when split will do?
        Because a regex allows for more precise format matching (year can be '\d{4}', the rest can be '\d{1,2}) in case, e.g., we have badly formatted dates and we don't want errors upon inserting to the database. I know, it's not necessarily a great reason, but it's a reason :-)
Re: Convert date format in database table.
by Zaxo (Archbishop) on Oct 07, 2003 at 04:00 UTC

    I would mung the CSV file using perl and one of the CSV modules, then import the whole thing to mysql, with the proper DATETIME column type for that data.

    After Compline,
    Zaxo

Re: Convert date format in database table.
by tachyon (Chancellor) on Oct 07, 2003 at 03:55 UTC

    Assuming that all you need to do is edit the text file to fix the issue then something like this should do the trick

    perl -pi.bak -e 's!,(\d+)/(\d+)/(\d+)\s+(\d+):(\d+),!sprintf ",%04d-%0 +2d-%02d %02d:%02d:%02d,", $3,$1,$2,$4,$5,0!ge' csv.txt # example while(<DATA>) { s!,(\d+)/(\d+)/(\d+)\s+(\d+):(\d+),!sprintf ",%04d-%02d-%02d %02d:%02d +:%02d,", $3,$1,$2,$4,$5,0!ge; print } __DATA__ blah,11/22/2003 24:07,blah blah,1/2/2003 24:07,blah __END__ # produces blah,2003-11-22 24:07:00,blah blah,2003-01-02 24:07:00,blah

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Convert date format in database table.
by jeffa (Bishop) on Oct 07, 2003 at 12:41 UTC