in reply to Re: How to optimise " csv into mysql " using Text:CSV_XS and DBD::Mysql
in thread How to optimise " csv into mysql " using Text:CSV_XS and DBD::Mysql

Thank you for the hint. I did try LOAD DATA before this solution, but since the input data is very dirty, I wanted more control over what gets imported. Maybe I can try removing the garbage first and then use LOAD DATA.

Re^3: How to optimise " csv into mysql " using Text:CSV_XS and DBD::Mysql
by Corion (Patriarch) on Aug 06, 2015 at 09:25 UTC

    This is the approach I use with a different database which basically can only do bulk loads, because single INSERTs are very slow. The approach is (a rough sketch follows the list):

    1. Read input data using Text::CSV_XS
    2. Perform cleanup on each row
    3. Add relevant columns (source_file, source_date)
    4. Write output TSV file in the correct encoding (Text::CSV_XS, this time writing tab-separated output)
    5. Issue bulk load statement
    6. Check that the number of rows read in 1, written in 4 and retrieved from the database are as expected
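    A minimal sketch of that pipeline in Perl, assuming made-up file names, a made-up table name (import_data), and an assumed column count of 5; adjust these to the real schema:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Text::CSV_XS;
        use DBI;

        my $source_file = 'input.csv';     # hypothetical input
        my $source_date = '2015-08-06';

        # step 1: reader for the dirty CSV, step 4: writer for clean TSV
        my $csv_in  = Text::CSV_XS->new({ binary => 1, auto_diag => 1 });
        my $tsv_out = Text::CSV_XS->new({ binary => 1, sep_char => "\t",
                                          quote_char => undef, escape_char => undef,
                                          eol => "\n", auto_diag => 1 });

        open my $in,  '<:encoding(utf-8)', $source_file or die "$source_file: $!";
        open my $out, '>:encoding(utf-8)', 'clean.tsv'  or die "clean.tsv: $!";

        my ($rows_read, $rows_written) = (0, 0);
        while (my $row = $csv_in->getline($in)) {
            $rows_read++;
            # step 2: cleanup - drop rows with the wrong column count, trim fields
            next unless @$row == 5;
            s/^\s+|\s+$//g for @$row;
            # step 3: add the extra columns
            push @$row, $source_file, $source_date;
            $tsv_out->print($out, $row);
            $rows_written++;
        }
        close $out or die "clean.tsv: $!";

        # step 5: bulk load the clean TSV
        my $dbh = DBI->connect('dbi:mysql:database=test;mysql_local_infile=1',
                               'user', 'pass', { RaiseError => 1 });
        $dbh->do(q{
            LOAD DATA LOCAL INFILE 'clean.tsv'
            INTO TABLE import_data
            FIELDS TERMINATED BY '\t'
            LINES TERMINATED BY '\n'
        });

        # step 6: sanity-check the row counts
        my ($rows_loaded) = $dbh->selectrow_array('SELECT COUNT(*) FROM import_data');
        warn "read $rows_read, wrote $rows_written, loaded $rows_loaded\n"
            unless $rows_written == $rows_loaded;

    The dirty-data handling lives entirely in the Perl loop, so LOAD DATA only ever sees rows you have already accepted.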

      As I don't need to add or delete columns, I guess I can just clean up the input file as I'm doing now and then use LOAD DATA as you proposed. There's no real added value in using Text::CSV_XS then, or am I missing something? Actually, looking at my code, I'm using Text::CSV_XS for no reason :). Maybe I just wanted to be fancy :)