kiat has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

I have data in Excel format and I would like to feed that data into mysql. I just discovered Spreadsheet::ParseExcel and found it to be really cool.

What I intend to do is to parse the Excel file using ParseExcel and then have the perl script write the necessary SQL statements into a source file. For instance, let's say I have two rows of data from Excel as follows:

id  joke      created     owner
1   joke one  2004-04-12  john
2   joke two  2004-04-11  nick
The output from the perl script (using ParseExcel) will be saved to a insert source file and it will look something like:
INSERT INTO JOKES VALUES('1', 'joke one', '2004-04-12', 'john');
INSERT INTO JOKES VALUES('2', 'joke two', '2004-04-11', 'nick');
I will then use the mysql command 'source name_of_file' to insert the data into a table.
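A rough sketch of that output step (illustrative only; it assumes @row holds one row's four cell values and $out is the opened source file, and the backslash escaping is simplistic):

my @escaped = map { my $v = defined $_ ? $_ : ''; $v =~ s/'/\\'/g; $v } @row;
print $out "INSERT INTO JOKES VALUES('" . join("', '", @escaped) . "');\n";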

I suppose I could have the script connect to the database and directly insert the data into the relevant table, thereby bypassing the step of creating a mysql source file.

Am I on the right track? Is there perhaps a more direct conversion path from Excel to mysql?

Cheers and many thanks in advance :)

Update: Great thanks to all for your solutions and ideas!

Replies are listed 'Best First'.
Re: From excel to mysql...
by tachyon (Chancellor) on Apr 13, 2004 at 14:51 UTC

    Am I on the right track? Is there perhaps a more direct conversion path from Excel to mysql?

    There is. Just save the data as a tab-separated txt file from within Excel. This is a standard Save As option. Then do this in the mysql client after you have created the table (which you can also do in the mysql client):

    mysql> LOAD DATA LOCAL INFILE '/path/to/datafile.txt' INTO TABLE table_name;

    It will load the data at a rate > 10,000 records a second, so it should not take too long. The only reason to do it the slow way, using DBI and DBD::mysql, is if you need to do some on-the-fly data munging in transit.

    You would create the table in the mysql client like:

    mysql> CREATE TABLE jokes ( id int not null, name char(32), timestamp date, joker char(32) );

    You can use ALTER TABLE to add keys after the import or create the keys with the table. If your fields are longer than the allocated space they will be unceremoniously truncated to fit, so allocate enough room. Note char() is faster than varchar() per se but uses more space.

    cheers

    tachyon

      Wow, thanks! I've never dreamt it could be so easy :)

      Btw, what if some of the columns have empty values? Would it still work? I'll find out when I try it later...

        Welcome to the non-M$ world ;-) There is a substantial upside to not rotating around the GUI.

        Anyway, with empty fields, yes, it *should* still work. Excel dumps empty fields as

        field1\t\tfield3\tfield4

        So when mysql splits on \t you get an undef (NULL) value. In that case, add "default 'blah'" to the create, i.e.:

        timestamp date default '1970-01-01', joker char(32) default 'anon',

        and mysql will fill in the blanks.

        If your data is really dirty you may wish to parse the tab sep file with perl, check for whatever you need, then write out all the clean data to a fresh tab sep file before you import. You could flag the errors, write them to a tab sep file, and then open and fix them in Excel, or just automate it; see the sketch below.
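        A minimal cleaning pass might look like this (a sketch only; the file names and the four-fields-with-a-date check are assumptions to adapt to your data):

        use strict;
        use warnings;

        open my $in,    '<', 'datafile.txt' or die $!;
        open my $clean, '>', 'clean.txt'    or die $!;
        open my $bad,   '>', 'errors.txt'   or die $!;

        while (my $line = <$in>) {
            chomp $line;
            my @fields = split /\t/, $line, -1;   # -1 keeps trailing empties
            # example check: four fields and a yyyy-mm-dd date in column 3
            if (@fields == 4 and $fields[2] =~ /^\d{4}-\d{2}-\d{2}$/) {
                print $clean "$line\n";           # good row: pass through
            }
            else {
                print $bad "$line\n";             # flag for fixing in Excel
            }
        }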

        One thing to be aware of is that you probably will want a primary key on id (so id must be not null and must also be unique). The other key/index option you will probably want is an INDEX on some or all of the other fields, depending on how you plan to search the table.
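        For example (a sketch, reusing the columns from the table above):

        mysql> ALTER TABLE jokes ADD PRIMARY KEY (id), ADD INDEX (joker);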

        cheers

        tachyon

      Tab delimited?! Ew! That just makes it difficult to read records that span lines.

        Actually impossible from the point of view of an import utility:

        LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name.txt'
            [REPLACE | IGNORE]
            INTO TABLE tbl_name
            [FIELDS
                [TERMINATED BY '\t']
                [[OPTIONALLY] ENCLOSED BY '']
                [ESCAPED BY '\\']
            ]
            [LINES
                [STARTING BY '']
                [TERMINATED BY '\n']
            ]
            [IGNORE number LINES]
            [(col_name,...)]

        Embedded newlines make it hard to define what represents a 'row'. Fortunately, AFAIK you can't insert newlines into an Excel cell string, as it triggers moving you to a new cell.

        cheers

        tachyon

      There is also a command line program called mysqlimport which is delivered with mysql. It is a front end for tachyon's excellent suggestion. It doesn't seem to have a man page, but you can find it in the mysql info pages - info mysql.
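      For instance (a sketch; the database name and path are placeholders, and mysqlimport derives the table name from the file's basename):

      $ mysqlimport --local jokes_db /path/to/jokes.txt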

      After Compline,
      Zaxo

Re: From excel to mysql...
by dragonchild (Archbishop) on Apr 13, 2004 at 14:44 UTC
    Personally, I would use DBD::Excel (or DBD::CSV) to read the data, then immediately write into DBD::mysql using INSERT statements.
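    A rough sketch of that route (untested; the file name, credentials, worksheet/table names, and four-column layout are all assumptions):

    use strict;
    use warnings;
    use DBI;

    # read rows through DBD::Excel, write them through DBD::mysql
    my $xls   = DBI->connect('dbi:Excel:file=jokes.xls')
        or die $DBI::errstr;
    my $mysql = DBI->connect('dbi:mysql:database=jokes', 'user', 'password',
        { RaiseError => 1 });

    my $read  = $xls->prepare('SELECT * FROM jokes');  # sheet name = table
    my $write = $mysql->prepare('INSERT INTO jokes VALUES (?,?,?,?)');

    $read->execute;
    while (my @row = $read->fetchrow_array) {
        $write->execute(@row);
    }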

    Of course, you could save the Excel file as a CSV and use whatever loader(s) MySQL provides to load the CSV file.

    That said, whatever ends up working is the best thing to do.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

Re: From excel to mysql...
by diotalevi (Canon) on Apr 13, 2004 at 14:45 UTC
    Direct inserting is obviously easier and safer - you can write your SQL using DBI's placeholders instead of having to quote everything. If you do create that SQL text file, be sure to use DBI's $dbh->quote( $val ) method.
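    For illustration (a sketch; $dbh, @row, and the $out filehandle are assumed from the surrounding discussion):

    # direct insert with placeholders - nothing to quote by hand
    my $sth = $dbh->prepare('INSERT INTO JOKES VALUES (?,?,?,?)');
    $sth->execute(@row);

    # or, if you do write the SQL text file, quote each value first
    print $out 'INSERT INTO JOKES VALUES ('
             . join(', ', map { $dbh->quote($_) } @row)
             . ");\n";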
      Thanks diotalevi!

      I'm having problems binding the values from the parsed Excel file to the DBI execute command. How exactly do I bind those values?

      use strict;
      use Spreadsheet::ParseExcel;
      use Data::Dumper;

      my $source_file  = 'C:/mydocu~1/db/jokes.xls';
      my $source_excel = new Spreadsheet::ParseExcel;
      my $source_book  = $source_excel->Parse($source_file);

      my $dbh = do_connect();
      my $sql = "INSERT INTO mytest VALUES(?,?,?,?)";
      my $sth = $dbh->prepare($sql);

      # code lifted from a site...
      my ($iR, $iC, $oWkS, $oWkC);
      foreach my $oWkS (@{$source_book->{Worksheet}}) {
          for (my $iR = $oWkS->{MinRow};
               defined $oWkS->{MaxRow} && $iR <= $oWkS->{MaxRow}; $iR++) {
              for (my $iC = $oWkS->{MinCol};
                   defined $oWkS->{MaxCol} && $iC <= $oWkS->{MaxCol}; $iC++) {
                  $oWkC = $oWkS->{Cells}[$iR][$iC];
                  print "( $iR , $iC ) =>", $oWkC->Value, "\n" if ($oWkC);
              }
          }
      }

      # $sth->execute(var1, var2, var3, var4);
      # I know the above line has to be somewhere in the foreach loop
      # but I just can't figure how :(
      cheers,
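      One way to wire the execute into that loop (a sketch; it assumes MinRow/MaxRow are defined and that each row supplies exactly the four values the placeholders expect):

      foreach my $oWkS (@{$source_book->{Worksheet}}) {
          for my $iR ($oWkS->{MinRow} .. $oWkS->{MaxRow}) {
              my @row;
              for my $iC ($oWkS->{MinCol} .. $oWkS->{MaxCol}) {
                  my $oWkC = $oWkS->{Cells}[$iR][$iC];
                  push @row, $oWkC ? $oWkC->Value : undef;  # undef -> NULL
              }
              $sth->execute(@row);  # one value per ? placeholder
          }
      }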
Re: From excel to mysql...
by jZed (Prior) on Apr 13, 2004 at 14:54 UTC
    In addition to the advice about DBD::Excel and DBD::CSV, you may also get good mileage with DBD::ODBC since Excel is an ODBC data source. I advise testing a bit with each method. If you have things like embedded carriage returns in your cells and they don't get converted correctly, switch DBDs. If your data is regular (by M$ standards), any of the DBDs should work.
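    A connection sketch (the DSN name 'jokes_xls' is an assumption; it has to be defined first in the ODBC administrator):

    use DBI;
    my $dbh = DBI->connect('dbi:ODBC:jokes_xls', '', '', { RaiseError => 1 })
        or die $DBI::errstr;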
Re: From excel to mysql...
by Anonymous Monk on Jun 08, 2017 at 08:31 UTC
    Hi, can you provide the above sample code? Thanks for the help in advance.
Re: From excel to mysql...
by Anonymous Monk on Jun 08, 2017 at 08:33 UTC
    Hi Kiat, can you provide the above code for reference? Thanks in advance, Anusha

      Please read the thread. It provides much advice on how to write the code to import the data.

        What I want is to parse the Excel file using ParseExcel and then have the perl script write the necessary SQL statements into an output file, for example INSERT statements.