Cagao has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I've got a pretty hefty query, searching 1.5m rows, which was taking forever in the 'sending data' phase, so I've modified things to simply insert the results into a new table (to select later with a limit), using INSERT INTO ... SELECT.

However, when watching the process list, I still see this query 'sending data' - i don't want any data sent back.

I've tried both prepare_cached and a simple 'do', which I thought would be the solution, but to no avail.

Is there any way to speed this up? apart from indices of course?

Thanks

Replies are listed 'Best First'.
Re: DBI mysql insert into select speed :-/
by perrin (Chancellor) on Jun 04, 2008 at 21:49 UTC
    The problem you're having here is that you don't really know what "sending data" means in the MySQL status field. You probably should just ignore that and look at the results of EXPLAIN instead.

    Caching the query results can be useful if you're doing paging of a very slow query, but it should be your last resort. It complicates things.

    Since this is not really Perl-related, my suggestion would be to take your query and any questions about the EXPLAIN to the MySQL users lists, which is very helpful.

      Thanks for that,

      My concern was about seeing 'sending data', which to me implies that the query has finished and mysql is now sending the results back to Perl space.

      Doing an INSERT INTO ... SELECT with a do() shouldn't send anything back, or should it?
        No, that won't send data to Perl other than whether or not the query succeeded. It's doing something else during that time, possibly copying the results into your new table.
Re: DBI mysql insert into select speed :-/
by moritz (Cardinal) on Jun 04, 2008 at 21:23 UTC
    Is there any way to speed this up? apart from indices of course?

    Some queries can be optimized even without adding indices. Impossible to tell without seeing the query, though.

    A very valuable tool is the explain command, maybe its output can inspire you.

    Do you want your script to continue working on other things during the DB query? If yes, you could take a look at threads, or perhaps there's an option for non-blocking queries in DBI.

      I've fiddled with threads before and it all ended in tears.

      Not that I want my script to do anything during the query anyway. Just want it a bit faster, it's taking 10s to finish at the minute.

        You're basically telling us that you have slow SQL, that we shouldn't tell you to get the indexes right, and that it's too slow. And you don't show us any SQL that we could help you to improve. And the problem doesn't seem to have a connection to perl at all, because it's that database that's slow, not perl.

        So what do you expect us to do? Should we say "just add the --fast option to the mysql server"?

        Sorry, there is no silver bullet to speed things up (unless buying better hardware, of course).

Re: DBI mysql insert into select speed :-/
by bilfurd (Hermit) on Jun 04, 2008 at 21:43 UTC
    Four things come to mind:

    First, depending on the MySQL host, you might be able to squeeze performance out of the machine by splitting the data sources. Let me explain:

    If the data structure is complex, combine some of the complex SELECTs into temp tables. Add indices and try to run the process again.

    Second, assuming the tables are indexed, etc., try changing your SQL statement. If your conditions are all located in the WHERE clause, try creating sub-queries to do the same thing. If you do have sub queries, reverse the process and try the conditions in the WHERE clause.

    Third, make sure that you are not pulling from multiple views. Views are great for compartmentalizing code, but have to be completely processed before any actions are taken against the records.

    The fourth thing is completely irrelevant, so forget I mentioned it.

    Granted, these are not Perl fixes; however, my biggest successes came from hitting these three things first.

    Good luck!

Re: DBI mysql insert into select speed :-/
by igelkott (Priest) on Jun 05, 2008 at 06:40 UTC

    Indices on the new table will slow down an insert, all else being equal. Delete those (if present) and you're likely to see and improvement. If that doesn't help, it probably means that the select was the rate-limiting step (hope that phrase makes sense outside chemistry and enzymology) so you should check your select more carefully. For example, change subqueries to joins, where possible.

    Of course, the right answer is the Explain as suggested earlier but here are a few other suggestions if the results too confusing.

Re: DBI mysql insert into select speed :-/
by scorpio17 (Canon) on Jun 05, 2008 at 13:17 UTC
    The fastest way I have found to bulk load large amounts of data into mysql is this:
    LOAD DATE INFILE 'file_name' INTO TABLE 'table_name';

    But this requires you to write your data into a temporary file first.
    It may be possible to avoid this by using a FIFO (named pipe) on unix.

    Here's a link to the mysql docs:
    http://dev.mysql.com/doc/refman/5.0/en/load-data.html