http://qs1969.pair.com?node_id=362699


in reply to faster with threads?

Updates are slow in my experience; at least with Oracle, loading data row by row with individual INSERT statements is very slow.

If I were building your system, there are a couple of things I would do:

  1. Research which approach you think is best.
  2. Develop prototypes for all the appropriate options.
  3. Run the prototypes against a limited data set.
  4. Extrapolate the timings to the full data set.
  5. Develop the winner further.
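Step 3 above can be sketched with Perl's core Benchmark module: run each candidate against a limited data set and compare the timings before committing to one design. The two subs here are hypothetical placeholder prototypes, not real loaders.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# A limited, synthetic data set for the trial run.
my @rows = map { [ $_, $_ * 2, $_ * 3 ] } 1 .. 1000;

sub proto_concat {    # prototype A: build output by repeated .=
    my $out = '';
    $out .= join(',', @$_) . "\n" for @rows;
    return $out;
}

sub proto_map {       # prototype B: build output in one map/join pass
    return join '', map { join(',', @$_) . "\n" } @rows;
}

# Run each prototype for about one CPU second and print a
# rate-comparison table, so the timings can be extrapolated.
cmpthese(-1, { concat => \&proto_concat, map_join => \&proto_map });
```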

Replies are listed 'Best First'.
Re^2: faster with threads?
by jaa (Friar) on Jun 09, 2004 at 13:03 UTC

    Use the mysql bulk insert statement format if possible.

    We found that inserting 2 million records with the traditional prepare/execute cycle was about 10 times slower than using several bulk SQL statements of about 500K each.

    my @rows = (
        [qw(a b c d)], [qw(e f g h)], [qw(i j k l)], [qw(m n o p)],
        [qw(q r s t)], [qw(u v w x)], [qw(y z a b)], [qw(c d e f)],
        [qw(h i j k)], [qw(l m n o)],
    );

    # -------------------------------------
    # slower
    # -------------------------------------
    my $sth = $db->prepare("INSERT INTO mytable VALUES(?,?,?,?)")
        or die "prepare failed: " . $db->errstr;
    for my $row (@rows) {
        $sth->execute(@$row);
    }

    # -------------------------------------
    # faster
    # -------------------------------------
    my $sql = "INSERT INTO mytable VALUES ";
    for my $row (@rows) {
        $sql .= "('" . join("','", @$row) . "'),";
    }
    substr($sql, -1) = '';    # drop the trailing comma
    $db->do($sql);
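    A middle ground (my sketch, not from the post above) keeps the multi-row VALUES form but uses placeholders, so the driver handles quoting and you avoid SQL injection from unescaped data. The table name and column count are hypothetical; `$db` is an assumed connected DBI handle.

```perl
use strict;
use warnings;

my @rows = ( [qw(a b c d)], [qw(e f g h)] );

# One "(?,?,?,?)" tuple per row, joined into a single statement.
my $tuple = '(' . join(',', ('?') x @{ $rows[0] }) . ')';
my $sql   = 'INSERT INTO mytable VALUES '
          . join(',', ($tuple) x @rows);

# With a live handle you would then run:
# my $sth = $db->prepare($sql);
# $sth->execute(map { @$_ } @rows);   # flatten rows into one bind list
print "$sql\n";
```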
      And if using a bulk insert isn't possible for some reason, first drop the indices on the table(s) involved, insert your data, then rebuild the indices. Unless the tables already contain billions of rows, rebuilding the indices is faster than modifying them millions of times.

      Abigail
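      Abigail's drop-then-rebuild advice can be sketched for MySQL via DBI. The assumptions here: a MyISAM table, where `ALTER TABLE ... DISABLE KEYS` defers non-unique index maintenance (for other engines you would DROP INDEX / CREATE INDEX instead), and `$db` is a connected DBI handle. The statements are shown as strings so the sequence is clear.

```perl
use strict;
use warnings;

my @steps = (
    'ALTER TABLE mytable DISABLE KEYS',   # stop index maintenance
    # ... run the bulk INSERTs here ...
    'ALTER TABLE mytable ENABLE KEYS',    # rebuild the indices once
);

# With a live handle: $db->do($_) for @steps;
print "$_\n" for @steps;
```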