As for me, I don't believe your benchmarking for a simple reason: databases are definitely better than Perl at crunching tabular data, provided that you use them correctly.
Darn - and I had so enjoyed the unanimity of the "yes, that's what you should expect" crowd so far.
Anyway - no matter. And no problem with the challenge. I've been using vanilla test data anyway. Here it is:
DB 0
This produces rows that look like:$num_respondents = 20000; $offset = 0; for $respondent (1 .. $num_respondents) { $sql = "INSERT INTO theTable VALUES ($respondent"; for $response (1 .. $offset) { $sql = $sql.", 0"; } $response = $offset + 1; while ($response < $num_responses+1) { $sql = $sql.", 1"; if (($num_responses - $response) > 4) { $sql = $sql.", 0, 0, 0, 0"; } else { for $response ($response+1 .. $num_responses) { $sql = $sql.", 0"; } } $response = $response + 5; } if ($respondent % 2) {$sql = $sql.", 1, 0";} else {$sql = $sql.", 0, 1";} $sql = $sql.")"; $sth = $dbh->prepare($sql) or die("Could not prepare!" . $dbh- +>errstr); print "<p>$sql\n"; $sth->execute() or die("Could not execute!" . $dbh->errstr); if ($offset < 4) {$offset++;} else {$offset = 0;} }
so that each respondent gave every fifth response and each response was given by every fifth respondent, except for the last two which represent gender - each of those was given by 1/2 the respondents and every benchmark query uses one of the gender columns.respondent_1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0... respondent_2, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0... respondent_3, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0...
DB 1 & DB 2
This produces:$sql = "INSERT INTO theTable (response, respondent) VALUES (?,?)"; $sth = $dbh->prepare($sql) or die("Could not prepare!" . $dbh->err +str); $i = 1; for $respondent (1 .. 20000) { for $r (0 .. 600) { $response = $i + ($r * 5); $sth->execute($response, $respondent) or die("Could not ex +ecute!" . $dbh->errstr); } if ($respondent % 2) { $response = 9999; $sth->execute($response, $respondent) or die("Could not ex +ecute!" . $dbh->errstr); } else { $response = 6666; $sth->execute($response, $respondent) or die("Could not ex +ecute!" . $dbh->errstr); } if ($i < 5) {$i++;} else {$i = 1;} }
so that, again, each respondent gave every fifth response and each response was given by every fifth respondent, except for the gender codes which are half-half.1, 1 6,1 11, 1 ... 2, 2 7, 2 12, 2 ... 3, 3 8, 3 13, 3 ...
DB 3
This produces tables that look like:$respondents_by_5 = 4000; $sql = "SHOW TABLES"; $sth = $dbh->prepare($sql) or die("Could not prepare!" . $dbh->err +str); $sth->execute() or die("Could not execute!" . $dbh->errstr); $j = 1; while ($table = $sth->fetchrow_array()) { if ($table ne 't6000' && $table ne 't9000') { $sql = "INSERT INTO ".$table." VALUES (?)"; $std = $dbh->prepare($sql) or die("Could not prepare! At t +able = $table because " . $dbh->errstr); for $i (0 .. $respondents_by_5) { $respondent = $j + ($i * 5); $std->execute($respondent) or die("Could not execute!" + . $dbh->errstr); } if ($j < 5) {$j++;} else {$j = 1;} } } $sql = "INSERT INTO t9999 VALUES (?)"; $stf = $dbh->prepare($sql) or die("Could not prepare! At table = $ +table because " . $dbh->errstr); $sql = "INSERT INTO t6666 VALUES (?)"; $stm = $dbh->prepare($sql) or die("Could not prepare! At table = $ +table because " . $dbh->errstr); for $respondent (1 .. 20000) { if ($respondent % 2) { $stm->execute($respondent) or die("Could not execute!" . $ +dbh->errstr); } else { $stf->execute($respondent) or die("Could not execute!" . $ +dbh->errstr); } }
one row per table entry.Table_1:1, 6, 11, 16, ..., 19996 Table_2:2, 7, 12, 17, ..., 19997 Table_3:3, 8, 13, 18, ..., 19998 Table_4:4, 9, 14, 19, ..., 19999 Table_5:5, 10, 15, 20, .., 20000 Table_6:1, 6, 11, 16, ..., 19996 etc.
DB 4
where "write_lines_to_file" is$j = 1; for $r (1 .. $num_responses) { @lines = (); $table = ">".$db_dir."/r".$r; for $i (0 .. $respondents_by_5) { $respondent = $j + ($i * 5); push (@lines, $respondent); } &write_lines_to_file(1, $table, @lines); if ($j < 5) {$j++;} else {$j = 1;} } for $respondent (1 .. 20000) { if ($respondent % 2) { push(@lines_6666, $respondent); } else { push(@lines_9999, $respondent); } } $table = ">".$db_dir."/r6666"; &write_lines_to_file(1, $table, @lines_6666); $table = ">".$db_dir."/r9999"; &write_lines_to_file(1, $table, @lines_9999);
This produces the same tables as DB3, but each in its own text file.sub write_lines_to_file { my $add_line_return = shift; my $file = shift; my @lines=@_; open(FILE, ">$file") or dienice("cannot open file : $!"); $STDOUT_Handle = select(FILE); for my $i (0 .. $#lines) { print "$lines[$i]"; if ($add_line_return) {print "\n";} } close(FILE); select($STDOUT_Handle); }
If you can see something that alters my understanding, I'm all ears.
In reply to Re: Basic Perl trumps DBI? Or my poor DB design?
by punch_card_don
in thread Basic Perl trumps DBI? Or my poor DB design?
by punch_card_don
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |