in reply to an efficient program?

Hi again
Apologies for not making my problem clear. I shall try again.
The database table looks something like this
id someotherdata groupcode 1 blah wek134 2 foo htw456 3 baa twe546 4 soc wek134
In a nutshell I want to create a list that for id 1 lists all other entries without the groupcode wek134, and then for id 2 lists all other entries without the groupcode htw456 and so on
The actual analysis comes later. This is just to get the data in the correct format.
Thanks a lot

Replies are listed 'Best First'.
Re^2: an efficient program?
by dsheroh (Monsignor) on May 26, 2006 at 03:27 UTC
    Ouch. Just... ouch.

    If that's really what you want/need to do, then Errto's pretty well nailed it - you're describing a slightly-filtered Cartesian join. You'll get the least-bad performance by doing it in the database with SQL similar to what he suggested, but I'd advise you to first take a hard look at whether there is any other way to accomplish your actual goal, because, if you have a lot of groupings and you combine each record with all records in every other grouping, then you're looking at having to process billions of combinations.

Re^2: an efficient program?
by roboticus (Chancellor) on Jun 01, 2006 at 11:41 UTC
    Angharad:

    A cartesian-product like this may generate a huge (read that unreadable and unusable) report if you have many rows. If you don't have many rows, then it's not a tough problem anyway ;^).

    Perhaps an alternative report structure might be nicer. Instead of the id-based cartesian product, you might do a cartesian product of the group codes and display them as a table. Then for each cell of the table, you can have a separate cartesian product as another page in your report. Then you could glue it together with some perl, something like (untested, incomplete):

    ##### # Header page ##### print "Group vs. Group summary\n" . "-----------------------\n\n" . "Grp Grp Stat1 Stat2 ...\n" . "---- ---- ------ ------ ------\n"; # Do SQL query here to generate cross-tab for Group pairs: my $S=$DB->prepare( "SELECT a.groupcode, b.groupcode, /* compute stats */ " ."FROM my_table a, my_table b " ."WHERE a.entry_code != b.entry_code " ."GROUP BY a.groupcode, b.groupcode " ."ORDER BY a.groupcode, b.groupcode" ) or die; $S->execute or die; for (my $ar=$S->fetchrow_arrayref) { # print junk # Store the pair of group codes. (I'm assuming that the # stats are symmetric, so only push group code pairs once # so we don't get two pages of details for each pair.) # (?Syntax OK? I've never tried pushing an arrayref onto # an array before...) push @grp_pairs, \($ar[0], $ar[1]) if $ar[0] < $ar[1]; } ##### # Detail view pages ##### $S = $DB->prepare( "SELECT a.entry_id, a.some_other_col, b.entry_id, " ." b.some_other_col " ."FROM my_table a, my_table b " ."WHERE a.entry_code=? and b.entry_code=? " ."ORDER BY a.entry_id, b.entry_id" ) or die; # (?Syntax again? probably wrong and needs fixing) while (my ($grp1, $grp2) = pop @grp_pairs) { print $FormFeed . "DETAILS for $grp1 vs. $grp2\n\n" . "A.ID A.ColB B.ID B.ColB Stat1 Stat2 ...\n" . "---- ------ ---- ------ ----- ----- ------\n"; # Generate group vs. group details $S->execute($grp1, $grp2) or die; for (my $ar=$S->fetchrow_arrayref) { # print junk... } }
    --roboticus