in reply to Nested loops?
Alright, let's see if I can come up with a better explanation... The working code was written by someone else, so there are a couple of things (a lot of things) I don't quite understand, like the foreach statement that declares my $sql1 while looping over @{$sql1}. Earlier in the script, $sql1 is defined as my $sql1 = $lib_dbh->selectall_arrayref($pull1), where $pull1 is the select statement for SQLite. The second select statement ($pull2, whose result is stored in $sql2) uses two other tables to match elements and create a list of "target" sequences.
The data stored in $sql1 ($lib_dbh->selectall_arrayref($pull1)) looks like this:
table 1
55436, atcgtggtcgtgt
56875, agtcgtagtctaa
56789, tgatgcgtctatc
23698, atcgtgctcgtgt
75699, tgatgcttctatc
87226, atcgtgatcgtgt
12214, agtcgttgtctaa
etc.
The data in the second table has the same layout, but contains only the filtered target sequences.
table 2
55436, atcgtggtcgtgt
56875, agtcgtagtctaa
56789, tgatgcgtctatc
etc.
The foreach loops containing "$table1{$sql1->[1]}{$sql1->[0]}=undef;" then rearrange the tables so that the sequence comes first and the ids second. (I don't know why, but that's the way it is set up. I have to work within the constraints of the original programmer so as not to break any of the follow-on scripts.)
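To make that rearrangement concrete, here is a minimal sketch that uses a hand-built array reference as a stand-in for what selectall_arrayref returns (an assumption made so the example runs without a database). The inner loop variable is named $row here; in the original script it is also called $sql1, which simply shadows the outer $sql1 inside the loop. A plausible reason for putting the sequence first is that it turns "does this sequence exist?" into a single hash lookup, with the inner hash collecting every id that shares the sequence.

```perl
use strict;
use warnings;

# Stand-in for $lib_dbh->selectall_arrayref($pull1): a reference to an
# array of rows, where each row is itself an array reference [id, seq].
my $sql1 = [
    [ 55436, 'atcgtggtcgtgt' ],
    [ 56875, 'agtcgtagtctaa' ],
    [ 23698, 'atcgtgctcgtgt' ],
];

my %table1;
foreach my $row ( @{$sql1} ) {
    my ( $id, $seq ) = @{$row};
    # Sequence is the outer key, id the inner key; the value is unused,
    # so the inner hash acts as a set of ids.
    $table1{$seq}{$id} = undef;
}

# Looking up a sequence no longer needs a scan over all rows.
if ( exists $table1{'atcgtggtcgtgt'} ) {
    print join( ',', sort keys %{ $table1{'atcgtggtcgtgt'} } ), "\n";
}
```

Because the values are undef, only exists and keys are meaningful on these hashes; testing the value itself would always be false.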
    my $pull1 = "SELECT id, seq from table";
    my $pull2 = "SELECT id, seq from table where table.id in (...)";  # long string of nested "in" criteria
    my $sql1 = $lib_dbh->selectall_arrayref($pull1);
    my $sql2 = $lib_dbh->selectall_arrayref($pull2);

    # Each row is [id, seq]; store it as $table{seq}{id}
    foreach my $sql1 (@{$sql1}) {
        $table1{$sql1->[1]}{$sql1->[0]} = undef;
    }
    foreach my $sql2 (@{$sql2}) {
        $table2{$sql2->[1]}{$sql2->[0]} = undef;
    }

    my @bases = ('A','C','G','T');
    Label:
    foreach my $x (keys %table1) {
        if (exists $table2{$x}) {
            my $found_alt = 0;
            my @storage_array = ();
            @{$storage_array[1]} = keys %{$table1{$x}};   # ids of the original sequence
            foreach my $opt (@bases) {
                my $alt = $x;
                substr($alt, 6, 1) = $opt;                # swap the 7th base
                next if ($alt eq $x);                     # skip the unchanged sequence
                if (exists $table1{$alt}) {
                    $found_alt = 1;
                    push @{$storage_array[2]}, keys %{$table1{$alt}};
                }
            }
            next Label unless ($found_alt);
            # continues to follow-on script.
        }
    }
The output ends up being an array (with many more columns from the rest of the script) containing ids in [1] and [2].
Using the example data, I would want it to start with the sequence "atcgtggtcgtgt" from table 2 (id 55436). It would then look through table 1 to see if it exists. (It does, and always will, since table 2 is a subset.) It then takes the sequence from table 1 and substitutes each of the 4 bases (my @bases = ('A','C','G','T');) at the 7th position, which is offset 6 counting from zero (substr($alt, 6, 1) = $opt;).
It skips the augmented sequence that matches the original sequence, and then iterates over table 1 for the remaining three sequences. (atcgtgAtcgtgt, atcgtgCtcgtgt, atcgtgTtcgtgt). (I capitalized them for emphasis only.) Each time it finds a match, it stores the id to [2] in the array. [1] holds the original id.
For the example dataset, the array output would be something like:
[0]1 [1]55436 [2] 23698, 87226
[0]2 [1]56875 [2] 12214
[0]3 [1]56789 [2] 75699
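As a check on the description above, here is a self-contained sketch that runs the same lookup against the example rows. It is not the original script: find_alternates is a hypothetical helper name, the tables are built from literal rows instead of DBI calls, and the bases are lowercase only because the sample sequences are lowercase (if the real data is uppercase, the uppercase @bases from the script would be the ones to use).

```perl
use strict;
use warnings;

# Example rows from the post, as [id, seq] pairs.
my @rows1 = (
    [ 55436, 'atcgtggtcgtgt' ], [ 56875, 'agtcgtagtctaa' ],
    [ 56789, 'tgatgcgtctatc' ], [ 23698, 'atcgtgctcgtgt' ],
    [ 75699, 'tgatgcttctatc' ], [ 87226, 'atcgtgatcgtgt' ],
    [ 12214, 'agtcgttgtctaa' ],
);
my @rows2 = @rows1[ 0 .. 2 ];    # the filtered "target" subset

my ( %table1, %table2 );
$table1{ $_->[1] }{ $_->[0] } = undef for @rows1;
$table2{ $_->[1] }{ $_->[0] } = undef for @rows2;

# Assumption: lowercase, to match the case of the sample sequences.
my @bases = ( 'a', 'c', 'g', 't' );

# Hypothetical helper: for every target sequence, collect the ids of
# sequences in $all that differ only by the base at offset $pos.
sub find_alternates {
    my ( $targets, $all, $pos ) = @_;
    my @results;
    foreach my $seq ( sort keys %{$targets} ) {
        next unless exists $all->{$seq};
        my @alt_ids;
        foreach my $base (@bases) {
            my $alt = $seq;
            substr( $alt, $pos, 1 ) = $base;
            next if $alt eq $seq;    # skip the unchanged sequence
            push @alt_ids, keys %{ $all->{$alt} } if exists $all->{$alt};
        }
        next unless @alt_ids;        # same effect as "next Label unless ($found_alt)"
        push @results,
          [ [ sort keys %{ $all->{$seq} } ], [ sort { $a <=> $b } @alt_ids ] ];
    }
    return @results;
}

my @found = find_alternates( \%table2, \%table1, 6 );
print "@{ $_->[0] } => @{ $_->[1] }\n" for @found;
```

Each result pairs the original ids (column [1] in the post) with the ids of the single-base variants (column [2]). Rows come out in sorted sequence order, so the row order may differ from the listing above even though the id pairs are the same.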
Re^2: Nested loops?
by poj (Abbot) on Aug 18, 2017 at 19:39 UTC
Re^2: Nested loops?
by shmem (Chancellor) on Aug 18, 2017 at 18:58 UTC