Re: Nested loops?
by stevieb (Canon) on Aug 17, 2017 at 19:27 UTC
If I'm understanding you correctly, exists to the rescue!
use warnings;
use strict;

my %one = (a => 1, b => 2, c => 3); # first hash
my %two = (a => 1, c => 3);         # second hash

for my $x (keys %two){
    if (exists $one{$x}){
        print "hash \$two key $x exists in hash \$one\n";
    }
}
Output:
hash $two key c exists in hash $one
hash $two key a exists in hash $one
In other words, it iterates over the second hash and checks whether each "filter" key is in the first hash. If it is, you can perform some action; otherwise the loop just moves on to the next iteration. A major benefit is that it iterates over only a single hash, the smaller one, which avoids looping over one entire hash and then over a second entire hash for every key in the first.
I'm currently trying out exists, thanks for that! In the example above, I would have to put the "filter" keys in the exists, because $x becomes the sequence that gets augmented and compared against all the other $x's. So, if a key in %two exists as a key in %one, then that key from %one becomes the target, which then gets assigned to a new variable once for each option from the base list. Those variants are then checked against the $x keys for matches.
foreach my $sql1 (@{$sql1}) {
$table1{$sql1->[1]}{$sql1->[0]}=undef; #rearranges the sql pull - large table of sequences and id's - These get used to create the alts and house the entire list that needs to be searched
}
foreach my $sql2 (@{$sql2}) {
$table2{$sql2->[1]}{$sql2->[0]}=undef; #rearranges the sql pull - "filter" table of sequences and id's - Only used to dictate which sequences get used from the large table
my @bases = ('A','C','G','T');
foreach my $x (keys %table1){
if (exists $table2 ({$x})) {
my $found_alt = 0;
foreach my $bases (@bases) {
my $alt = $x;
substr($alt, 20, 1) = $opt;
next if ($alt eq $x);
if (exists $table1{$alt}) {
$found_alt = 1;
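One detail worth noting about the tables built above: every value stored in them is undef, so membership has to be tested with exists rather than by truth or definedness. A minimal sketch, using a made-up one-entry %table rather than the real data:

```perl
use strict;
use warnings;

# Hypothetical miniature of the thread's structure: sequence => { id => undef }
my %table = ( atcgtggtcgtgt => { 55436 => undef } );

my $seq = 'atcgtggtcgtgt';

# Three ways of asking "is this id there?" against an undef value
my $by_exists  = exists  $table{$seq}{55436} ? 1 : 0;
my $by_defined = defined $table{$seq}{55436} ? 1 : 0;
my $by_truth   = $table{$seq}{55436}         ? 1 : 0;

print "exists: $by_exists, defined: $by_defined, true: $by_truth\n";
```

Only the exists test reports the entry as present; the other two see the undef value and report nothing.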
Re: Nested loops?
by kcott (Archbishop) on Aug 18, 2017 at 08:09 UTC
G'day Speed_Freak,
Firstly, there are all sorts of problems with your post:
- No sample input data.
- No expected output.
- Incomplete code: it won't run; we can't test.
- Use of identically named variables for different things, e.g. $sql1.
- Declaration and initialisation of variables that are not subsequently used, e.g. @bases.
Your follow-up response suffers from much the same problems.
Please read "How do I post a question effectively?".
Aim to provide us with an SSCCE.
In general, you should just create a single hash from the smaller dataset (your seq2?);
then iterate the larger dataset (your seq1?)
processing this raw data based on the single hash created.
The following code demonstrates the technique:
#!/usr/bin/env perl -l

use strict;
use warnings;

my $all_aref = [ [qw{id1 seq1}], [qw{id2 seq2}], [qw{id3 seq4}] ];
my $filter_aref = [ [qw{id1 seq1}], [qw{id3 seq3}] ];

my %filter_hash;
$filter_hash{$_->[0]}{$_->[1]} = 1 for @$filter_aref;

for (@$all_aref) {
    if (exists $filter_hash{$_->[0]}) {
        print "ID $_->[0] in filter";
        if (exists $filter_hash{$_->[0]}{$_->[1]}) {
            print "SEQ $_->[1] in filter for ID $_->[0]";
        }
        else {
            print "SEQ $_->[1] not in filter for ID $_->[0]";
        }
    }
    else {
        print "ID $_->[0] not in filter";
    }
}
Output:
ID id1 in filter
SEQ seq1 in filter for ID id1
ID id2 not in filter
ID id3 in filter
SEQ seq4 not in filter for ID id3
Note that I strongly emphasised "demonstrates the technique"
because this is not intended to be any sort of solution.
Not knowing what the input looks like, how it should be processed, or what sort of output is required,
a solution is not possible at this time!
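A side note on the nested ifs in the code above: testing the first level before the second also avoids autovivification, where a two-level exists test quietly creates the missing first-level entry. The following is only a sketch with a made-up %filter_hash; the keys id2 and id9 are placeholders chosen for illustration:

```perl
use strict;
use warnings;

my %filter_hash = ( id1 => { seq1 => 1 } );

# Two-level test without checking the first level: the lookup itself
# creates an empty $filter_hash{id2} (autovivification).
my $found    = exists $filter_hash{id2}{seq2} ? 1 : 0;
my $vivified = exists $filter_hash{id2}       ? 1 : 0;

# Guarded test, first level checked first: id9 is never created.
my $found_guarded =
    ( exists $filter_hash{id9} && exists $filter_hash{id9}{seq9} ) ? 1 : 0;
my $untouched = exists $filter_hash{id9} ? 0 : 1;

print "found=$found vivified=$vivified\n";
print "found_guarded=$found_guarded untouched=$untouched\n";
```

Neither lookup finds anything, but only the unguarded one leaves a new key behind in the hash.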
Re: Nested loops?
by Laurent_R (Canon) on Aug 18, 2017 at 06:19 UTC
It is quite difficult to understand your code because we have no idea of the contents of $sql1 and $sql2.
It would be very useful if you provided sample input data for those, much in the way stevieb did in his answer.
Re: Nested loops?
by chacham (Prior) on Aug 18, 2017 at 13:30 UTC
Why are you doing a SQL task in Perl? Just let the database do the whole thing for you. It's faster, and will save on network traffic.
Please post the queries so we can have a look at combining them.
Re: Nested loops?
by zakame (Pilgrim) on Aug 23, 2017 at 17:24 UTC
I only skimmed through this, but for that first foreach, you should have something like a %seen hash in place just before entering the loop, keyed on id/sequence, so you can skip alternates in the loop (next if $seen{$sequence};).
Also, I'm probably wrong, but the way you describe your process sounds like a gather/take from Perl6. Here's a Perl5-ish implementation using Syntax::Keyword::Gather:
use Syntax::Keyword::Gather;

my @sequences = ( ... );
my @filters = ( ... );

my @primary_and_filtered = gather {
    my %seen;
    for my $seq (@sequences) {
        take $seq unless $seen{$seq};
        take map { $_->($seq) } @filters;
        $seen{$seq}++;
    }
};
Note that @filters doesn't correspond to your described filter list of sequences; rather, it is a list of filter functions (e.g. another permute, or a more specific search, etc.) to evaluate your original primary sequence against.
Thanks for the response! I was looking into how %seen works and realized where my code was wrong.
I added a line above the loop: my %table2a = keys %table2 and then changed my if statement: if (exists ($table2a {$x}))
Now it's working just as expected. You'll also notice that the parentheses in the if statement changed as well.
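For reference, here is a small sketch of two ways a list of keys can end up in a second hash. It uses a made-up four-key %table2 rather than the real data, and the names %lookup and %paired are invented for the illustration. Note that assigning a key list directly to a hash consumes it in key/value pairs, so only about half of the keys become keys of the new hash:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the thread's %table2 (sequence => { id => undef })
my %table2 = map { $_ => {} } qw(aaa ccc ggg ttt);

# Explicit lookup hash: every key of %table2 becomes a key
my %lookup = map { $_ => 1 } keys %table2;

# Direct assignment: the key list is consumed in key/value pairs
my %paired = keys %table2;

printf "lookup has %d keys, paired has %d keys\n",
    scalar( keys %lookup ), scalar( keys %paired );
```

Assuming %table2 is keyed by sequence, as in the earlier snippet, testing exists $table2{$x} directly should also work without building a second hash at all.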
Re: Nested loops?
by Speed_Freak (Sexton) on Aug 18, 2017 at 17:14 UTC
Alright, let's see if I can come up with a better explanation... The working code was written by someone else, so there are a couple of things (a lot of things) I don't quite understand, like the foreach statement that declares my $sql1 (@{$sql1}). Earlier in the script, $sql1 is defined as my $sql1 = $lib_dbh->selectall_arrayref($pull1), where $pull1 is the select statement for SQLite. The second select statement ($sql2) uses two other tables to match elements and create a list of "target" sequences.
The data stored in the first $sql1 ($lib_dbh->selectall_arrayref($pull1)) is as such:
table 1
55436, atcgtggtcgtgt
56875, agtcgtagtctaa
56789, tgatgcgtctatc
23698, atcgtgctcgtgt
75699, tgatgcttctatc
87226, atcgtgatcgtgt
12214, agtcgttgtctaa
etc.
The data in the second table would be the same, except with only the filtered target sequences.
table2
55436, atcgtggtcgtgt
56875, agtcgtagtctaa
56789, tgatgcgtctatc
etc.
The foreach loops containing "$table1{$sql1->[1]}{$sql1->[0]}=undef;" then rearrange the tables to have the sequence first and the id's second. (I don't know why, but that's the way it is set up. I have to work within the constraints of the original programmer so as not to break any of the follow-on scripts.)
The output ends up being an array (with many more columns from the rest of the script) containing id's in [1] and [2].
Using the example data, I would want it to start with the sequence "atcgtggtcgtgt" from table 2. (id 55436.) It would then look through table 1 to see if it exists. (It does, and always will since it is a subset.) It then takes the sequence from table 1 and augments it with the 4 bases (my @bases = ('A','C','G','T');) at the 7th position (substr($alt, 6, 1) = $opt;).
It skips the augmented sequence that matches the original sequence, and then iterates over table 1 for the remaining three sequences. (atcgtgAtcgtgt, atcgtgCtcgtgt, atcgtgTtcgtgt). (I capitalized them for emphasis only.) Each time it finds a match, it stores the id to [2] in the array. [1] holds the original id.
For the example dataset, the array output would be something like:
[0]1 [1]55436 [2] 23698, 87226
[0]2 [1]56875 [2] 12214
[0]3 [1]56789 [2] 75699
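The process described above can be sketched roughly as follows. This is only an illustration built on the sample rows from this post, not the original script: the row layout ([id, sequence] pairs), the lowercase bases (chosen to match the lowercase sample sequences) and the names @output and @alt_ids are all assumptions.

```perl
use strict;
use warnings;

# Sample rows from the post as [id, sequence] pairs (assumed layout).
my $sql1 = [
    [55436, 'atcgtggtcgtgt'], [56875, 'agtcgtagtctaa'],
    [56789, 'tgatgcgtctatc'], [23698, 'atcgtgctcgtgt'],
    [75699, 'tgatgcttctatc'], [87226, 'atcgtgatcgtgt'],
    [12214, 'agtcgttgtctaa'],
];
my $sql2 = [ @{$sql1}[0 .. 2] ];    # the three "filter" rows

# Same rearrangement as in the thread: sequence => { id => undef }
my %table1;
$table1{ $_->[1] }{ $_->[0] } = undef for @$sql1;

my @bases = qw(a c g t);    # lowercase to match the sample data (assumption)
my $pos   = 6;              # 7th position, zero-based

my @output;
for my $row (@$sql2) {      # walk the filter rows in their original order
    my ($id, $seq) = @$row;
    next unless exists $table1{$seq};

    my @alt_ids;
    for my $base (@bases) {
        my $alt = $seq;
        substr($alt, $pos, 1) = $base;
        next if $alt eq $seq;    # skip the unchanged sequence
        push @alt_ids, keys %{ $table1{$alt} } if exists $table1{$alt};
    }
    push @output,
        [ scalar(@output) + 1, $id, join(', ', sort { $a <=> $b } @alt_ids) ];
}

print join(' ', @$_), "\n" for @output;
```

Each filter sequence costs at most four hash lookups, so the large table is never scanned in a nested loop.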
#!/usr/bin/perl
use strict;
use DBI;
use Data::Dumper;
my $n = 6;
unlink 'mytestdb.sqlite' if -e 'mytestdb.sqlite';
my $dbh = DBI->connect("dbi:SQLite:dbname=mytestdb.sqlite","","");
test_setup();
my $sql2 = "
SELECT id,seq,substr(seq,1,$n),substr(seq,-$n)
FROM testtable
WHERE id IN ('55436','56875','56789')";
my $ar = $dbh->selectall_arrayref($sql2);
my $sql3 = "
SELECT id
FROM testtable
WHERE substr(seq,1,$n) = ?
AND substr(seq,-$n) = ?
AND id != ?";
my $sth3 = $dbh->prepare($sql3);
my @output = ();
my $i=0;
for my $rec (@$ar){
    $sth3->execute($rec->[2],$rec->[3],$rec->[0]);
    my $others = join ',',
        map { $_->[0] }
        @{ $sth3->fetchall_arrayref() };
    push @output,[++$i,$rec->[0],$others];
}
print Dumper \@output;

sub test_setup {
    $dbh->do('CREATE TABLE testtable (id,seq)');
    my $sth = $dbh->prepare('INSERT INTO testtable VALUES (?,?)');
    while (<DATA>){
        chomp;
        my @f = split ", ",$_;
        $sth->execute(@f);
    }
}
__DATA__
55436, atcgtggtcgtgt
56875, agtcgtagtctaa
56789, tgatgcgtctatc
23698, atcgtgctcgtgt
75699, tgatgcttctatc
87226, atcgtgatcgtgt
12214, agtcgttgtctaa
poj
Again, you post a snippet which doesn't compile, and which, if it did, wouldn't help me to help you, since it depends on a data source unavailable to me.
The data initialization stuff isn't interesting, so you could just skip that and provide a representative subset of the anonymous hashes $sql1 and $sql2 (since that is what is relevant here), ideally in the format that Data::Dumper or related modules produce. Also, the foreach loop labeled with Label isn't finished, and there's no code which does the transformation of @storage_array into the desired output you posted.
So, again, I have to guess. Why do you provide the information needed to help you only piecemeal? See I know what I mean. Why don't you?
perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'