Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I have an array containing pairs of numbers. I then have three other arrays containing the original numbers. I simply want to find out which array each half of the pair came from. I have this code but wondered if there was a better way of doing it?

Hope you can help!

# @gene_pairs contains the pairs of numbers e.g. 1304 1509
# @rep1_pids, @rep2_pids and @rep3_pids are where the numbers came from.
foreach my $pair (@gene_pairs) {
    my @pair = $pair;
    my $p = join ('', @pair);
    @pair = split (/\t/, $pair);

    for (my $i=0; $i<@rep1_pids; $i++) {
        if ($pair[0] == $rep1_pids[$i]) {
            push @locations, "rep1 ";
            push @new_gp, "$rep1_pids[$i] ";
        }

        if ($pair[1] == $rep1_pids[$i]) {
            push @locations, "rep1 ";
            push @new_gp, "$rep1_pids[$i] ";
        }
    }
    for (my $i=0; $i<@rep2_pids; $i++) {
        if ($pair[0] == $rep2_pids[$i]) {
            push @locations, "rep2 ";
            push @new_gp, "$rep2_pids[$i] ";
        }
        if ($pair[1] == $rep2_pids[$i]) {
            push @locations, "rep2 ";
            push @new_gp, "$rep2_pids[$i] ";
        }
    }
    for (my $i=0; $i<@rep3_pids; $i++) {
        if ($pair[0] == $rep3_pids[$i]) {
            push @locations, "rep3 ";
            push @new_gp, "$rep3_pids[$i] ";
        }
        if ($pair[1] == $rep3_pids[$i]) {
            push @locations, "rep3 ";
            push @new_gp, "$rep3_pids[$i] ";
        }
    }
}
print "@new_gp\n";

my (@paired_genes1, @paired_genes2);
my @pairs2 = @new_gp;
print "@pairs2\n";
while (my ($one, $two) = splice(@pairs2,0,2)) {
    #print "$one ---- $two\n";
    push @paired_genes1, "$one ";
    push @paired_genes2, "$two ";
}

Replies are listed 'Best First'.
Re: extracting information from arrays
by Roy Johnson (Monsignor) on Mar 11, 2005 at 12:29 UTC
    So you have three arrays, each of which is a unique list of numbers, and you want to construct @locations as a list of where they came from, in order. You also construct @new_gp, which looks like it should be the pairs list flattened out.

    When you need to look up a mapping, like a number mapped to its source, use a hash.

    use strict;
    use warnings;

    # Just generating data here.
    my @rep1 = (1..5);
    my @rep2 = (6..10);
    my @rep3 = (11..15);
    my @pairs = map {sprintf "%d %d", int(rand(15)+1), int(rand(15)+1)} 1..10;

    # Here's the magic, fairly information-dense.
    # Make a list of name-array associations, pass it to map
    # Then go through the members of the array and associate each member
    # with the name.
    my %sources = map {
        my ($name, $ref) = @$_;
        map {($_ => $name)} @$ref;
    } ([rep1 => \@rep1], [rep2 => \@rep2], [rep3 => \@rep3]);

    # Now, when you have a number $n1, its source is $sources{$n1}
    my (@locations, @new_gp);
    for (@pairs) {
        print "$_: ";
        my ($n1, $n2) = split;
        push @new_gp, $n1, $n2;
        push @locations, @sources{$n1, $n2};
        print "$n1 came from $sources{$n1}; $n2 came from $sources{$n2}\n";
    }

    Caution: Contents may have been coded under pressure.
Re: extracting information from arrays
by reneeb (Chaplain) on Mar 11, 2005 at 12:22 UTC
    I hope this meets your needs:
    #! /usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    # your pairs
    my @pairs = ([1,2],[3,4],[5,6]);

    # sources
    my @array1 = (1,5);
    my @array2 = (3,6);
    my @array3 = (2,4);

    my @locations;
    for my $pair(@pairs){
        my @loc_array;
        for my $item(@$pair){
            my $location;
            if(grep{$_ == $item}@array1){
                $location = "array1";
            }
            elsif(grep{$_ == $item}@array2){
                $location = "array2";
            }
            else{
                $location = "array3";
            }
            push(@loc_array,$location);
        }
        push(@locations,\@loc_array);
    }
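    One thing to watch in the grep approach above: the final else branch reports "array3" for any item that is in neither of the first two arrays, even if it is not in the third either. A minimal sketch of one way to make the "not found" case explicit follows. The helper name find_source and the 'not found' label are hypothetical, not from the thread, and the // operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the name of the first
# source whose values contain $item, or undef when no source contains it.
sub find_source {
    my ($item, $sources) = @_;    # $sources: array ref of [name => \@values]
    for my $entry (@$sources) {
        my ($name, $values) = @$entry;
        return $name if grep { $_ == $item } @$values;
    }
    return;
}

# Same sample sources as in the reply above.
my @sources = (
    [ array1 => [1, 5] ],
    [ array2 => [3, 6] ],
    [ array3 => [2, 4] ],
);

# The last pair contains 7, which is in none of the sources.
for my $pair ([1, 2], [3, 4], [5, 7]) {
    my @loc = map { find_source($_, \@sources) // 'not found' } @$pair;
    print "@$pair => @loc\n";
}
```

    This still scans every array for every item, so for large inputs the hash-based lookups suggested in the other replies are likely the better fit.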
Re: extracting information from arrays
by graff (Chancellor) on Mar 13, 2005 at 04:37 UTC
    The way I'd probably do this is to organize the three source arrays into a hash structure (HoH: hash of hashes), where the top-level hash keys are the names of the three sources, and the lower level hash keys are the values in each source array. That way, looking up the source for a given value is just a matter of checking for the existence of a given hash key.

    Starting with sample data like what was suggested in an earlier reply:

    # some sample data:
    my @rep1_pids = (1..10);
    my @rep2_pids = (11..20);
    my @rep3_pids = (21..30);
    my @pairs = ( "11\t5", "22\t14", "4\t8", "1\t29" );

    # organize sources into HoH (%arrayhash):
    # create %hashbuilder first, to assign source names to arrays
    my %hashbuilder = (
        rep1 => \@rep1_pids,
        rep2 => \@rep2_pids,
        rep3 => \@rep3_pids,
    );

    # for each source, array names will be the primary hash keys,
    # and the array values will become keys of the secondary hashes:
    my %arrayhash;
    for my $src ( keys %hashbuilder ) {
        $arrayhash{$src}{$_} = 1 for ( @{$hashbuilder{$src}} );
    }

    # now identify sources for the members (items) in each pair:
    for my $pairstr ( @pairs ) {
        my $srcs = '';
        for my $item ( split( /\t/, $pairstr )) {
            for my $src ( sort keys %arrayhash ) {
                if ( $arrayhash{$src}{$item} ) {
                    $srcs .= "$item is from $src; ";
                }
            }
        }
        print $srcs, "\n";
    }
    It wasn't clear from the OP if this produces the format you want, but it should be easy to change it to your preference.

    Actually, there was a lot in the OP code that looked pretty strange and unclear. For instance, if the array of pairs is a set of strings like "1304\t1509", what do you think you're accomplishing with the first couple of lines of the "foreach my $pair" loop? And what's the purpose of the "@locations" array? (You never seem to use it after pushing stuff into it.) And so on.

    Let me suggest that when you write code, you start by documenting the plan for what the code is supposed to do: what its inputs and outputs should be, and (in terms understandable to an intelligent person) what the steps are that transform the input into the output. Then write the code to satisfy that description. If you have trouble writing the code, consider changing the description and trying again; if it seems too complicated to describe effectively, break it up into chunks that are easier to describe separately (those chunks become subroutines or modules).

    My point is, if you can't describe the steps your code is supposed to be going through, it's likely to end up being a mess.
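    As a rough illustration of that plan-first habit, here is a minimal sketch in which the plan is written as comments before the code that satisfies it. The subroutine name sources_for_pair, the %source_of hash and the 'unknown' label are hypothetical choices for this example, and the // operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Plan (hypothetical, written before any code):
#   Input : a pair string such as "11\t5", plus a hash ref that maps each
#           number to the name of the array it came from.
#   Output: a list with one source name per number, in the pair's order;
#           'unknown' for a number that is in none of the arrays.
#   Steps : 1. split the pair string on tabs
#           2. look each number up in the hash
sub sources_for_pair {
    my ($pair_str, $source_of) = @_;
    my @numbers = split /\t/, $pair_str;                     # step 1
    return map { $source_of->{$_} // 'unknown' } @numbers;   # step 2
}

# Sample data in the same ranges as the reply above.
my %source_of = (
    (map { ($_ => 'rep1') } 1 .. 10),
    (map { ($_ => 'rep2') } 11 .. 20),
    (map { ($_ => 'rep3') } 21 .. 30),
);

print join(', ', sources_for_pair("11\t5", \%source_of)), "\n";
```

    If the description had been hard to write, that would have been the cue to split the job into smaller subroutines before writing any loops.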