Perl_girl has asked for the wisdom of the Perl Monks concerning the following question:
Hi,
I'm quite new to programming and have a bit of a problem that I can't work out! Basically, I need to go through one array and pull out the entries that match another. I have tried all sorts of different match and grep scripts, but just keep going round in circles! I'm sure it's all a mess, but any help would be much appreciated!
What I have includes:
1. List of Gene names and array probe names
2. Data table of these same probe names followed by values

What I need to end up with is ideally a reproduction of the second table with a column containing the gene name for the line of data. There are multiple data entries for the gene/probe names.
My current attempt, which just prints out the data table again with an added column of tabs:
#! usr/bin/perl
use strict;
use warnings;

open (NAME, "found.txt") or die;
open (DATA, "Array_values.txt") or die "$!\n";
open (OUT, ">Array_fin.txt") or die "$!\n";

my @line = <DATA>;
my @probe_ID;
my @probe_name;
my @gene;
my $line;
my $probe_name;
my $i = (0 .. 13005); #The number of genes and probe names (whereas there are 389308 data entries, including header line)

while (<NAME>)
{
    my @col = split(/\t/,$_); #split sequence names into gene and probe
    $probe_name = $col[1];
    my $gene = $col[0];
    push (@probe_name, $probe_name);
    push (@gene, $gene);
}

foreach $line (@line) #take each line of data
{
    my @data = split(/\t/,$line); #extract probe name
    my $probe_ID = $data[0];
    # push (@probe_ID, $probe_ID);
    if ($probe_ID =~ m/$probe_name[$i]/) #match the current entry with the list of probe names
    {
        print OUT "$gene[$i]\t$line\n"; #print the relavant gene name and data
    }
    else
    {
        print OUT "no match\n";
    }
}
close
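For a task like the one described above, a common approach is to load the gene/probe list into a hash keyed by probe name and then look up each data line's first column, rather than indexing two parallel arrays. Note that in the attempt above, `$i` is assigned a range in scalar context, so it never steps through the probe list. The following is only a minimal sketch under stated assumptions: the sub names `build_probe_to_gene` and `annotate` and the sample rows are hypothetical, and it assumes tab-separated input with the gene in the first column of the name list and the probe in the first column of the data table, as in the post.

```perl
use strict;
use warnings;

# Build a probe-name => gene-name lookup from "gene<TAB>probe" lines.
# (Hypothetical helper; column order assumed from the post.)
sub build_probe_to_gene {
    my (@name_lines) = @_;
    my %gene_for;
    for my $line (@name_lines) {
        chomp(my $copy = $line);           # drop the trailing newline
        my ($gene, $probe) = split /\t/, $copy;
        next unless defined $probe;        # skip blank or malformed lines
        $gene_for{$probe} = $gene;
    }
    return \%gene_for;
}

# Prefix each data line with the gene for its probe (first column).
# Lines whose probe is not in the lookup are marked "no match".
sub annotate {
    my ($gene_for, @data_lines) = @_;
    my @out;
    for my $line (@data_lines) {
        chomp(my $copy = $line);
        my ($probe) = split /\t/, $copy;
        my $gene = (defined $probe && exists $gene_for->{$probe})
                 ? $gene_for->{$probe}
                 : 'no match';
        push @out, "$gene\t$copy";
    }
    return @out;
}

# Made-up sample rows standing in for found.txt and Array_values.txt.
my @names = ("GeneA\tprobe_1\n", "GeneB\tprobe_2\n");
my @data  = (
    "probe_1\t0.5\t1.2\n",
    "probe_2\t3.4\t0.1\n",
    "probe_1\t0.7\t1.0\n",
    "probe_9\t2.2\t2.3\n",
);

my $gene_for  = build_probe_to_gene(@names);
my @annotated = annotate($gene_for, @data);
print "$_\n" for @annotated;
```

The hash lookup is an exact string comparison, so a probe name that merely contains another one will not match it, which a regex match would allow. With real files, the name list would be read once into the hash and the data table could then be processed one line at a time instead of being slurped into an array.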
Re: Match within array
by moritz (Cardinal) on Jul 06, 2010 at 11:59 UTC

Re: Match within array
by ww (Archbishop) on Jul 06, 2010 at 12:59 UTC

Re: Match within array
by RMGir (Prior) on Jul 06, 2010 at 11:51 UTC
by moritz (Cardinal) on Jul 06, 2010 at 12:03 UTC
by RMGir (Prior) on Jul 07, 2010 at 11:39 UTC