Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I have to take each line from file1 and look up the IDs on that line in file2 and file3, along with their sequences.
file1:
    string11 seq33
    string123 seq334
    string12 seq32

file2:
    >string11
    AGCTAGCTG
    CAGAGTC
    >string123
    AGCTGAAGA
    >string12
    AGCGATTATCGA
    AGCATGAAGACC
    ACAGCATGACTA

file3:
    >seq32
    AGAAGCTCCTAGCT
    >seq334
    AGCTGAAGA
    AAAGCTAGA
    >seq33
    AGGATTCGA
    AAATATGA

my program:
    open(FILE1,"file1.txt");
    open(FILE2,"file2.txt")
    open(FILE3,"file3.txt")
    @ray1=<FILE2>;$join1=("",@ray1);@list1=split(">",$join1);
    @ray2=<FILE3>;$join2=("",@ray1);@list2=split(">",$join2);
    while($s=<FILE1>){
        ($one,$two)=split("\s+",$s);
        @first=grep($one,@ray1);
        @second=grep($two,@ray2);
        #<ACTION - CALLING SUBROUTINE>
        #i should have only 2 sequences here for every loop.
    }
This program does not fetch only 2 sequences at a time inside the loop, so when I try to pass just 2 sequences (@first and @second) to the subroutine, I get errors. Please give some suggestions.

Replies are listed 'Best First'.
Re: grep the whole element in an array
by perliff (Monk) on Apr 21, 2009 at 08:26 UTC
    Just install BioPerl and use the Bio::SeqIO module to read this FASTA format, like this. Note that I haven't tested it, but it should work in this fashion.
    use strict; # you do use strict don't you?
    use Bio::SeqIO;

    # your file containing the patterns
    open (INP,"data.txt") || die "cant find it!";
    while (my $line = <INP>) {
        chomp $line;
        # separate pattern from filename
        my ($filename,$pattern) = split (/\s+/,$line);
        #open the file that should be searched
        my $seqio_object = Bio::SeqIO->new(-file => $filename, -format=>"fasta");
        #loop through all the fasta sequences
        # ($seq_object is declared with my so the code compiles under strict)
        while (my $seq_object = $seqio_object->next_seq){
            my $sequence = $seq_object->seq;
            my $id = $seq_object->display_id;
            #check if pattern exists
            if ($sequence =~ /$pattern/) {
                print "$pattern found in $id in file $filename\n";
            }
        }
    }
    However, on another note, it's probably better to post BioPerl questions on the BioPerl mailing list. Google for it.

    ----------------------

    "with perl on my side"

Re: grep the whole element in an array
by targetsmart (Curate) on Apr 21, 2009 at 07:31 UTC
    Your input data seems to have newline characters, but you never did a chomp.
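
To illustrate the chomp point, here is a minimal sketch, not the original poster's program. The `@records` array is hypothetical sample data standing in for the ">"-split chunks of file2. It also shows a second likely cause of the extra results: `grep EXPR, LIST` with a plain non-empty string as EXPR tests only whether that string is true, so every element is kept. Writing the split pattern as `/\s+/` rather than `"\s+"` avoids the double-quoted string dropping the backslash.

```perl
use strict;
use warnings;

# Hypothetical sample records, standing in for the ">"-split chunks of file2.
my @records = (
    "string11\nAGCTAGCTG\nCAGAGTC\n",
    "string123\nAGCTGAAGA\n",
    "string12\nAGCGATTATCGA\n",
);

my $line = "string12 seq32\n";    # one line of file1, newline still attached
chomp $line;                      # strip the trailing newline before using the fields
my ($one, $two) = split /\s+/, $line;

# A bare true string as the grep expression keeps every element.
my @all = grep($one, @records);

# An anchored match on the header line keeps only the intended record;
# \Q...\E quotes any regex metacharacters in the ID, and the \n stops
# "string12" from also matching "string123".
my @hits = grep { /^\Q$one\E\n/ } @records;

print scalar(@all), " vs ", scalar(@hits), "\n";
```

The anchored pattern matters with IDs like these, because string12 is a prefix of string123.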

    Vivek
    -- In accordance with the prarabdha of each, the One whose function it is to ordain makes each to act. What will not happen will never happen, whatever effort one may put forth. And what will happen will not fail to happen, however much one may seek to prevent it. This is certain. The part of wisdom therefore is to stay quiet.
Re: grep the whole element in an array
by Gangabass (Vicar) on Apr 21, 2009 at 07:33 UTC

    I think you'd better parse file2 and file3 into a hash of arrays. Something like:

    my $file2 = {
        string11  => [ "AGCTAGCTG", "CAGAGTC" ],
        string123 => [ "AGCTGAAGA" ],
        string12  => [ "AGCGATTATCGA", "AGCATGAAGACC", "ACAGCATGACTA" ],
        ......
    };

    And the same for file3. After that you can easily manipulate your data.
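
The hash-of-arrays idea could be sketched roughly as follows. This is one possible reading of the suggestion, not code from the thread: the `parse_fasta` helper and the `@pairs` array are names made up for illustration, and the sample data from the question is held in strings (read through in-memory filehandles) so the sketch runs without any files. It assumes each file1 line holds exactly two whitespace-separated IDs.

```perl
use strict;
use warnings;

# Hypothetical helper: parse FASTA-style text from a filehandle
# into { id => [ sequence lines ] }.
sub parse_fasta {
    my ($fh) = @_;
    my (%seqs, $id);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^>(\S+)/) {                # header line starts a new record
            $id = $1;
            $seqs{$id} = [];
        }
        elsif (defined $id && length $line) {    # sequence line of the current record
            push @{ $seqs{$id} }, $line;
        }
    }
    return \%seqs;
}

# Sample data from the question, kept in strings so the sketch is self-contained.
my $file1 = "string11 seq33\nstring123 seq334\nstring12 seq32\n";
my $file2 = ">string11\nAGCTAGCTG\nCAGAGTC\n>string123\nAGCTGAAGA\n"
          . ">string12\nAGCGATTATCGA\nAGCATGAAGACC\nACAGCATGACTA\n";
my $file3 = ">seq32\nAGAAGCTCCTAGCT\n>seq334\nAGCTGAAGA\nAAAGCTAGA\n"
          . ">seq33\nAGGATTCGA\nAAATATGA\n";

open my $fh2, '<', \$file2 or die "Cannot open in-memory file2: $!";
open my $fh3, '<', \$file3 or die "Cannot open in-memory file3: $!";
my $seqs2 = parse_fasta($fh2);
my $seqs3 = parse_fasta($fh3);

my @pairs;    # one [ sequence from file2, sequence from file3 ] per file1 line
for my $line (split /\n/, $file1) {
    my ($one, $two) = split ' ', $line;
    next unless exists $seqs2->{$one} && exists $seqs3->{$two};
    my $first  = join '', @{ $seqs2->{$one} };    # exact-key lookup, no grep needed
    my $second = join '', @{ $seqs3->{$two} };
    push @pairs, [ $first, $second ];
    print "$one / $two: $first $second\n";
}
```

Because the lookup is by exact hash key, each pass through the loop yields exactly two sequences, and string12 can no longer be confused with string123. With real files, the in-memory handles would be replaced by ordinary `open my $fh, '<', 'file2.txt'` calls.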