codes error :(

Bio_student has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

I've written perl codes below. My wish that this code should give output : how many times each motif match with the sequences ($dna), but $regexp (variable contain multiple motifs) only read the last motif instead of 11 motifs (in subroutine its give 11 motifs, but when I call the function its give only 1 motif). Please help me what's wrong with my scripts. Thanks in advance.

use strict;
use warnings;

my @file_data = ();
my %motif_hash = ();
my $dna;
my $regexp;
my @positions = (); 
my $site;
my @motifs;
my $match;
my $count;

@file_data = get_file_data("promoters.txt");
$dna = extract_sequence(@file_data);
$regexp = parseMOTIF('motifs.txt');
print "Regexp: '$regexp'\n";


$match = match_positions($count);
push(@positions,$match);
print "match : $match";

sub match_positions{
    my($regexp,$dna) = @_;
    my $count= 0;
    my @positions;
    my $regexp;
    my $dna;
    while ($dna =~ /$regexp/g){
        push(@positions,pos($dna)-length($&)+1);
        ++$count;
        }
        return $count;
}

sub parseMOTIF {
    use strict;
    use warnings;
    my @motiffile = ();
    my $name;
    my $site;
    my $regexp;
    my %motif_hash = ();
    my $motiffile = $_[0];


    @motiffile = get_file_data($motiffile);

    foreach $motiffile(@motiffile){
        if($motiffile =~ /(^[A|S]\d+\s+|^[A|S]\d+b\s+)([A-Z]+)(\s+.*)$
+/){
        my $name = $1;
        my $site = $2;
        $regexp = IUB_to_regexp($site);
        $motif_hash{$name} = "$site $regexp\n";
        print "motif : $site\n";
        print "The regexp : $regexp\n";
        }     
           }    
           return $regexp;

}

sub get_file_data{
    my ($filename)=@_;
    use strict;
    use warnings;
    my @filedata=();
    
    unless(open(GET_FILE_DATA, $filename)){
        print STDERR "can't open file $filename\"\n\n";
        exit;
    }
    
    @filedata = <GET_FILE_DATA>;
    close GET_FILE_DATA;
    return @filedata;
}

sub extract_sequence {
     use strict;
     my (@file_data) = @_;
     use warnings;
     my $sequence = "";
     foreach my $line(@file_data) {
     if ($line =~ /^>/){
     next;
     }
     else {
     $sequence .= $line; 
     }
    }
    #$sequence =~ s/\s//g;
    return $sequence;
}

sub IUB_to_regexp{
    my($iub) = @_;
    my $regular_expression ="";
    my %iub2char_class = (
                A =>'A',   
                C =>'C',
                G =>'G',
                T =>'T',
                R =>'GA',
                Y =>'CT',
                M =>'AC',
                K =>'GT',
                S =>'GC',
                W =>'AT',
                B =>'CGT',
                D =>'AGT',
                H =>'ACT',
                V =>'ACG',
                N =>'ACGT',
                );

    for (my $i=0;$i<length($iub);$i++){
    $regular_expression .= $iub2char_class{substr($iub,$i,1)};
    }
    return $regular_expression;
}
[download]

Comment on codes error :( Download Code

Replies are listed 'Best First'.
Re: codes error by bart (Canon) on Dec 10, 2010 at 12:14 UTC
Here's your problem: `foreach $motiffile(@motiffile){ if($motiffile =~ /(^[A\|S]\d+\s+\|^[A\|S]\d+b\s+)([A-Z]+)(\s+.)$ +/){ my $name = $1; my $site = $2; $regexp = IUB_to_regexp($site); $motif_hash{$name} = "$site $regexp\n"; print "motif : $site\n"; print "The regexp : $regexp\n"; } } return $regexp; ...` [download] You start to loop through the array `@motiffile`, but as you get through to the end of this snippet, you return* from the sub with one result, thus leaving the loop. What you can do is collect all results, for example by using this instead of the return statement: `push @regexp, $regexp` [download] (and of course, `@regexp` must be declared higher up in the sub), and after the loop block, return a combined regexp: `return join '\|', @regexp;` [download] update Oops, I misread the code (bad indentation), you return after the loop block, instead of in it... But the end result is nearly the same, except now you return the last $regexp instead of the first. Put my push statement in the loop block.	[reply] [d/l] [select]
A reply falls below the community's threshold of quality. You may see it by logging in.
Re: codes error :( by Sinistral (Monsignor) on Dec 10, 2010 at 14:09 UTC
Humm, ACTG, I sense genetic analysis is the reason for writing this code. Whenever I see someone trying to do such things, I want to point out the existence of BioPerl. Head on over to the main BioPerl site to get modules, instruction, and information that should help.	[reply]
Re: codes error :( by Anonymous Monk on Dec 10, 2010 at 10:57 UTC
Please help me what's wrong with my scripts. Well, for starters you did not write them :)	[reply]
Re^2: codes error :( by Bio_student (Novice) on Dec 10, 2010 at 11:01 UTC
Which one?	[reply]