in reply to Removing Poly-ATCG from and Array of Strings


The following should do the same as yours. I don't know whether it is faster; you could test that with the Benchmark module:
use strict;
use warnings;

my @set = qw (AAAAAT ATCGAT TTTTTG GCCCCC GTGGGG);
my $lim = 0.75;

my @sel = remove_poly( \@set, $lim );

print "BEFORE:", scalar(@set), "\n";
print "AFTER:",  scalar(@sel), "\n";

sub remove_poly {
    my ( $array, $lim ) = @_;
    my $len = length $array->[0];
    my @sel_array;
    @sel_array = grep {
        (      ( (tr/A//) / $len < $lim )
            && ( (tr/T//) / $len < $lim )
            && ( (tr/C//) / $len < $lim )
            && ( (tr/G//) / $len < $lim ) )
          and $_
    } @$array;
    return @sel_array;
}
Since this is now a lot smaller, you could even take it out of the sub.
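To act on the Benchmark suggestion, here is a rough sketch using `cmpthese` from the core Benchmark module. The original poster's code is not shown in this reply, so `remove_poly_loop` is only a hypothetical stand-in baseline (an assumption, not the original code), and `remove_poly_grep` is just a renamed copy of the grep/tr version above.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @set = qw(AAAAAT ATCGAT TTTTTG GCCCCC GTGGGG);
my $lim = 0.75;

# The grep/tr approach from the post above.
sub remove_poly_grep {
    my ($array, $lim) = @_;
    my $len = length $array->[0];    # assumes all strings have the same length
    return grep {
           (tr/A//) / $len < $lim
        && (tr/T//) / $len < $lim
        && (tr/C//) / $len < $lim
        && (tr/G//) / $len < $lim
    } @$array;
}

# Hypothetical baseline (assumption): explicit loops with a regex count.
sub remove_poly_loop {
    my ($array, $lim) = @_;
    my @keep;
    SEQ: for my $seq (@$array) {
        my $len = length $seq;
        for my $base (qw(A T C G)) {
            my $count = () = $seq =~ /$base/g;    # number of matches
            next SEQ if $count / $len >= $lim;    # too much of one base: drop it
        }
        push @keep, $seq;
    }
    return @keep;
}

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    grep_tr => sub { my @r = remove_poly_grep(\@set, $lim) },
    loop    => sub { my @r = remove_poly_loop(\@set, $lim) },
});
```

With only five short strings the numbers will not say much; a larger, more realistic data set would give a more meaningful comparison.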

.:| If it can't be fixed .. Don't break it |:.

Re^2: Removing Poly-ATCG from and Array of Strings
by aukjan (Friar) on Jun 09, 2005 at 12:59 UTC
    Another way is to use map on the original array, set all the entries you don't want to '', and filter those out later on.
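    my @set = qw (AAAAAT ATCGAT TTTTTG GCCCCC GTGGGG);
    my $lim = 0.75;
    my $len = length $set[0];

    print "BEFORE:", scalar(@set), "\n";

    # $_ is aliased to each element, so assigning '' changes @set in place
    map {
        (      ( (tr/A//) / $len < $lim )
            && ( (tr/T//) / $len < $lim )
            && ( (tr/C//) / $len < $lim )
            && ( (tr/G//) / $len < $lim ) )
          or $_ = ''
    } @set;

    # still 5 here: the unwanted entries are now '' but have not been removed
    print "AFTER:", scalar(@set), "\n";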
    Now the @set contains:
    $VAR1 = '';
    $VAR2 = 'ATCGAT';
    $VAR3 = '';
    $VAR4 = '';
    $VAR5 = '';
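The "filter those out later on" step is not shown in the reply. One possible way to do it, as a minimal sketch that starts from the @set contents listed above (the name `@sel` is only illustrative):

```perl
use strict;
use warnings;

# State of @set after the map step, as listed above
my @set = ('', 'ATCGAT', '', '', '');

# Keep only the non-empty strings
my @sel = grep { length } @set;

print "AFTER:", scalar(@sel), "\n";    # prints AFTER:1
```

Using `length` as the test, instead of the plain truth value of `$_`, means a string such as '0' would not be dropped by accident.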
