
Re: Find element in array

by johngg (Canon)
on Feb 16, 2020 at 15:28 UTC


in reply to Find element in array

Does this do what you want? There is no need to split the sequence into an array, as pos will tell you where in a string a match has been made (as a zero-based offset). Note that [^ACGT] is a negated character class, i.e. it matches anything that isn't A, C, G or T. Using capturing parentheses, ( ... ), and matching globally, m{ ... }g or / ... /g, in a while loop will advance along the sequence looking for invalid letters, one match per iteration.

I am opening a file that is held inside the script (a filehandle opened on a reference to a here-document) just to keep things tidy on my system, but the code will work fine with STDIN. The code:

use 5.026;
use warnings;

open my $dnaFH, q{<}, \ <<__EOD__ or die $!;
TAAGAACAATAAGAACAAGAACAATAA
GAACAATAAGXAATAAGAAXXAACAAGAACAATAA
ACAATAAAAGAACAATAAGAA
__EOD__

while ( my $sequence = <$dnaFH> ) {
    chomp $sequence;
    my $length = length $sequence;
    say qq{Sequence: $sequence -- Length $length};
    if ( $sequence =~ m{^[ACGT]+$} ) {
        say q{ Sequence is GOOD!};
    }
    else {
        my @badPosns;
        push @badPosns, pos $sequence
            while $sequence =~ m{(?x) (?= ( [^ACGT] ) )}g;
        my $nBad = scalar @badPosns;
        my $perc = sprintf q{%.2f}, $nBad / $length * 100;
        say qq{ Sequence is BAD at @badPosns};
        say qq{ $nBad bad positions, $perc\% of total};
    }
}

close $dnaFH or die $!;

The output.

Sequence: TAAGAACAATAAGAACAAGAACAATAA -- Length 27
 Sequence is GOOD!
Sequence: GAACAATAAGXAATAAGAAXXAACAAGAACAATAA -- Length 35
 Sequence is BAD at 10 19 20
 3 bad positions, 8.57% of total
Sequence: ACAATAAAAGAACAATAAGAA -- Length 21
 Sequence is GOOD!

I hope this is helpful. Please ask further if you need more help.

Update: There was a mistake in the code: I should have used a look-ahead assertion, as without it pos gives the position just after the match, not that of the match itself. I also added extended syntax ((?x)) to make the regex clearer. My bad :-(
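
To illustrate the difference described in the update, here is a minimal sketch comparing a consuming match with a zero-width look-ahead on the second test sequence. The subroutine names positions_after_match and positions_of_match are made up for this illustration and are not part of the code above.

```perl
use strict;
use warnings;

my $sequence = 'GAACAATAAGXAATAAGAAXXAACAAGAACAATAA';

# Hypothetical helper: consuming match. Each bad letter is consumed,
# so pos() reports the offset just AFTER it.
sub positions_after_match {
    my ($seq) = @_;
    my @posns;
    push @posns, pos $seq while $seq =~ m{[^ACGT]}g;
    return @posns;
}

# Hypothetical helper: zero-width look-ahead. Nothing is consumed,
# so pos() reports the offset OF the bad letter itself.
sub positions_of_match {
    my ($seq) = @_;
    my @posns;
    push @posns, pos $seq while $seq =~ m{(?=[^ACGT])}g;
    return @posns;
}

print "after match: @{[ positions_after_match($sequence) ]}\n";   # 11 20 21
print "of match:    @{[ positions_of_match($sequence) ]}\n";      # 10 19 20
```

Perl does not allow a zero-length match to succeed twice at the same position during a global match, so the look-ahead version still steps through every bad letter rather than looping forever.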

Update 2: I should also have corrected the output, now done.

Cheers,

JohnGG
