G'day Sofie,

Welcome to the Monastery.

"I am trying to check if an input DNA sequence only contains nucleotides."

That's a good start: you've succinctly stated your main goal.

"And if it doesn't I want to print out the position in the sequence where an invalid character was entered."

Excellent: you have a subtask; also succinctly stated.

"From title: Find element in array"

In my opinion, this is where you started to go wrong. You decided that you needed to split the entire sequence into individual characters and assign those to an array, then go back and iterate over the entire array checking each individual character. DNA sequences can be exceptionally long (you may be well aware of this), and doing all this extra work is completely unnecessary for your stated goals.

Here's a script that does what you want. I've had to make some guesses about the output as you didn't specify that.

    #!/usr/bin/env perl

    use strict;
    use warnings;

    my $DNA = <STDIN>;
    chomp($DNA);

    my $lengthseq = length $DNA;
    print "The length of the sequence is: $lengthseq\n";

    my (@nucleotideDNA, @nonvalid);

    for my $pos (0 .. $lengthseq - 1) {
        my $nucleotide = substr $DNA, $pos, 1;

        if ($nucleotide =~ /^[ACGT]$/) {
            push @nucleotideDNA, $pos+1 . ":\t$nucleotide";
        }
        else {
            push @nonvalid, $pos+1 . ":\t$nucleotide";
        }
    }

    print "*** nucleotideDNA ***\n";
    print "$_\n" for @nucleotideDNA;

    print "*** nonvalid ***\n";
    print "$_\n" for @nonvalid;

Here's a sample run:

    $ ./pm_11113020_parse_dna.pl
    XACGTYTGCAZ
    The length of the sequence is: 11
    *** nucleotideDNA ***
    2:      A
    3:      C
    4:      G
    5:      T
    7:      T
    8:      G
    9:      C
    10:     A
    *** nonvalid ***
    1:      X
    6:      Y
    11:     Z
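
If only the invalid positions matter, the same "no array needed" idea could be taken a step further by letting the regex engine walk the string. What follows is only a rough sketch, not part of the script above: the helper name invalid_positions and the hard-coded sample sequence are assumptions made for illustration. It relies on pos(), which returns the offset just past the last /g match; for a one-character match, that offset equals the 1-based position of the character.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Hypothetical helper (not from the original script): returns the
# 1-based positions of all characters that are not A, C, G or T.
sub invalid_positions {
    my ($seq) = @_;
    my @positions;

    # In scalar context, each /g match resumes where the last one ended.
    while ($seq =~ /[^ACGT]/g) {
        # pos() is the 0-based offset just past the one-character match,
        # i.e. the 1-based position of the offending character.
        push @positions, pos($seq);
    }

    return @positions;
}

# Sample sequence assumed for illustration (same as the run above).
my @bad = invalid_positions('XACGTYTGCAZ');

if (@bad) {
    print "Invalid characters at positions: @bad\n";
}
else {
    print "Sequence contains only nucleotides\n";
}
```

For the sample sequence this reports positions 1, 6 and 11, matching the nonvalid list in the run above. Like the script, it treats only uppercase A, C, G and T as valid.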

You may have noticed that I've structured my code in a similar way to yours. Let's look at the differences.

"... I am very new to perl ..."

That's fine, we all started knowing nothing about Perl. Note that Perl is the language and perl is the program.

I recommend you read through "perlintro" and bookmark that page. There's no need to try and learn it all in one sitting; just get a general feel for what it has to offer. It is peppered with links to FAQs, tutorials and more detailed information. Refer back to it whenever the need arises.

Finally, in case you had some genuine, but unstated, reason to use an array, you could have iterated it like this:

for my $pos (0 .. $#DNA) { ... }

Then accessed each element with $DNA[$pos] and reported the position with $pos+1 as I did.

Using the range operator (..) is a standard way to do this: see "perlop: Range Operators" for details.
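
Put together, the array-based loop might look something like the sketch below. The split call and the sample sequence are assumptions for illustration; they stand in for however the @DNA array was actually populated.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

# Assumed setup: one character of the sequence per array element.
my @DNA = split //, 'XACGTYTGCAZ';

my @nonvalid;

# $#DNA is the last index of @DNA, so the range covers every element.
for my $pos (0 .. $#DNA) {
    next if $DNA[$pos] =~ /^[ACGT]$/;

    # Report 1-based positions, as in the script above.
    push @nonvalid, ($pos + 1) . ":\t$DNA[$pos]";
}

print "$_\n" for @nonvalid;
```

This prints the same three nonvalid entries as the sample run, but it pays for building the array first.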

I don't think that's what you wanted, or needed, here; but at least you now know how to do it, should a more appropriate scenario arise at some other time.

— Ken


In reply to Re: Find element in array by kcott
in thread Find element in array by Sofie
