in reply to Reverse Complement

Here's a variation on rjt's approach that uses a different method to compare the reversed, complemented strings, and then flags locations of non-reverse-complementarity:

>perl -wMstrict -le "die 'Please enter exactly two sequences' unless @ARGV == 2; ;; my ($seq1, $seq2) = map uc, @ARGV; die 'Please enter only ATCG sequences' if grep /[^ATCG]/, $seq1, $seq +2; ;; die 'Sequence lengths differ' if length $seq1 != length $seq2; ;; (my $comp = $seq2) =~ tr/ATCG/TAGC/; my $mask = $seq1 ^ reverse $comp; ;; if ($mask !~ m{ [^\000] }xms) { print 'The sequences are reverse-complementary'; exit; } ;; $mask =~ tr{\000-\377}{_^}; print 'Sequences are non-reverse-complementary at these locations:'; print qq{@ARGV}; print qq{$mask }, ''.reverse $mask; " GGGGaaaaaaCatttatatat atatataaatttttATtcccc Sequences are non-reverse-complementary at these locations: GGGGaaaaaaCatttatatat atatataaatttttATtcccc ______^___^__________ __________^___^______

Replies are listed 'Best First'.
Re^2: Reverse Complement
by stamp1982 (Novice) on Jun 30, 2013 at 20:54 UTC
    Do you have to use STDIN in this case or do I have to store it a file or just run the code as it is? How did you get your outputs? I run the code and it did not allow any user input

    please two sequences as arguments on the command line. Press any key to continue

    what am I missing and why cant I run the code?

    Stamp1982

      As posted, the code is a cut-and-paste of a command line session. It is intended to show source code and invocation switches, pragmata, etc., command line input (to @ARGV), and command line output. The source code is everything between the pair of  " (double-quotes). (Note that these are the only double-quotes in the posted code section.)

      I don't know what you are doing to run the code, but try this:

      1. Download the posted code section using the  [download] link just after it.
      2. Edit out the first " and everything before it, and also the last " and everything after it. What remains should be the entire source code.
      3. Save the edited file.
      4. Invoke the edited file with the samealmost the same command line invocation used originally (Update: do not include the  -e switch as I originally wrote below!) and with the sample input you want, e.g.:
            perl -wMstrict -l  your_file.pl  GGGGaaaaaaCatttatatat  atatataaatttttATtcccc
      5. Examine output. Experiment with various different input data. Play with variations on the source code. Enjoy!

      In fact, here's the edited section of code that should be all ready to go, just use the  [download] link and save to a file of your choice:

      die 'Please enter exactly two sequences' unless @ARGV == 2; ;; my ($seq1, $seq2) = map uc, @ARGV; die 'Please enter only ATCG sequences' if grep /[^ATCG]/, $seq1, $seq2 +; ;; die 'Sequence lengths differ' if length $seq1 != length $seq2; ;; (my $comp = $seq2) =~ tr/ATCG/TAGC/; my $mask = $seq1 ^ reverse $comp; ;; if ($mask !~ m{ [^\000] }xms) { print 'The sequences are reverse-complementary'; exit; } ;; $mask =~ tr{\000-\377}{_^}; print 'Sequences are non-reverse-complementary at these locations:'; print qq{@ARGV}; print qq{$mask }, ''.reverse $mask;

      Update: I just downloaded and saved the code block immediately above in this reply and went through the steps listed above (except it doesn't have to be edited), and it works. Be sure not to use the  -e switch when invoking a file; it is used for command-line source code statements. HTH.

        So how can I get this code to read from a .txt file? Because it is asking me to enter a sequence at line 1
        Can you please explain this in your comment?

        and with the sample input you want, e.g.: perl -wMstrict -l your_file.pl GGGGaaaaaaCatttatatat atatataaatttttATtcccc

        My apologies for being so slow. At the moment this is what I have.
        perl -wMstrict -l your_file.pl GGGGaaaaaaCatttatatat atatataaattttt +ATtcccc die "Please enter two sequences as arguments on the command line\n" unless @ARGV == 2; my ($orig, $comp) = map { uc } @ARGV; die "Please enter only ATCG sequences\n" if grep { /[^ATCG]/ } $orig, +$comp; die "Sorry, the two sequences you have just entered are of different l +engths." . "\nPlease try again on the command line.\n" if length $orig != length $comp; # $comp will transform back to $orig iff it is the reverse complement $comp =~ y/ATCG/TAGC/; if ($orig eq reverse $comp) { print "Yes, the two sequences are reverse-complement of each other +.\n"; exit; } else { die "Unfortunately, the two sequences are not reverse-complement.\ +n"; }

        comments I get are : unquoted strings may clash with future reserved words and compilation error and syntax error at wMstrict -l