Re: Create the reverse complement DNA sequence without pattern matching and reverse built-in function?

Is there a way of doing this ..., using only substr function?

As has been shown, you can. But why would you?

tr *ISN'T* a regular expression. It is a table lookup constructed at compile time, designed for exactly this purpose, and it will do the job anywhere from ~~100x to 400x~~ 78x to 270x faster than the other methods offered here:

#! perl -slw
use strict;
use Benchmark qw[ cmpthese ];

our $seq = "ACGGGAGGACGGGAAAATTACTACGGCATTAGCacgggaggacgggaaaattactacggcattagc";

our %xref = (
  A => 'T', C => 'G', G => 'C', T => 'A',
  a => 't', c => 'g', g => 'c', t => 'a',
);
our %rc = ( A => q{T}, T => q{A}, C => q{G}, G => q{C}, );

cmpthese -1, {
    tr => q[
        # Copy the reversed string, then complement in place; the table keeps case (A->T, a->t).
        ( my $revcmp = reverse $seq ) =~ tr[ACGTacgt][TGCAtgca];
    ],
    davido => q[
        my $seq = $seq;
        for(my $ix = 0; $ix < length $seq; ++$ix ) {
            substr( $seq, $ix, 1, $xref{ substr( $seq, $ix, 1 ) } );
        }
        my $reverse = reverse($seq);
    ],
    atcroft => q[
        my $complement;
        my @letters = split //, $seq;
        while ( @letters ) {
            my $l = uc pop @letters;
            $complement .= $rc{$l};
        }
    ],
    hdb => q[
        my $rev='';
        my $n = length $seq;
        while( $n-- ){
            $rev .= $_ for map chr( $_ & 2 ? $_^4: $_^21 ), ord substr $seq, $n, 1
        }
    ],
}

__END__
C:\test>junk60
             Rate     hdb atcroft  davido      tr
hdb        9827/s      --    -35%    -71%   -100%
atcroft   15077/s     53%      --    -55%    -99%
davido    33822/s    244%    124%      --    -99%
tr      2679052/s  27163%  17669%   7821%      --

(I know you guys know this; but does the OP??? And if they do, why are they asking for this?)
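For anyone who wants to check that the fast version and a substr-only version really produce the same answer, here is a minimal sketch. The sub names `revcomp_tr` and `revcomp_substr` and the short test sequence are made up for illustration and do not come from anyone's post. It assumes a case-preserving table (`tr/ACGTacgt/TGCAtgca/`) and that anything outside ACGT, such as N, should pass through unchanged.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reverse complement via reverse + tr///.
sub revcomp_tr {
    my ($seq) = @_;
    # The scalar assignment puts reverse in scalar context, so it reverses
    # the characters; tr/// then complements the copy in place.
    ( my $rc = reverse $seq ) =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Hypothetical helper: the same result using only substr and a hash lookup,
# in the spirit of the original question (no reverse, no tr, no regex).
sub revcomp_substr {
    my ($seq) = @_;
    my %pair = (
        A => 'T', C => 'G', G => 'C', T => 'A',
        a => 't', c => 'g', g => 'c', t => 'a',
    );
    my $rc = '';
    # Walk the string from the last character to the first.
    for ( my $i = length($seq) - 1; $i >= 0; --$i ) {
        my $base = substr( $seq, $i, 1 );
        # Characters without a complement (e.g. N) are copied unchanged,
        # which matches what tr/// does with characters not in its list.
        $rc .= exists $pair{$base} ? $pair{$base} : $base;
    }
    return $rc;
}

my $seq = 'ACGGGAGGacgggaNN';    # made-up sample, not the benchmark sequence
print revcomp_tr($seq), "\n";
print revcomp_tr($seq) eq revcomp_substr($seq) ? "same\n" : "differ\n";
```

Both subs work on a copy and leave the caller's string alone. The only point of the second one is to show that the substr-only route gets to the same place; the benchmark above is the reason not to take it.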

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Create the reverse complement DNA sequence without pattern matching and reverse built-in function? by davido (Cardinal) on Feb 19, 2014 at 17:12 UTC
I tried to alert the OP to this issue in my post: "I would expect this to be tremendously slower than using `tr///` on large data sets." And your benchmark confirms it; there's a tremendous difference in favor of using `tr///` to do what it was designed to do.

I think your `tr///` method missed reversing the string afterwards, but that won't change the fact that the `tr///` approach is the way to go.

Dave

Re^3: Create the reverse complement DNA sequence without pattern matching and reverse built-in function? by BrowserUk (Patriarch) on Feb 19, 2014 at 19:07 UTC
"I tried to alert the OP to this issue in my post:"

Sorry. I missed that.

"your tr/// method missed reversing the string afterwards, but that won't change the fact that the tr/// approach is the way to go."

Right on both counts. And it makes more of a difference than I was expecting, but it's still 80x faster than the next best:

    tr => q[
        ( my $revcmp = reverse $seq ) =~ tr[ACGTacgt][TGCAtgca];
    ],

C:\test>junk60
             Rate     hdb atcroft  davido      tr
hdb        9827/s      --    -36%    -71%   -100%
atcroft   15292/s     56%      --    -55%    -99%
davido    33822/s    244%    121%      --    -99%
tr      2640400/s  26770%  17167%   7707%      --