in reply to Create the reverse complement DNA sequence without pattern matching and reverse built-in function?

Is there a way of doing this ..., using only substr function?

As has been shown, you can. But why would you?

tr *ISN'T* a regular expression. It is a table lookup constructed at compile time, designed for exactly this purpose, and it will do the job anywhere from 78x to 270x faster than the other methods offered here:

#! perl -slw
use strict;
use Benchmark qw[ cmpthese ];

our $seq = "ACGGGAGGACGGGAAAATTACTACGGCATTAGCacgggaggacgggaaaattactacggcattagc";

our %xref = (
    A => 'T', C => 'G', G => 'C', T => 'A',
    a => 't', c => 'g', g => 'c', t => 'a',
);
our %rc = (
    A => q{T}, T => q{A}, C => q{G}, G => q{C},
);

cmpthese -1, {
    tr => q[
        ( my $revcmp = reverse $seq ) =~tr[ACGTacgt][tgcaTGCA];
    ],
    davido => q[
        my $seq = $seq;
        for(my $ix = 0; $ix < length $seq; ++$ix ) {
            substr( $seq, $ix, 1, $xref{ substr( $seq, $ix, 1 ) } );
        }
        my $reverse = reverse($seq);
    ],
    atcroft => q[
        my $complement;
        my @letters = split //, $seq;
        while ( @letters ) {
            my $l = uc pop @letters;
            $complement .= $rc{$l};
        }
    ],
    hdb => q[
        my $rev='';
        my $n = length $seq;
        while( $n-- ){
            $rev .= $_ for map chr( $_ & 2 ? $_^4: $_^21 ), ord substr $seq, $n, 1
        }
    ],
}
__END__
C:\test>junk60
             Rate     hdb atcroft  davido      tr
hdb        9827/s      --    -35%    -71%   -100%
atcroft   15077/s     53%      --    -55%    -99%
davido    33822/s    244%    124%      --    -99%
tr      2679052/s  27163%  17669%   7821%      --

(I know you guys know this; but does the OP??? And if they do, why are they asking for this?)
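
For anyone who wants the tr approach outside a benchmark harness, here is a minimal sketch wrapped in a sub. The name revcomp is hypothetical and not from the thread. This version assumes you want case preserved, so it uses tr/ACGTacgt/TGCAtgca/ rather than the benchmark's table, which also swaps upper and lower case.

```perl
use strict;
use warnings;

# Hypothetical helper name; not part of the original benchmark.
sub revcomp {
    my ($seq) = @_;

    # Scalar-context reverse returns the reversed string; tr/// then
    # complements that copy in place, one character at a time.
    # Assumption: case is preserved (A->T, a->t), unlike the benchmark's table.
    ( my $rc = reverse $seq ) =~ tr/ACGTacgt/TGCAtgca/;

    return $rc;
}

print revcomp('AAGCTTgc'), "\n";    # prints gcAAGCTT
```

Characters outside ACGTacgt (N, gaps, and so on) pass through unchanged, which may or may not be what you want for real data.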


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Create the reverse complement DNA sequence without pattern matching and reverse built-in function?
by davido (Cardinal) on Feb 19, 2014 at 17:12 UTC

    I tried to alert the OP to this issue in my post:

    I would expect this to be tremendously slower than using tr/// on large data sets.

    And your benchmark confirms it; there's a tremendous difference, in favor of using tr/// to do what it was designed to do. I think your tr/// method missed reversing the string afterwards, but that won't change the fact that the tr/// approach is the way to go.


    Dave

      I tried to alert the OP to this issue in my post:

      Sorry. I missed that.

      your tr/// method missed reversing the string afterwards, but that won't change the fact that the tr/// approach is the way to go.

      Right on both counts. And it makes more of a difference than I was expecting, but it's still 80x faster than the next best:

      tr => q[
          ( my $revcmp = reverse $seq ) =~ tr[ACGTacgt][tgcaTGCA];
      ],

      C:\test>junk60
                   Rate     hdb atcroft  davido      tr
      hdb        9827/s      --    -36%    -71%   -100%
      atcroft   15292/s     56%      --    -55%    -99%
      davido    33822/s    244%    121%      --    -99%
      tr      2640400/s  26770%  17167%   7707%      --
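
On the question of reversing "afterwards": because tr/// maps each character independently of its position, reversing before or after the transliteration should give the same string. A small check, using the same table as the benchmark entry (which also flips case); the variable names are just for illustration:

```perl
use strict;
use warnings;

my $seq = 'ACGGGAggac';

# Reverse first, then transliterate the reversed copy.
( my $rev_then_tr = reverse $seq ) =~ tr[ACGTacgt][tgcaTGCA];

# Transliterate a copy first, then reverse the result.
( my $copy = $seq ) =~ tr[ACGTacgt][tgcaTGCA];
my $tr_then_rev = reverse $copy;

print $rev_then_tr eq $tr_then_rev ? "same\n" : "different\n";    # prints same
```

So the only cost of the fix is the reverse itself, which is what the second set of timings reflects.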
