Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I have a question! I have a protein string and I want to split it into groups of three letters.

 $prot = "HKTTLDSSRTTTTAABNNRFGHGHGYYH";

I want something like this: HKT TLD and so on. I tried the split function, but I couldn't get it to work. Please help me!

Replies are listed 'Best First'.
Re: split function
by johngg (Canon) on Sep 10, 2012 at 21:55 UTC

    An alternative to a regular expression solution would be to use unpack.

    $ perl -E 'say for unpack q{(A3)*}, q{HKTTLDSSRTTTTAABNNRFGHGHGYYH};'
    HKT
    TLD
    SSR
    TTT
    TAA
    BNN
    RFG
    HGH
    GYY
    H
    $

    Cheers,

    JohnGG
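
For anyone who would rather have this in a script than a one-liner, here is a minimal sketch of the same unpack idea. The names `@complete` and `@leftover` are illustrative and not from the thread; the code assumes the sequence holds only letters, since the `A` format strips trailing spaces and nulls from each field (use `a3` if that matters for your data).

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $prot = 'HKTTLDSSRTTTTAABNNRFGHGHGYYH';

# "(A3)*" repeats a 3-character field until the string is used up;
# the last field is shorter when the length is not a multiple of 3.
my @triplets = unpack '(A3)*', $prot;

# Illustrative only: separate full triplets from a trailing partial group.
my @complete = grep { length($_) == 3 } @triplets;
my @leftover = grep { length($_) <  3 } @triplets;

print "@triplets\n";
print "leftover: @leftover\n" if @leftover;
```

Explicit `length($_)` is used on purpose: a bare `length < 3` can be misparsed because `<` may start a filehandle read.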

Re: split function
by aaron_baugher (Curate) on Sep 10, 2012 at 21:56 UTC

    One thing I've learned here at PerlMonks is that unpack can be very fast at this kind of thing:

    abaugher@bannor> perl 992848.pl
                 Rate  regexit unpackit
    regexit  133333/s       --     -43%
    unpackit 234432/s      76%       --
    abaugher@bannor> cat 992848.pl
    #!/usr/bin/env perl
    use Modern::Perl;
    use Benchmark qw(:all);

    my $p = 'HKTTLDSSRTTTTAABNNRFGHGHGYYH';

    cmpthese( 1_000_000, {
        'regexit'  => \&regexit,
        'unpackit' => \&unpackit,
    });

    sub regexit {
        my @p = $p =~ /.{1,3}/g;
    }

    sub unpackit {
        my @p = unpack '(A3)*', $p;
    }

    Aaron B.
    Available for small or large Perl jobs; see my home node.

Re: split function
by frozenwithjoy (Priest) on Sep 10, 2012 at 21:09 UTC

    If you want to do something with the amino acid triplets after splitting, you can try something like this to put them into an array:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Data::Printer;

    my $prot = "HKTTLDSSRTTTTAABNNRFGHGHGYYH";
    my @triplets = $prot =~ /.{1,3}/g;
    p @triplets;

    The resulting array looks like this:

    [
        [0] "HKT",
        [1] "TLD",
        [2] "SSR",
        [3] "TTT",
        [4] "TAA",
        [5] "BNN",
        [6] "RFG",
        [7] "HGH",
        [8] "GYY",
        [9] "H"
    ]

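
Data::Printer is a CPAN module rather than part of core Perl, so it may not be installed everywhere. As a rough sketch, assuming you only want to see the index and value of each element, the same array can be listed with nothing but built-ins. The `@lines` array and the output format are illustrative choices, not Data::Printer's exact output.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $prot = "HKTTLDSSRTTTTAABNNRFGHGHGYYH";

# List-context match with /g returns every chunk of 1 to 3 characters.
my @triplets = $prot =~ /.{1,3}/g;

# Build one '[index] "value"' line per element (hypothetical format).
my @lines = map { sprintf '[%d] "%s"', $_, $triplets[$_] } 0 .. $#triplets;

print "$_\n" for @lines;
```
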
Re: split function
by hbm (Hermit) on Sep 10, 2012 at 20:38 UTC

    Here is one easy way of many:

    use feature 'say';
    $prot = "HKTTLDSSRTTTTAABNNRFGHGHGYYH";
    say for $prot =~ /.{3}/g;    # bind the match to $prot; .{3} skips a trailing partial group

    Says:

    HKT
    TLD
    SSR
    TTT
    TAA
    BNN
    RFG
    HGH
    GYY
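
Note that the output above stops at GYY: the string has 28 characters, and `.{3}` only matches complete groups of three, so the final H is dropped. The sketch below contrasts that with `.{1,3}`, and also shows one way the split function from the original question can be made to work, by capturing the "separator". The variable names are illustrative, not from the thread.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $prot = 'HKTTLDSSRTTTTAABNNRFGHGHGYYH';   # 28 characters

# Exactly three characters per match: a trailing partial group is skipped.
my @full_only = $prot =~ /.{3}/g;

# One to three characters per match: the trailing "H" is kept.
my @with_rest = $prot =~ /.{1,3}/g;

# split with a capturing group also returns the captured separators;
# the empty fields between them are filtered out with grep.
my @via_split = grep { length $_ } split /(.{3})/, $prot;

print "full only: @full_only\n";
print "with rest: @with_rest\n";
print "via split: @via_split\n";
```

Which variant is right depends on whether a leftover residue at the end should be kept or ignored.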