Rashmun has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I have a small question about regexes.
    $string = "atccatccctttaat";
    @triplets0 = $line =~ /(...)/g;
Now @triplets0 contains the elements: atc cat ccc ttt aat. But suppose I omit the global (g) modifier:
    $string = "atccatccctttaat";
    @triplets0 = $line =~ /(...)/;
Now, I was expecting @triplets0 to consist of one element containing atc, but in fact it consists of one element containing 1. Could some kindly monk please explain why this is the case? -Rashmun

Replies are listed 'Best First'.
Re: regex question
by izut (Chaplain) on Feb 07, 2006 at 12:39 UTC
    Here it works as expected. Well, I had to change $line to $string, but it displays 'atc' when I print @triplets0. Which version of Perl are you using?

    Igor 'izut' Sutton
    your code, your rules.

      My apologies for the trouble. It works for me as well now. -Rashmun
      Well, it should show 'atc' because, without /g, the regex performs only one match, and 'atc' is the first match for this string.
      What else did you expect?

      Enjoy,
      Mickey
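A minimal sketch (not from the thread) of the point made above: in list context, a match with /g returns the captures from every match, while the same match without /g returns only the captures from the first one. The variable names @all and @first are made up for illustration.

```perl
use strict;
use warnings;

my $string = "atccatccctttaat";

# List context with /g: the capture group from every successive match.
my @all = $string =~ /(...)/g;

# List context without /g: only the captures from the first match.
my @first = $string =~ /(...)/;

print "with /g:    @all\n";      # atc cat ccc ttt aat
print "without /g: @first\n";    # atc
```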

Re: regex question
by xdg (Monsignor) on Feb 07, 2006 at 12:33 UTC

    [Retracted] Without the "g", it becomes a boolean function, returning true or false. See perlretut for details and examples.

    Update: I read it too fast. You are grouping properly, so in list context the match returns the captured text. I think you've just got a typo. How about replacing $line with $string?

    -xdg

    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re: regex question
by ikegami (Patriarch) on Feb 07, 2006 at 15:02 UTC
    When it didn't work, you were using
        $string = "atccatccctttaat";
        $triplets0 = $string =~ /(...)/;
    A match returns true/false in scalar context, so $triplets0 ends up holding 1 rather than the captured text. Fixes:
        $string = "atccatccctttaat";
        ($triplets0) = $string =~ /(...)/;
    or
        $string = "atccatccctttaat";
        @triplets0 = $string =~ /(...)/;
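As a rough illustration of the context rule described above (a sketch, not ikegami's code; $flag, $first and @list are names invented here), the same match gives a different result depending on what it is assigned to:

```perl
use strict;
use warnings;

my $string = "atccatccctttaat";

# Scalar assignment: the match runs in scalar context and yields true (1) on success.
my $flag = $string =~ /(...)/;

# Parentheses on the left force list context: the captures are returned.
my ($first) = $string =~ /(...)/;

# Array assignment is list context as well.
my @list = $string =~ /(...)/;

print "scalar context: $flag\n";     # 1
print "list via parens: $first\n";   # atc
print "list via array: @list\n";     # atc
```

The parentheses in `my ($first)` are what make the difference; dropping them silently turns the assignment back into a scalar one.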
Re: regex question
by robsv (Curate) on Feb 07, 2006 at 16:08 UTC
    Rashmun;
    Splitting a string of DNA into triplets... Out of curiosity, do you plan on translating this into protein? If not, ignore me {grin}. If so, have you looked into BioPerl?
    use Bio::Seq;
    my $seq = Bio::Seq->new(-seq        => 'atccatccctttaat',
                            -display_id => 'Sequence1',
                            -alphabet   => 'dna');
    my $prot = $seq->translate;
    print 'Translated sequence is: ', $prot->seq, "\n";

    - robsv
Re: regex question
by wulvrine (Friar) on Feb 07, 2006 at 12:48 UTC
    Similarly, after changing $line to $string I also show @triplets0 containing 1 element of "atc".
Re: regex question
by GrandFather (Saint) on Feb 07, 2006 at 19:24 UTC

    You should actually write a standalone code snippet that shows the problem. With your code as posted:

    $string = "atccatccctttaat";
    @triplets0 = $line =~ /(...)/;

    we can't tell what result might be expected because we don't know what the content of $line is. We could assume that you intended $string, but that wouldn't give the results you describe. Start with the following code and recreate the problem you saw, then post that code:

    use strict;
    use warnings;
    use Data::Dumper;

    my $string = "atccatccctttaat";
    my @triplets = $string =~ /(...)/;
    print Dumper (\@triplets);

    which prints:

    $VAR1 = [
              'atc'
            ];

    as expected
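For what it's worth, here is a hedged sketch (not GrandFather's code) of what the $line typo itself does. It assumes $line was never assigned anywhere in the original script, which the thread does not confirm. Under that assumption the match fails and returns an empty list, and with `use strict` the undeclared variable is rejected at compile time. The names $snippet and $compiled are invented for the demonstration.

```perl
use strict;
use warnings;

my @triplets0;
{
    # Assumption: $line was never assigned, so it is undef here.
    my $line;
    no warnings 'uninitialized';    # silence the expected warning for this demo
    # Matching against undef fails; in list context that yields an empty list.
    @triplets0 = $line =~ /(...)/;
}
print "elements from the undef \$line: ", scalar(@triplets0), "\n";    # 0

# Under strict, an undeclared $line does not even compile.
# The string eval inherits 'use strict' from this file.
my $snippet  = 'my $string = "atccatccctttaat"; my @t = $line =~ /(...)/; 1';
my $compiled = eval $snippet;
print "strict rejected it: $@" unless $compiled;
```

Neither outcome is "one element containing 1", which is consistent with the remark above that the posted code cannot be what produced the described result.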


    DWIM is Perl's answer to Gödel