in reply to Re: Regex KungFu help needed
in thread Regex KungFu help needed

You can put the whole term in the look-ahead to make things a bit simpler. You could also take advantage of the $scalar = () = $string =~ m{$pattern}g; idiom, which counts all the matches in one step rather than by successive incrementing, and wrap the whole thing in a map.

$ perl -Mstrict -wle '
> my $seq = q{GGGGGGGAGAAAAAAAAAAAAAAAGAAGGA};
> my @pats = qw{ AAAAA GGGGG GGAGA GAAGG };
> my @cts = map {
>     my $re = qr{(?=\Q$_\E)};
>     my $ct = () = $seq =~ m{$re}g;
> } @pats;
> print qq{@cts};'
11 3 1 1
$
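
For anyone wondering why the look-ahead is needed at all, here is a minimal sketch contrasting the two ways of counting. The sub names count_consuming and count_overlapping are made up for illustration and are not from the thread. In both subs the match runs in list context and returns one element per match; assigning that list to () and then to a scalar yields the number of elements, which is the count.

```perl
use strict;
use warnings;

# Hypothetical helper names, for illustration only.

# The pattern consumes the text it matches, so after each hit the
# engine resumes after the end of that hit and overlaps are skipped.
sub count_consuming {
    my ($seq, $pat) = @_;
    my $count = () = $seq =~ m{\Q$pat\E}g;
    return $count;
}

# The whole term sits inside a look-ahead, so each match is zero-width.
# The next attempt starts one character further along, which means
# every overlapping occurrence is counted.
sub count_overlapping {
    my ($seq, $pat) = @_;
    my $count = () = $seq =~ m{(?=\Q$pat\E)}g;
    return $count;
}

my $seq = 'GGGGGGGAGAAAAAAAAAAAAAAAGAAGGA';
for my $pat (qw{ AAAAA GGGGG GGAGA GAAGG }) {
    printf "%s: consuming=%d overlapping=%d\n",
        $pat, count_consuming($seq, $pat), count_overlapping($seq, $pat);
}
```

With the sequence above, the run of fifteen As should give 3 for AAAAA when the match consumes text and 11 with the look-ahead, which agrees with the first number in the output of the one-liner.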

I hope this is of interest.

Cheers,

JohnGG

Re^3: Regex KungFu help needed
by AnomalousMonk (Archbishop) on Oct 02, 2009 at 22:59 UTC
    As a further step, associating patterns with their counts and (cached) regex objects in a hash may be worthwhile:
    >perl -wMstrict -le
    "my $sequence = 'GGGGGGGAGAAAAAAAAAAAAAAAGAAGGA';
     my %patterns =
        map { $_ => { count => 0, regex => qr{ (?= \Q$_\E) }xms } }
        qw(AAAAA GGGGG GGAGA GAAGG)
        ;
     $patterns{$_}{count} =()= $sequence =~ m{ $patterns{$_}{regex} }xmsg
        for keys %patterns;
     print qq{$_: $patterns{$_}{count}} for sort keys %patterns;
    "
    AAAAA: 11
    GAAGG: 1
    GGAGA: 1
    GGGGG: 3
    or
    >perl -wMstrict -le
    "my $sequence = 'GGGGGGGAGAAAAAAAAAAAAAAAGAAGGA';
     my %patterns =
        map { $_ => { count => 0, regex => qr{ (?= \Q$_\E) }xms } }
        qw(AAAAA GGGGG GGAGA GAAGG)
        ;
     $_->{count} =()= $sequence =~ m{ $_->{regex} }xmsg
        for values %patterns;
     print qq{$_: $patterns{$_}{count}} for sort keys %patterns;
    "
    AAAAA: 11
    GAAGG: 1
    GGAGA: 1
    GGGGG: 3

      Nice, ++

      If you don't need to refer to the pattern again you can reduce that to a single step with the uncompiled pattern as key and count as value.

      $ perl -Mstrict -wle '
      > my $seq = q{GGGGGGGAGAAAAAAAAAAAAAAAGAAGGA};
      > my %pats = map {
      >     $_, scalar( () = $seq =~ m{(?=\Q$_\E)}g );
      > } qw{ AAAAA GGGGG GGAGA GAAGG };
      > print qq{$_: $pats{ $_ }} for sort keys %pats;'
      AAAAA: 11
      GAAGG: 1
      GGAGA: 1
      GGGGG: 3
      $

      I expect some Monks could golf that down to seven bytes and a nybble :-)

      Cheers,

      JohnGG

      Update: The extra parentheses inside the map were unnecessary, so I removed them!

        As you wish :)

        print "$_: ", $t = () = GGGGGGGAGAAAAAAAAAAAAAAAGAAGGA =~ /(?=\Q$_\E)/g, $/ for AAAAA,GAAGG,GGAGA,GGGGG

        Removing the whitespace is left as an exercise for the reader :)