http://qs1969.pair.com?node_id=468106

This is a question disguised as a meditation disguised as a puzzle. (I think Meditations is the best section for it, but, most esteemed janitors, please feel free to move it to SoPW, or wherever else is more appropriate.)

NB: If you are a split-meister, you may want to cut to the chase, and go straight to The Question below.

The Puzzle

OK, as Perl puzzles go this is not a very hard one, but I think it will still be interesting to those who haven't already seen something like it. Find a simple way to split a string into substrings of length 3, say (the last chunk may be shorter, if the length of the string is not divisible by 3). (For simplicity, assume the string contains no newlines...or trailing 0s [thanks, Smylers].) For example, if the input string is

atgactaatagcagtgg
the output should be the list
    0  'atg'
    1  'act'
    2  'aat'
    3  'agc'
    4  'agt'
    5  'gg'

What trips one up in such a puzzle (or at least tripped me up) is the word "split" in its statement, which leads one immediately to think of Perl's split builtin function. It is possible to use split for this, but the only solution I know of requires filtering the result through grep:

    @codons = grep $_, split /(.{3})/, 'atgactaatagcagtgg';
    print "@codons\n";
    __END__
    atg act aat agc agt gg

Note that the parens are required in the regex. (If it's not clear why, see split, in particular the role of capture in the regex argument.)
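
To make the role of the parens concrete, here is a small sketch of what split hands back with and without the capture. The variable names are mine, and the final filter uses grep { length } rather than the grep $_ shown above; that is a swapped-in variation which, as far as I can tell, also keeps a lone trailing '0' chunk.

```perl
use strict;
use warnings;

my $dna = 'atgactaatagcagtgg';

# Without capturing parens, every 3-character match is treated purely as a
# separator and thrown away; only the text between separators is returned.
my @no_capture = split /.{3}/, $dna;

# With capturing parens, each separator is returned too, interleaved with
# the (empty) fields that lie between consecutive separators.
my @with_capture = split /(.{3})/, $dna;

# Filtering on length rather than truth drops only the empty fields.
my @codons = grep { length } @with_capture;

print join('|', @no_capture),   "\n";
print join('|', @with_capture), "\n";
print "@codons\n";
```

The second line of output shows the empty fields that the grep has to remove.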

But a simpler solution requires only m//g, without any filtering:

    @codons = 'atgactaatagcagtgg' =~ /.{1,3}/g;
    print "@codons\n";
    __END__
    atg act aat agc agt gg

Note that parens are not needed in this case, but it is necessary to use the range quantifier {1,3} instead of the "exact" quantifier {3}; with {3}, a final chunk shorter than 3 characters would not match and would be silently dropped.
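
A quick side-by-side of the two quantifiers, as a sketch (the variable names are just for illustration):

```perl
use strict;
use warnings;

my $dna = 'atgactaatagcagtgg';    # 17 characters: five full codons plus 'gg'

my @exact = $dna =~ /.{3}/g;      # the trailing 'gg' cannot match .{3}
my @range = $dna =~ /.{1,3}/g;    # greedy, so full chunks are still 3 long

print "exact: @exact\n";
print "range: @range\n";
```

Because the quantifier is greedy, {1,3} takes 3 characters whenever 3 are available and settles for fewer only at the end of the string.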

The Question

OK, that was all preamble to my real question, which is, is there a simple regex-based solution to split a string into "runs" of the same character? For example, if the input is 'aaabbcddddaee', then the output should be the list

    0  'aaa'
    1  'bb'
    2  'c'
    3  'dddd'
    4  'a'
    5  'ee'
The best I can come up with is the gangly:
    @runs = do { my $i; grep ++$i%2, 'aaabbcddddaee' =~ /((.)\2*)/g };
    print "@runs\n";
    __END__
    aaa bb c dddd a ee

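
For anyone wondering why the counter is there at all, this sketch prints the raw list that m//g returns for that regex, and then tries one possible alternative that sidesteps the filtering by matching in scalar context. I don't claim the alternative is the elegant answer I'm after; it just makes the problem visible.

```perl
use strict;
use warnings;

my $str = 'aaabbcddddaee';

# Two capture groups mean two list items per match: the whole run ($1)
# followed by its first character ($2).
my @raw = $str =~ /((.)\2*)/g;
print "@raw\n";

# One alternative: iterate the match in scalar context, collecting only $1.
my @runs;
push @runs, $1 while $str =~ /((.)\2*)/g;
print "@runs\n";
```

The inner group is needed for the backreference, so the regex itself cannot avoid returning the extra items in list context.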
I'd be interested in learning of more elegant solutions.

Update: In response to BrowserUk's question, yes order matters.

Update2: Fixed puzzle's statement, in response to Smylers' observation.

The Other Question

Incidentally, what makes my last solution so awkward is the extraneous machinery to get rid of every other item in the list returned by m//g. Is there a better idiom for selecting (or filtering out) every n-th item from a list (not an array!) of unknown length? (Of course, if an idiom requires hauling in a module, it is automatically somewhat lame, particularly if it's a non-core module.)
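
To pin down what I mean, here is the counter trick wrapped in a sub. The name every_nth is hypothetical (it is not a builtin or a module function that I know of), and this is only a sketch of the behaviour I'd like an idiom for, not a claim that it is the best way to get it.

```perl
use strict;
use warnings;

# Hypothetical helper: keep the items at positions 0, $n, 2*$n, ... of a list.
sub every_nth {
    my $n = shift;
    my $i = 0;
    return grep { !( $i++ % $n ) } @_;
}

# The m//g match is evaluated in list context as part of the argument list,
# so no intermediate array is needed.
my @runs = every_nth( 2, 'aaabbcddddaee' =~ /((.)\2*)/g );
print "@runs\n";
```

This still relies on a hand-rolled counter under the hood, which is exactly the machinery I was hoping to avoid.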

the lowliest monk