You said: "I know which one I'd rather maintain"; I asked: "I'm not sure why you think it would be any easier to maintain than the non-regex solutions?"; and you've explained in terms of your preferences, assumptions, understanding, fears, views, thoughts, skill-set and bad memories.
There is no way to argue with any of that. I cannot tell you what you should prefer, feel, think, etc.
The only vaguely arguable points are:
Why would the code be updated?
Why would anyone modify it, when it performs the required task as is?
It isn't a "text processing task" per se. If you look at the source data for the third test of my performance tests, you'll see that the source data is actually very large bitstrings. The task is to reduce storage requirements by run-length encoding leading repeating sequences. This will not change.
I added a couple of print statements and they told me everything I needed to know:
    hdb => sub {
        my $input  = shift;
        my $length = length $$input;
        my $i      = 0;
        my $possible;
        my $j;
        while( 1 ) {
            $possible = substr $$input, 0, ++$i;
            print "i: $i : l:", length $possible;
            $possible = substr $$input, 0, $i=$j
                if ($j = index $$input, $possible, $i) > 0;
            print "i:$i j:$j : l:", length $possible;
            return $possible
                if substr( $$input, $i ) eq substr( $$input, 0, $length - $i );
        }
    },
I actually print the value of $possible rather than the length, but that would be inappropriate to post.
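For anyone who wants to try the routine outside the benchmark harness, here is a minimal standalone sketch of the same loop with the print statements removed. The name find_leading_repeat and the sample bitstring are mine, purely for illustration, and it assumes a non-empty input string. The last few lines check that the returned unit, repeated and truncated to the original length, reproduces the input; that check is just one way to confirm the result and says nothing about how the real data is stored.

```perl
use strict;
use warnings;

# Hypothetical standalone wrapper around the loop shown above.
# Takes a reference to a non-empty string and returns the shortest
# leading unit whose repetition (possibly truncated) yields the string.
sub find_leading_repeat {
    my $input  = shift;
    my $length = length $$input;
    my $i      = 0;
    my ( $possible, $j );
    while (1) {
        # Start with the next-longer prefix as the candidate unit.
        $possible = substr $$input, 0, ++$i;

        # If the candidate recurs later, jump straight to that offset:
        # the unit cannot be shorter than the distance to its next occurrence.
        $possible = substr $$input, 0, $i = $j
            if ( $j = index $$input, $possible, $i ) > 0;

        # The candidate is a repeat unit if the string shifted left by $i
        # matches the string itself over the overlapping part.
        return $possible
            if substr( $$input, $i ) eq substr( $$input, 0, $length - $i );
    }
}

# Illustrative data only, not taken from the real test set.
my $bits = '1011011011';
my $unit = find_leading_repeat( \$bits );

# Rebuild the input from the unit and compare against the original.
my $copies  = int( length($bits) / length($unit) ) + 1;
my $rebuilt = substr( $unit x $copies, 0, length $bits );
print "unit: $unit, rebuilt ok: ", ( $rebuilt eq $bits ? 'yes' : 'no' ), "\n";
```

The jump from $i to $j is what keeps the iteration count low on long inputs: each pass either finds the unit or skips every prefix length that could not possibly work.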
The output from the last, longest test data shows why it is so fast:
In reply to Re^5: Finding repeat sequences.
by BrowserUk
in thread Finding repeat sequences.
by BrowserUk