http://qs1969.pair.com?node_id=213019


in reply to Efficient run determination.

It's all about using the right language for the job.

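use Data::Dumper;

my @res = p_process(' aaaa bbbbccccccccbbb aaaaabbbbbcdddddddddddddddddddd');
print Dumper(@res);

use Inline C => <<'END_OF_C_CODE';

/* Push one [char, start, length] triple onto the Perl stack per run. */
void p_process(char *s) {
    char prev  = 'Q';
    long count = 0;
    long pos   = 0;
    long i     = 0;
    AV *array;
    Inline_Stack_Vars;
    Inline_Stack_Reset;

    while (*s != 0) {
        if (count == 0) {
            /* first character: start the first run */
            pos = i; prev = *s; count = 1;
        }
        else if (prev == *s) {
            count++;
        }
        else {
            /* run ended: emit it, then start a new one */
            array = newAV();
            av_push(array, newSVpvn(&prev, 1));
            av_push(array, newSViv(pos));
            av_push(array, newSViv(count));
            /* noinc + mortal so the array is freed once Perl is done with it */
            Inline_Stack_Push(sv_2mortal(newRV_noinc((SV*)array)));
            pos = i; prev = *s; count = 1;
        }
        i++;
        s++;
    }

    /* emit the final run, which the loop above never reaches */
    if (count > 0) {
        array = newAV();
        av_push(array, newSVpvn(&prev, 1));
        av_push(array, newSViv(pos));
        av_push(array, newSViv(count));
        Inline_Stack_Push(sv_2mortal(newRV_noinc((SV*)array)));
    }

    Inline_Stack_Done;
}

END_OF_C_CODE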

(Note: if you have a very long string, you should look at collecting the results in a top-level array instead of pushing them onto the stack directly, since the stack might run out of room.)

This one doesn't handle char 0, since that is the C string termination character. A fairly trivial modification would fix that, however: take the argument as the SV it comes in as and loop over its length instead of stopping at the first zero byte.
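
For what it's worth, here is a rough sketch of the length-based loop in plain C, outside of Inline so it can be compiled on its own. The names find_runs and struct run are made up for illustration and are not part of the code above. Inside Inline C, I would expect the pointer and length to come from calling SvPV on the incoming SV, but that part is an assumption and is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One run: the repeated byte, its starting offset, and its length.
 * (Hypothetical type, for illustration only.) */
struct run {
    char   ch;
    size_t pos;
    size_t count;
};

/*
 * Scan buf[0..len-1] and record each run in out[], up to max entries.
 * Returns the total number of runs found, which may exceed max.
 * The loop is bounded by len, not by a NUL terminator, so zero bytes
 * are treated like any other character.
 */
size_t find_runs(const char *buf, size_t len, struct run *out, size_t max)
{
    size_t nruns = 0;
    size_t i = 0;

    assert(out != NULL || max == 0);

    while (i < len) {
        size_t start = i;
        char c = buf[i];

        /* advance to the end of the current run */
        while (i < len && buf[i] == c)
            i++;

        if (nruns < max) {
            out[nruns].ch    = c;
            out[nruns].pos   = start;
            out[nruns].count = i - start;
        }
        nruns++;
    }
    return nruns;
}
```

Because the caller passes the length explicitly, a buffer such as { 'a', 'a', 0, 0, 0, 'b' } yields three runs, with the zero bytes counted as a run of their own. Writing into a caller-supplied array also keeps the results off the Perl stack, which is in the spirit of the note about long strings.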