Re: Compressing a text file using count of continous characters

s/((\D)\2+)/length($1).$2/ge;

It searches for a non-digit character, checks whether that character is repeated one or more times, and replaces the whole run with its length followed by the repeated character.
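A minimal sketch of that substitution on a sample string (the sample input and the variable names are my own, not from the original question):

```perl
use strict;
use warnings;

# Hypothetical sample input; it contains no digits on purpose.
my $text = "aaabccdddd";

# (\D) captures one non-digit, \2+ requires at least one more copy of it,
# and /e evaluates the replacement as code: run length, then the character.
(my $packed = $text) =~ s/((\D)\2+)/length($1).$2/ge;

print "$packed\n";   # prints 3ab2c4d
```

Note that the single "b" is left as it is, because the pattern needs at least two identical characters in a row.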
Oha

PS: it could be useful to limit the length of each matched run and to include digits as well. In that case, though, a digit must always carry a count:

s/((\D)\2{1,8}|(\d)\3{0,8})/length($1).$2.$3/ge;

This one matches 2 or more copies of a non-digit character, or 1 or more copies of a digit, and never matches a run longer than 9.
This way you can decode the data without side effects even when the input is unusual: X2AAAAAAAAAAAAAAAAAAAAAAA1111 becomes X129A9A5A41.
Runs of A are split into chunks of at most 9, and digits get a count even when they are not repeated.
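To check the example above, here is a small sketch that runs the bounded substitution on the same string. The sub name rle_encode is hypothetical, and I assume the run of A is 23 characters long, which is what 9A9A5A implies. The `// ''` defaults are my addition to silence "uninitialized" warnings from whichever capture group did not take part in the match; they need Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the bounded substitution.
sub rle_encode {
    my ($s) = @_;
    # Either 2..9 copies of a non-digit ($2) or 1..9 copies of a digit ($3).
    # Only one of $2/$3 is defined per match, hence the // '' defaults.
    $s =~ s{((\D)\2{1,8}|(\d)\3{0,8})}{length($1) . ($2 // '') . ($3 // '')}ge;
    return $s;
}

# "X", "2", 23 times "A", then "1111"
my $input = "X2" . ("A" x 23) . "1111";

print rle_encode($input), "\n";   # prints X129A9A5A41
```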
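To decode, use the following (the /s modifier lets . match a newline too, which is needed because \D in the encoder also matches newlines):

s/(\d)(.)/$2x$1/gse;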
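A round-trip sketch, assuming the bounded encoder from the PS. The sub names and the test strings are hypothetical. The decoder here uses the /s modifier so that . also matches newlines; that matters only when the input contains runs of newlines, which is my assumption about multi-line files rather than something stated in the thread.

```perl
use strict;
use warnings;

# Hypothetical helpers; the encoder is the bounded substitution from the PS.
sub rle_encode {
    my ($s) = @_;
    $s =~ s{((\D)\2{1,8}|(\d)\3{0,8})}{length($1) . ($2 // '') . ($3 // '')}ge;
    return $s;
}

sub rle_decode {
    my ($s) = @_;
    # In encoded text every digit is a count, followed by the char to repeat.
    # /s lets . match a newline, which \D may have captured while encoding.
    $s =~ s/(\d)(.)/$2 x $1/gse;
    return $s;
}

# Mixed input, a run of newlines, and digits only.
for my $in ("X2" . ("A" x 23) . "1111", "a\n\n\nb", "2024") {
    my $out = rle_decode(rle_encode($in));
    print $out eq $in ? "ok\n" : "mismatch\n";
}
```

Because the encoder never emits a count above 9 and always puts a count in front of a digit, the decoder can read the stream strictly as (count, char) pairs, with any other character passed through unchanged.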

Oha
