I tested splitting the strings up into arrays and timed the two methods (simply via 'time script.pl').
The split-method takes ~13 seconds to finish, the substr-method only ~7 seconds.
The advantages of having the data ready in arrays don't matter to me; I just need the percentage of similarity, and as fast as possible. ;-)
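For reference, here is a minimal sketch of the two approaches being compared. The sub names and sample strings are made up for illustration, and it assumes both strings have the same length and that "similarity" means the share of positions holding the same character; the actual test script may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: split both strings into arrays, then compare
# element by element. Assumes equal-length strings.
sub similarity_split {
    my ($s, $t) = @_;
    my @a = split //, $s;
    my @b = split //, $t;
    my $same = 0;
    for my $i (0 .. $#a) {
        $same++ if $a[$i] eq $b[$i];
    }
    return @a ? 100 * $same / @a : 0;
}

# Hypothetical helper: compare character by character with substr,
# without building any arrays. Assumes equal-length strings.
sub similarity_substr {
    my ($s, $t) = @_;
    my $len  = length $s;
    my $same = 0;
    for my $i (0 .. $len - 1) {
        $same++ if substr($s, $i, 1) eq substr($t, $i, 1);
    }
    return $len ? 100 * $same / $len : 0;
}

# Sample strings are invented for this example.
printf "split:  %.1f%%\n", similarity_split('ACGTACGT', 'ACGTTCGA');
printf "substr: %.1f%%\n", similarity_substr('ACGTACGT', 'ACGTTCGA');
```

Both subs should return the same percentage for the same input; only the way the characters are accessed differs.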
I'll try out the rest of the suggested methods here and report which does best.
Thanx, Micha