in reply to Can you write a faster code to perform this task?

I can't benchmark right now, but I would expect the tr/// operator to be faster than a regex. That said, 80,000 lines is not a very large input. I am not sure there is any real need to optimize what you have further: at best, you are going to shave off a fraction of a second.

Re^2: Can you write a faster code to perform this task?
by LanX (Saint) on Sep 29, 2014 at 08:17 UTC
    tr counts single letters while the OP wants to count sequences of identical letters.
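    To make the difference concrete, here is a minimal sketch. The sample string is made up for illustration, and the regex is only one possible way to count runs, not necessarily what the OP uses.

    ```perl
    use strict;
    use warnings;

    # Made-up sample: five 'M' characters in two runs.
    my $s = 'iiMMMooMM';

    # tr with an empty replacement list only counts matching characters;
    # it does not modify $s.
    my $chars = $s =~ tr/M//;

    # A global match in list context returns every run of 'M';
    # the "= () =" idiom turns that list into a count.
    my $runs = () = $s =~ /M+/g;

    print "chars=$chars runs=$runs\n";    # prints: chars=5 runs=2
    ```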

    Cheers Rolf

    (addicted to the Perl Programming Language and ☆☆☆☆ :)

      What about something like this?
      use 5.014; say 'iiiiiiiiMMMMMMMMMMMooooooooooooMMMMMMMMMMiiiiiMMMMMMMMoooo' =~ tr/M/~/sr=~tr/~//; # => prints: 3
      Update: just to be safe, it would be better to replace 'M' with something more exotic, like "\0", which should be OK assuming that you're reading text files: tr/M/\0/sr =~ tr/\0//
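      Unpacked into a helper, the same idea might look like the sketch below. The name count_m_runs is hypothetical, and it assumes Perl 5.14 or later (for the /r flag) and input that never contains a NUL character.

      ```perl
      use strict;
      use warnings;
      use 5.014;    # the /r flag of tr/// needs Perl 5.14 or later

      # Hypothetical helper (not from the thread): count the runs of 'M'.
      sub count_m_runs {
          my ($str) = @_;

          # /s squeezes each run of 'M' into a single "\0";
          # /r returns the result instead of modifying $str.
          my $squeezed = $str =~ tr/M/\0/sr;

          # With an empty replacement list and no /d, tr only counts.
          return $squeezed =~ tr/\0//;
      }

      say count_m_runs('iiiiiiiiMMMMMMMMMMMooooooooooooMMMMMMMMMMiiiiiMMMMMMMMoooo');    # prints 3
      ```

      The first tr/// does the real work: once every run is a single character, counting runs is the same as counting characters.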

        I tried replacing 'M' with 'M' itself and it worked. No need to worry about exotic characters.

        use 5.014; say 'iiiiiiiiMMMMMMMMMMMooooooooooooMMMMMMMMMMiiiiiMMMMMMMMoooo' =~ tr/M/M/sr=~tr/M//; # => prints: 3
        Well, TIMTOWTDI, but two chained tr operators are not necessarily fast. :)

        But who knows, try to benchmark... :)
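        A rough way to check is the core Benchmark module. This is only a sketch: the test data is the sample string repeated, the sub names are made up, and the regex variant is an assumption rather than the OP's actual code.

        ```perl
        use strict;
        use warnings;
        use 5.014;    # the /r flag of tr/// needs Perl 5.14 or later
        use Benchmark qw(cmpthese);

        # Made-up test data: the sample string from this thread, repeated.
        my $str = 'iiiiiiiiMMMMMMMMMMMooooooooooooMMMMMMMMMMiiiiiMMMMMMMMoooo' x 1000;

        # Candidate 1: the two chained tr/// operators posted above.
        sub runs_tr { return $_[0] =~ tr/M/M/sr =~ tr/M// }

        # Candidate 2: a plain regex count (an assumption, not the OP's code).
        sub runs_regex { my $n = () = $_[0] =~ /M+/g; return $n }

        # Both must agree before the timings mean anything.
        die "results differ" unless runs_tr($str) == runs_regex($str);

        # Run each candidate for at least 1 CPU second and print a comparison table.
        cmpthese(-1, {
            chained_tr => sub { runs_tr($str) },
            regex      => sub { runs_regex($str) },
        });
        ```

        The numbers will depend on the Perl build and on the real data, so it is worth running this against an actual input file rather than trusting the synthetic string.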

        Cheers Rolf

        (addicted to the Perl Programming Language and ☆☆☆☆ :)

      True, Rolf, I misread the OP's requirement. I thought he wanted the number of "M" characters. But, OTOH, trizen has found a way to use tr/// which seems to be very fast.