# of two or more whitespace characters to a single space, do this instead:
s/\s+/ /g; # in utf8 strings, \s matches non-breaking space
I read this on a webpage somewhere, but for one reason or another it did not produce the desired results for me. Setting binmode to utf8 did not work either. The byte mode solution is less predictable, and I cannot completely explain why, but it was the only one I could get to produce the desired results.
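One possible explanation for the difference is that `\s` only matches a non-breaking space (U+00A0) when the string has been decoded into characters. On raw UTF-8 bytes the same character is the two-byte sequence `\xC2\xA0`, which `\s` does not match unless Unicode rules are enabled for the regex. The sketch below compares the two approaches on a made-up sample string. The sub names `collapse_bytes` and `collapse_decoded` are hypothetical, chosen for illustration, and this is not necessarily the exact byte mode code referred to above.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Byte mode: the input is left as raw UTF-8 octets, so the non-breaking
# space has to be matched explicitly as the byte pair \xC2\xA0.
sub collapse_bytes {
    my ($octets) = @_;
    $octets =~ s/(?:\s|\xC2\xA0)+/ /g;
    return $octets;
}

# Character mode: decode the octets first, so that \s sees U+00A0 as a
# single whitespace character, then encode back to UTF-8 octets.
sub collapse_decoded {
    my ($octets) = @_;
    my $chars = decode('UTF-8', $octets);
    $chars =~ s/\s+/ /g;
    return encode('UTF-8', $chars);
}

# Hypothetical sample: "a", two non-breaking spaces, one plain space, "b".
my $sample = "a\xC2\xA0\xC2\xA0 b";

print collapse_bytes($sample),   "\n";
print collapse_decoded($sample), "\n";
```

If the plain `s/\s+/ / g` substitution leaves non-breaking spaces in place, it may be worth checking whether the input was ever decoded. A string read from a file without an `:encoding(UTF-8)` layer, for example, is still bytes as far as the regex engine is concerned.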