if there are no bytes with the 8th bit set then
   it's plain ASCII -- there's nothing to decide
else
   if ( any bytes match /[\xc0\xc1\xc4-\xff]/, or
        an odd number of bytes match /[\x80-\xff]/ ) then
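      it must be Latin1 (utf8 forms of Latin1 characters never contain
      those bytes, and always come as pairs of 8th-bit-set bytes)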
   else
      make a copy
      delete everything that could be utf8 forms of Latin1 characters:
      s/\xc2[\xa0-\xbf]|\xc3[\x80-\xbf]//g;
      if this removes all bytes with 8th-bit set, then
          the original data is almost certainly utf8
      else
          the original data is definitely Latin1 (given that it has to
          be one or the other)
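
For anyone who wants to try this out, here is a minimal sketch of the steps above in Python (the regexes in the pseudocode are Perl-style, so this is a translation, not the original code). The function name guess_encoding, the constant names, and the return strings are made up for illustration. It assumes the input is already known to be either Latin1 or the utf8 encoding of Latin1 characters, and it only handles a byte string held in memory.

```python
import re

# Bytes that never appear in the utf8 form of a character up to U+00FF.
IMPOSSIBLE_IN_UTF8_LATIN1 = re.compile(rb"[\xc0\xc1\xc4-\xff]")
# Any byte with the 8th bit set.
HIGH_BIT = re.compile(rb"[\x80-\xff]")
# utf8 forms of the Latin1 characters U+00A0..U+00FF (same pattern as above).
UTF8_FORM_OF_LATIN1 = re.compile(rb"\xc2[\xa0-\xbf]|\xc3[\x80-\xbf]")


def guess_encoding(data: bytes) -> str:
    """Return 'ascii', 'latin1' or 'utf8' (hypothetical labels)."""
    high_bytes = HIGH_BIT.findall(data)
    if not high_bytes:
        # No 8th-bit bytes at all: nothing to decide.
        return "ascii"

    # A byte utf8-encoded Latin1 can never contain, or an odd number of
    # 8th-bit bytes (utf8 forms here are always two such bytes).
    if IMPOSSIBLE_IN_UTF8_LATIN1.search(data) or len(high_bytes) % 2 == 1:
        return "latin1"

    # Work on a copy: delete everything that could be a utf8 form of a
    # Latin1 character, then see whether any 8th-bit bytes are left.
    stripped = UTF8_FORM_OF_LATIN1.sub(b"", data)
    if HIGH_BIT.search(stripped):
        return "latin1"
    return "utf8"  # "almost certainly": Latin1 text like "Ã©" looks the same


if __name__ == "__main__":
    print(guess_encoding("café".encode("latin-1")))
    print(guess_encoding("café".encode("utf-8")))
```

The "almost certainly" caveat still applies here: a Latin1 string made up entirely of pairs such as \xc3\xa9 is byte-for-byte identical to valid utf8, so no byte-level test can tell the two apart.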