in thread Format eating too few characters with utf-8?
Yes, the data really has to be passed on the command line. This is an established script, several years old, whose semantics I can't change. ikegami's reply populated @ARGV inside the script rather than passing the text on the command line, which should avoid any command-line issues.
print()ing the string before passing it to decode_utf8 gives "l'été sera chaud". Decoding it gives "l'�t� sera chaud".
I've found a workaround that seems to fix the problem, although it is rather hacky: append n spaces to the string before write()ing it, where n is the number of non-ASCII characters in the string. The extra spaces don't seem to pollute the eventual output.
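For anyone who wants to try the same thing, here is a minimal sketch of the padding step. It assumes the string has already been run through decode_utf8, so that each accented letter counts as a single character. The helper name pad_for_format is made up for illustration, and the sample bytes stand in for what would arrive in @ARGV.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Hypothetical helper (not part of the original script): append one space
# for every non-ASCII character in an already-decoded string.
sub pad_for_format {
    my ($text) = @_;
    # Count characters outside the 7-bit ASCII range.
    my $n = () = $text =~ /[^\x00-\x7F]/g;
    return $text . (' ' x $n);
}

# Stand-in for the command-line argument: the UTF-8 bytes of "l'été sera chaud".
my $raw    = "l'\xC3\xA9t\xC3\xA9 sera chaud";
my $text   = decode_utf8($raw);
my $padded = pad_for_format($text);

# $padded is what would then be handed to the format before calling write().
```

The count is taken on the decoded string, so each "é" adds exactly one space. Whether one space per character is still the right amount for characters that need more than two bytes in UTF-8 is something I haven't checked.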