in reply to Characters Changed To Codes

The overall issue is "Encodings". You need to find out what encoding your input is in, and what encoding your output is in (and what the program you're using to display your output thinks the output is encoded in), and make sure that you properly convert between these, or send the appropriate headers etc. to tell all programs involved about the encoding.

For Perl, the best general approach is to decode your input into Perl's internal character strings with Encode::decode, and to encode to the target encoding on output with Encode::encode. Ideally, your target encoding is UTF-8. For example, if you're outputting HTML, you can tell the browser the encoding in the Content-Type HTTP header or in a <meta> tag in the <head> of the document (the <!DOCTYPE itself does not carry the encoding).
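
As a minimal sketch of that decode-on-input, encode-on-output round trip: the sample string, the variable names, and the assumption that the input is ISO-8859-1 are all made up for illustration, so substitute whatever encoding your input really uses.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: the input arrives as ISO-8859-1 (Latin-1) bytes.
# Here "café" is hard-coded as four Latin-1 bytes for illustration.
my $input_bytes = "caf\xE9";

# Decode on input: raw bytes become a Perl character string.
my $text = decode('ISO-8859-1', $input_bytes);

# ... work with $text as characters here (length, regexes, etc.) ...

# Encode on output: the character string becomes UTF-8 bytes.
my $output_bytes = encode('UTF-8', $text);

# Dump the output bytes as hex, so the check does not depend on
# how a viewer or terminal chooses to interpret them.
print join(' ', unpack('(H2)*', $output_bytes)), "\n";   # prints "63 61 66 c3 a9"
```

The single Latin-1 byte E9 comes out as the two UTF-8 bytes C3 A9. If a viewer shows something else, the viewer is probably guessing a different encoding than the one you wrote.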

Re^2: Characters Changed To Codes
by HalNineThousand (Beadle) on Jan 23, 2011 at 20:20 UTC

    I had been viewing the input and output files in different viewers and had thought they were reliable. It turns out that was a big mistake (okay, it's obvious now!). I pulled up a hex editor, looked over the byte values, and found that the data itself was not getting messed up; the problem was that I was not specifying the encoding in the output file.

    In the past I've only used HTML with my own sites or in specific usage situations, so encoding has never been an issue for me for anything -- obviously, otherwise I would have known the term.

    I see there's a LOT of info out there on encoding, so thanks for suggesting the obvious (that I had overlooked) and for giving me the right term to use for researching this. That's a BIG help!