Thanks Dave, very helpful.
You're right, my terminal can't display char 151. It looks like my best (err... easiest) workaround is to write a regex that replaces all em dashes with & #8212;. Here's what I hacked together:
Note: there shouldn't be a space between & and #, but I added one so it would display correctly.
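For anyone trying the same thing, a minimal sketch of this kind of replacement, assuming Python and Windows-1252 input (where byte 151 is the em dash). The function name is made up for illustration, and this is not necessarily the same regex as the one described above:

```python
import re

# Assumption: the input is Windows-1252 (cp1252), where byte 151 (0x97)
# decodes to U+2014, the em dash.
EM_DASH = "\u2014"

def replace_em_dashes(text: str) -> str:
    """Replace every em dash with the numeric HTML entity for U+2014."""
    return re.sub(EM_DASH, "&#8212;", text)

raw = b"before \x97 after"         # byte 151 as it shows up in cp1252 data
decoded = raw.decode("cp1252")     # decode first so the regex sees U+2014
print(replace_em_dashes(decoded))  # before &#8212; after
```

Since the pattern is a single literal character, `str.replace` would do the same job; a regex only starts to pay off once several characters are handled in one pass.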
Here are some more encodings, if anyone is looking to translate other odd characters.