in reply to noobie control char removal

Since you're on Windows ("in notepad"), one possibility is that you're working with an MS Word .doc containing 'smart quotes' and the like (pay special attention to hyphens and apostrophes, which Word tends to turn into dashes and curly quotes).

If so, and if you've run the document through a script that writes the result to a .txt file (or converted it in certain other ways), you'll see "empty boxes" in the processed data in Notepad, while the unprocessed document will still render with the intended characters when opened in Word.
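
If that turns out to be the cause, something along these lines might help. It's only a rough sketch in Python (not the script from the original question), and it assumes the extracted text is Windows-1252 encoded, which is a guess on my part. The names SMART_TO_ASCII, decode_cp1252 and asciify_smart_punctuation are made up for illustration, and the choice of plain-ASCII replacements is just one option.

```python
# Assumption: the text came out of Word as Windows-1252 (cp1252) bytes.
# All names here are hypothetical, chosen for this illustration only.

# Common Word "smart" punctuation mapped to plain ASCII stand-ins.
SMART_TO_ASCII = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / curly apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
}

# str.translate needs a table keyed by code point; maketrans builds it.
_TABLE = str.maketrans(SMART_TO_ASCII)


def decode_cp1252(raw: bytes) -> str:
    """Decode bytes as Windows-1252; undefined bytes become U+FFFD."""
    return raw.decode("cp1252", errors="replace")


def asciify_smart_punctuation(text: str) -> str:
    """Replace the mapped smart punctuation with ASCII equivalents."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    # 0x93/0x94 are curly double quotes, 0x92 a curly apostrophe,
    # and 0x96 an en dash in Windows-1252.
    sample = b"\x93it\x92s\x94 \x96 ok"
    print(asciify_smart_punctuation(decode_cp1252(sample)))
```

If the boxes are still there after this, the input probably isn't cp1252 after all, or it contains characters outside the small table above, so it would be worth dumping the raw byte values of the offending characters before going further.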