in reply to replacing special characters in file
It depends on exactly what you want to do, but if you just want to strip out certain characters, you can do:
s/[^\w]//g; # strip everything but 'word' characters (letters, digits, underscore); note this removes whitespace and punctuation too
s/[^[:ascii:]]//g; # strip everything but ASCII characters
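For what it's worth, here is a minimal sketch of those two substitutions wrapped in subs so you can try them on a sample string first. The sub names are made up for illustration, and the sample strings are just assumptions about what your data might look like.

```perl
use strict;
use warnings;

# Hypothetical helper names, for illustration only.

# Remove everything except word characters (letters, digits, underscore).
# This also removes whitespace and punctuation.
sub strip_non_word {
    my ($text) = @_;
    $text =~ s/\W//g;    # \W is the same as [^\w]
    return $text;
}

# Remove every character outside the ASCII range (code points above 127).
sub strip_non_ascii {
    my ($text) = @_;
    $text =~ s/[^[:ascii:]]//g;
    return $text;
}

print strip_non_word("Hello, world!"), "\n";            # prints "Helloworld"
print strip_non_ascii("na\x{EF}ve caf\x{E9}"), "\n";    # prints "nave caf"
```

Note that stripping silently shortens words like "naïve" to "nave", so it is probably only what you want when the special characters are genuinely junk.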
If you want to replace specific characters or character sequences, and you can't type them directly in your text editor, you can use hex escapes in the regex:
s/\x{00A1}\x{00DC}/st/g; # replace upside-down-bang capital-u-umlaut with 'st'.
You can look up the (Unicode) hex values for capital-u-umlaut and friends in Unibook.
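If you already have a sample of the offending text, you can also get Perl to tell you the hex values rather than looking them up. A rough sketch, assuming the string has already been decoded into characters (on raw bytes you will get byte values instead); the sub name is made up:

```perl
use strict;
use warnings;

# Hypothetical helper: return the code point of each character as "U+XXXX".
# Assumes $text holds decoded characters, not raw bytes.
sub code_points {
    my ($text) = @_;
    return map { sprintf "U+%04X", ord($_) } split //, $text;
}

# prints "U+00A1 U+00DC U+0073 U+0074"
print join(" ", code_points("\x{00A1}\x{00DC}st")), "\n";
```

The four hex digits after "U+" are what goes inside the \x{...} escape.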
Bear in mind that the text you are editing may not be in a Unicode encoding such as UTF-8, and that even if it is, some characters may display differently in a terminal (particularly a DOS box) than they do in a text file. Welcome to the inconsistent mess of character encoding standards.
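To make the encoding point concrete: the \x{...} escapes match code points, so they only line up with your data once the raw bytes have been decoded. Here is a sketch using the core Encode module. The sub name is made up, and the two encodings are only examples; you need to find out what your file really uses.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode raw bytes, substitute, then re-encode.
# The encoding name is an assumption; pass whatever the file actually uses.
sub replace_in_bytes {
    my ($bytes, $encoding) = @_;
    my $text = decode($encoding, $bytes);    # bytes -> characters
    $text =~ s/\x{00A1}\x{00DC}/st/g;        # matches on code points
    return encode($encoding, $text);         # characters -> bytes
}

# The same two characters are stored as different bytes in different encodings.
my $utf8_bytes   = "\xC2\xA1\xC3\x9C";    # U+00A1 U+00DC in UTF-8
my $latin1_bytes = "\xA1\xDC";            # the same pair in ISO-8859-1

print replace_in_bytes("te${utf8_bytes}",   'UTF-8'),      "\n";    # prints "test"
print replace_in_bytes("te${latin1_bytes}", 'ISO-8859-1'), "\n";    # prints "test"
```

When reading from a file you can get the same effect by putting an encoding layer on the handle, e.g. open my $fh, '<:encoding(UTF-8)', $path, so the lines arrive already decoded.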
Re^2: replacing special characters in file
by joemaniaci (Sexton) on Jun 01, 2012 at 18:08 UTC