in reply to Regex for MS Word Special Characters

Could you define exactly what "Microsoft Word's special characters" are, e.g. by listing them?

You can match a set of characters with a character class, [ ]. So if A, T, Y and I were special characters (which of course they aren't; this is just an example), you could remove them from a string with the following simple regex:

$string =~ s/[ATYI]//g;
If you have the special characters as octal codes, you could write it like this:
$string =~ s/[\123\124\145]//g;
(The numbers here are just random example numbers).
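
To make the two substitutions above concrete, here is a minimal runnable sketch. The helper name strip_chars and the sample string are made up for illustration; the octal escapes \123, \124 and \145 are the same arbitrary example values as above, which happen to be the ASCII characters S, T and e.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): removes every
# character matched by the character class from the given string.
sub strip_chars {
    my ($string) = @_;
    # \123 = "S", \124 = "T", \145 = "e" (octal ASCII codes)
    $string =~ s/[\123\124\145]//g;
    return $string;
}

my $cleaned = strip_chars("SeT the Stage");
print "[$cleaned]\n";    # prints "[ th tag]"
```

The brackets in the print statement are only there to make the leading space in the result visible.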

Please tell me if I missed something or misunderstood you.

Re^2: Regex for MS Word Special Characters
by omg_wtf_lol (Initiate) on Apr 21, 2008 at 19:43 UTC
    Yes, I believe that you did misunderstand my question a bit, as my question is basically the same as yours. I know that I can use a regular expression just as you mentioned. My problem is finding the correct representation for the MS Word characters such as smart quotes. For now I am trying to use hex codes to substitute them in combination, but it is not working when I use a 4-digit hex code.
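
One possible cause of the 4-digit hex problem, offered as a guess since the failing code is not shown: in a Perl regex, \x without braces reads at most two hex digits, so \x201C is parsed as \x20 (a space) followed by the literal characters "1C". Code points above FF need the braced form, \x{201C}. The sketch below assumes the text has already been decoded into Perl characters (for example from UTF-8) and that the characters in question are the Unicode curly quotes U+2018, U+2019, U+201C and U+201D. If the data is still raw Windows-1252 bytes, the same quotes would instead appear as single bytes in the \x91 to \x94 range. The helper name normalize_smart_quotes is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: replaces curly quotes with plain ASCII quotes.
# Assumes $text holds decoded characters, not raw bytes.
sub normalize_smart_quotes {
    my ($text) = @_;
    $text =~ s/[\x{2018}\x{2019}]/'/g;    # left/right single quotes
    $text =~ s/[\x{201C}\x{201D}]/"/g;    # left/right double quotes
    return $text;
}

my $sample = "\x{201C}It\x{2019}s fine\x{201D}";
print normalize_smart_quotes($sample), "\n";    # prints "It's fine" (with the double quotes)
```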