in reply to Re: regex: how to negate a set of character ranges?
in thread regex: how to negate a set of character ranges?

Thanks for the speedy reply! I've kept the single-byte and multibyte ranges on separate lines just to help me keep track of what they actually represent.

However, I do not think it is possible to combine the multibyte characters, which means that I can't quite combine everything.

The missing slash was a typo.

Also, I should point out that the following works as expected:
s/[${shiftjis}]//ogx; (or s/${shiftjis}//ogx; or s/$shiftjis//ogx;)

What doesn't work as expected is:
s/[^${shiftjis}]//ogx;

Unfortunately I'm now at home and do not have access to the text. However, I think that the problem is that I don't know this little corner of the regex syntax...
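
For readers hitting the same wall: a bracketed character class matches exactly one character (one byte here), so wrapping a pattern that describes two-byte sequences in [^...] cannot express "anything that is not one of these sequences". One common workaround is to match an allowed character first and otherwise consume a single byte and drop it. The sketch below is only an illustration under assumptions: `$allowed` is a hypothetical stand-in for the real `$shiftjis` pattern (which is not shown in this thread), and `strip_disallowed` is a made-up helper name.

```perl
use strict;
use warnings;

# ASSUMPTION: hypothetical stand-in for the poster's $shiftjis pattern,
# written as an alternation of one single-byte range and one two-byte
# (lead byte + trail byte) sequence. The real ranges may differ.
my $allowed = qr/[\x20-\x7E]|[\x81-\x9F][\x40-\x7E\x80-\xFC]/;

# Remove every byte that is not part of an allowed character.
sub strip_disallowed {
    my ($bytes) = @_;
    # At each position, try an allowed character first and keep it ($1);
    # otherwise consume one arbitrary byte (/s lets . match newline) and drop it.
    $bytes =~ s/($allowed)|./defined $1 ? $1 : ''/gse;
    return $bytes;
}

# \x01 and \xFF are outside the allowed set; \x82\xA0 is a valid two-byte pair.
my $cleaned = strip_disallowed("ab\x01\x82\xA0\xFFcd");
```

Because the allowed alternative is tried first at every position, a two-byte character is kept whole rather than being split into bytes. One caveat of this sketch: if a lead byte that is not allowed gets dropped, the byte after it is then judged on its own.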

Re^3: regex: how to negate a set of character ranges?
by dynamo (Chaplain) on May 02, 2007 at 04:07 UTC
    Is this still not working as expected when you combine even just a couple of the ranges? I don't think that using multiple ranges with the alternation (|) operator does what you want once it's expanded inside the character class brackets, where | is just a literal character.

    Unless performance is a really big problem, if you can't or don't want to combine the classes, try storing each class string in an array and looping over it, running the substitution once per character class on the source text. It'll get the job done. Good luck!
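
    A minimal sketch of that one-pass-per-class loop, assuming the classes are single-byte ranges stored as strings. The ranges in `@classes` and the helper name `strip_classes` are hypothetical; the original poster's actual list is not shown here.

```perl
use strict;
use warnings;

# ASSUMPTION: illustrative single-byte ranges, not the poster's real list.
my @classes = ('\x00-\x1F', '\x7F', '\xA1-\xDF');

sub strip_classes {
    my ($text, @ranges) = @_;
    for my $range (@ranges) {
        # One pass per range: delete every character that falls inside it.
        $text =~ s/[$range]//g;
    }
    return $text;
}

my $result = strip_classes("a\x01b\x7Fc\xA5d", @classes);
```

    Note that this suits deleting the listed ranges. Running the loop with negated classes ([^...]) would remove, on each pass, everything outside that one range, so only characters common to all ranges would survive.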