If nothing else works, you could fall back on one of two things: (1) preprocess the data, replacing the characters that cause the problem with something that sorts the way you want, and convert them back afterwards; the Schwartzian Transform (just search for the term if you don't know what it is) is one way to do this;
(2) write your own comparison subroutine and pass it to sort in place of the default cmp comparison.
Depending on your real data, either solution might be practical or almost impossible. Both are also likely to be slower than a plain sort, but if your data is not too large, this might not be a problem.
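Here is a rough sketch of both ideas. I don't know your real data, so everything specific here is an assumption: I pretend the troublesome character is "_" and that you want it to sort as if it were a space. The sample list and the names sort_key and by_key are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data; replace with your own.
my @data = qw(a_b aXb a1b);

# Assumption: "_" should sort as if it were a space.
# Returns a modified copy to compare on; the original string is untouched.
sub sort_key {
    my ($s) = @_;
    (my $key = $s) =~ tr/_/ /;
    return $key;
}

# (1) Schwartzian Transform: compute each key once, sort on the key,
#     then pull the original strings back out.
my @sorted_st =
    map  { $_->[1] }
    sort { $a->[0] cmp $b->[0] }
    map  { [ sort_key($_), $_ ] }
    @data;

# (2) Custom comparison subroutine: simpler, but recomputes the key
#     on every comparison.
sub by_key { sort_key($a) cmp sort_key($b) }
my @sorted_cmp = sort by_key @data;

print "@sorted_st\n";   # a_b a1b aXb
print "@sorted_cmp\n";  # a_b a1b aXb
```

With the transform the original strings ride along next to their keys, so there is no separate step to convert the data back. A plain sort on this list would give "a1b aXb a_b" instead.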