in reply to Re: What's the 'M-' characters and how to filter/correct them?

binmode STDIN, ':utf8';

See Encode, as well as perlunitut and perluniintro.

But the data shown doesn't seem to be UTF-8. If it were, this

DepM-ssito Centralizado

would instead be

DepM-CM-3sito Centralizado

So the data is in some ISO-8859 variant. In ISO-8859-1, for instance, ó is chr(243), which is chr(ord('s') | 128) (hence the output as M-s), and the character with the high bit set in

London andM- NewYork

is most likely chr(160), i.e. a non-breaking space - chr(ord (' ') | 128).
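
To check that reasoning, here is a minimal sketch that renders a byte string roughly the way cat -v does. The helper name cat_v is made up for this illustration, and it assumes the OP's output came from cat -v or a tool using the same M- notation, which the thread doesn't confirm.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: approximate `cat -v` rendering of a byte string.
# Bytes >= 128 are shown as "M-" followed by the byte with the high bit cleared.
sub cat_v {
    my ($bytes) = @_;
    my $out = '';
    for my $o (unpack 'C*', $bytes) {
        # Plain tab and newline pass through unchanged.
        if ($o == 9 || $o == 10) { $out .= chr $o; next }
        $out .= 'M-' if $o >= 128;
        my $low = $o & 127;
        $out .= $low == 127 ? '^?'
              : $low < 32   ? '^' . chr($low + 64)
              :               chr $low;
    }
    return $out;
}

my $word = "Dep\x{f3}sito";            # "Depósito" as a character string
my $city = "London and\x{a0}NewYork";  # contains a non-breaking space

print cat_v(encode('ISO-8859-1', $word)), "\n";  # DepM-ssito
print cat_v(encode('UTF-8',      $word)), "\n";  # DepM-CM-3sito
print cat_v(encode('ISO-8859-1', $city)), "\n";  # London andM- NewYork
```

If the input really is ISO-8859-1, then binmode STDIN, ':encoding(ISO-8859-1)' would probably be the layer to try instead of ':utf8'.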

perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'

Re^3: What's the 'M-' characters and how to filter/correct them?
by 1nickt (Canon) on Jan 19, 2016 at 10:34 UTC

    Indeed. Thanks for clarifying that. I didn't mean to say the OP's data was UTF-8; it was just an example using a common encoding.

    The way forward always starts with a minimal test.