Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I ran across this string in a document:

\xE2\x80\x94
I did tr/\x94/"/, and got no result. (Should have replaced \x94)
What is the above code/string and how do I get rid of it?

Re: Getting rid of \x junk
by hippo (Archbishop) on Jan 28, 2021 at 22:42 UTC
Re \x {} junk
by Anonymous Monk on Jan 30, 2021 at 14:24 UTC

    I just posted this two days ago:

    " I ran across this string in a document: \xE2\x80\x94 I did tr/\x94/"/, and got no result. (Should have replaced \x94). What is the above code/string and how do I get rid of it? "
    As long as word processors keep getting "made better", this question and situation will keep coming up.
    As I said, I tried tr/\x94/"/ and got no result. So what does "\xE2\x80\x94" mean?
    Has anyone written a program that removes all occurrences of "\x*" from large documents?

    Is there a document that tells how to fix/convert this?

      I think you might be looking for the solution others gave me just recently for almost exactly the same issue. In my case, the trick turned out to be a two-step process. You will find the details of the solution here:

      Perl's encoding versus UTF8 octets

      The crux of it was this:

      $mytext =~ s!\\x(..)!chr(hex($1))!ge;
      my $newcode = decode('utf8', $mytext);

      Note the substitution followed by the decode. decode() comes from the Encode module, so the script needs use Encode qw(decode); near the top.
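
      As a minimal sketch of those two steps, assuming the document contains the literal text \xE2\x80\x94 (a backslash, an x and two hex digits, three times over) rather than three raw bytes: E2 80 94 is the UTF-8 encoding of U+2014 (EM DASH). That would also explain why tr/\x94/"/ found nothing, since the text then holds no character 0x94 at all, only the characters "\", "x", "9" and "4". The sample string and the variable names $text, $bytes, $chars and $clean are made up for illustration, and the ASCII hyphen is just one possible replacement.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumption: the input holds literal \xHH escapes, not raw bytes.
# Single quotes keep the backslashes as ordinary characters.
my $text = 'before\xE2\x80\x94after';

# Step 1: turn each \xHH escape into the byte it names.
(my $bytes = $text) =~ s/\\x([0-9A-Fa-f]{2})/chr(hex($1))/ge;

# Step 2: decode the UTF-8 octets into Perl characters.
my $chars = decode('UTF-8', $bytes);

# The three bytes E2 80 94 become one character, U+2014 (EM DASH),
# so 0x94 was only the last byte of that character.
printf "U+%04X\n", ord(substr($chars, 6, 1));   # prints "U+2014"

# Replace the decoded character, here with a plain ASCII hyphen.
(my $clean = $chars) =~ tr/\x{2014}/-/;
print "$clean\n";                               # prints "before-after"
```

      If the file instead holds the three raw bytes, step 1 is unnecessary and the decode alone should be enough. Either way, the replacement has to target the decoded character \x{2014}, not the single byte \x94.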

      Blessings,

      ~Polyglot~