Re^3: What is the proper way to read non-ANSI data

by afoken (Chancellor)
on Sep 14, 2015 at 19:42 UTC ( [id://1141967]=note )


in reply to Re^2: What is the proper way to read non-ANSI data
in thread What is the proper way to read non-ANSI data

Notepad++. The characters display as –

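Notepad++, like all other programs, can only guess the encoding of plain text files. Some other file formats, like HTML, may contain more information about the encoding used. Other file formats are specified to use one fixed encoding, often UTF-8. (Java sources are not a safe example, IIRC: javac has traditionally used the platform default encoding unless -encoding is given.)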

So, Notepad++ may just guess wrong. Check in the status bar which encoding Notepad++ guessed (probably ANSI). Use the Encoding menu to switch (not convert!) the encoding.
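To illustrate what a wrong guess looks like, here is a minimal sketch using the core Encode module. It assumes the character in the file was an en dash (U+2013) stored as UTF-8 and that the editor guessed Windows-1252 ("ANSI" on a western Windows); that assumption matches the "–" quoted above, but other ANSI code pages would garble it differently.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Assumption: the original character was an en dash, U+2013.
my $original = "\x{2013}";

# What is actually stored on disk: the UTF-8 bytes E2 80 93.
my $bytes = encode('UTF-8', $original);

# What a program shows when it guesses Windows-1252 instead of UTF-8:
# each of the three bytes becomes one character.
my $garbled = decode('cp1252', $bytes);

# Print code points rather than the characters themselves, so the
# output does not depend on the terminal's own encoding.
print join(' ', map { sprintf 'U+%04X', ord } split //, $garbled), "\n";
# prints: U+00E2 U+20AC U+201C

# Undoing the wrong guess: back to bytes via cp1252, then decode as UTF-8.
my $repaired = decode('UTF-8', encode('cp1252', $garbled));
print $repaired eq $original ? "round trip ok\n" : "round trip failed\n";
```

The three code points are "â", "€" and a left curly quote, which is exactly the garbage quoted above. Switching the encoding in the editor does the same thing as the last step: the bytes stay untouched, only their interpretation changes.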

A trick that works quite often is to write a Byte Order Mark ("\x{FEFF}") as the first character of any file that is encoded in some Unicode encoding, including UTF-8. It is not strictly required for UTF-8, but it helps most programs, including Notepad++, to guess the encoding right.
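A minimal sketch of that trick in Perl, assuming the data is already a decoded character string. The function name write_utf8_with_bom and the file name bom-demo.txt are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: write $text to $path as UTF-8, preceded by a BOM.
sub write_utf8_with_bom {
    my ($path, $text) = @_;
    open(my $fh, '>:encoding(UTF-8)', $path)
        or die "Cannot open $path: $!";
    # U+FEFF is written first; the encoding layer turns it into EF BB BF.
    print {$fh} "\x{FEFF}", $text;
    close($fh) or die "Cannot close $path: $!";
    return;
}

# Example call with an en dash in the text.
write_utf8_with_bom('bom-demo.txt', "en dash: \x{2013}\n");
```

The BOM is printed like any other character, so the same code works for other Unicode encodings by changing the layer. When reading such a file back, remember that the BOM arrives as the first character of the first line.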

In most cases, the BOM does not hurt. One exception is any kind of Unix script, which must start with "#!" and not with a BOM. A BOM makes the script unrecognisable to the kernel, leading to bizarre error messages.
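If a script refuses to start for no obvious reason, a quick check of its first bytes can help. This is only a sketch: the function name starts_with_utf8_bom is invented here, and it looks for the UTF-8 BOM only (bytes EF BB BF), not for the UTF-16 or UTF-32 variants.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the file begins with a UTF-8 BOM.
sub starts_with_utf8_bom {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    # Read at most three raw bytes; shorter files just yield fewer.
    my $count = read($fh, my $head, 3);
    close($fh);
    die "Cannot read $path: $!" unless defined $count;
    return $head eq "\xEF\xBB\xBF" ? 1 : 0;
}

# Check every file named on the command line.
for my $file (@ARGV) {
    print "$file: ", (starts_with_utf8_bom($file) ? "BOM found" : "no BOM"), "\n";
}
```

Reading in :raw mode matters here, because an encoding layer could hide or translate the very bytes the check is looking for.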

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)