
Re: What is the proper way to read non-ANSI data

by CountZero (Bishop)
on Sep 13, 2015 at 12:12 UTC


in reply to What is the proper way to read non-ANSI data

What do you mean by "is not read correctly"?

How did you find out?

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

My blog: Imperial Deltronics
Re^2: What is the proper way to read non-ANSI data
by freonpsandoz (Beadle) on Sep 13, 2015 at 21:03 UTC
    I redirect the output to a UTF-8 file and compare it to the redirected output of dumptorrent.exe in Notepad++. The characters display as â€“ instead of –, for example.
      Notepad++. The characters display as –

      Notepad++, like all other programs, can only guess the encoding of plain text files. Some other file formats, like HTML, may contain more information about the encoding used. Other file formats are always encoded as UTF-8, like Java sources (IIRC).

      So, Notepad++ may just guess wrong. Check in the status bar which encoding Notepad++ guessed (probably ANSI). Use the Encoding menu to switch (not convert!) the encoding.
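
      To make the "guessing wrong" effect concrete, here is a small sketch in Python (chosen only for illustration; the same applies in any language). It assumes that "ANSI" means Windows-1252, which is the usual case on Western European Windows systems but is not confirmed anywhere in this thread. The bytes on disk never change; only the interpretation does.

```python
# An en dash (U+2013), the character mentioned in the thread.
text = "\u2013"

# UTF-8 stores it as three bytes.
raw = text.encode("utf-8")
print(raw)  # b'\xe2\x80\x93'

# Decoding with the correct encoding recovers the single character.
as_utf8 = raw.decode("utf-8")

# Decoding the same bytes as Windows-1252 (assumed to be the "ANSI"
# code page here) turns each byte into a character of its own.
as_ansi = raw.decode("cp1252")

print(len(as_utf8), len(as_ansi))  # 1 3
print(ascii(as_ansi))              # '\xe2\u20ac\u201c'
```

      The three characters in the last line are a-circumflex, the euro sign and a left double quotation mark, which is the typical garbage an editor shows when it treats UTF-8 as ANSI.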

      A trick that works quite often is to write a Byte Order Mark ("\x{FEFF}") as the first character of any file that is encoded in some Unicode encoding, including UTF-8. It is not strictly required for UTF-8, but it helps most programs, including Notepad++, to guess the encoding correctly.
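
      The BOM trick can be sketched as follows, again in Python for illustration. The helper name write_utf8_with_bom and the file name are hypothetical, not taken from the original question. The point is only that U+FEFF is written as the very first character through a UTF-8 encoding layer, which puts the bytes EF BB BF at the start of the file.

```python
import os
import tempfile

BOM = "\ufeff"  # the same code point as "\x{FEFF}" in Perl notation


def write_utf8_with_bom(path, text):
    """Write text as UTF-8, preceded by a Byte Order Mark (hypothetical helper)."""
    # newline="" keeps line endings exactly as given.
    with open(path, "w", encoding="utf-8", newline="") as fh:
        fh.write(BOM)   # must be the first character in the file
        fh.write(text)


# Hypothetical output file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "with_bom.txt")
write_utf8_with_bom(path, "en dash: \u2013\n")

# Inspect the raw bytes: the file starts with the UTF-8 form of the BOM.
with open(path, "rb") as fh:
    data = fh.read()
print(data[:3])  # b'\xef\xbb\xbf'
```

      Python also offers the "utf-8-sig" codec, which adds the BOM on writing and removes it on reading; the explicit version above just mirrors the description more directly.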

      In most cases, the BOM does not hurt. An exception is any kind of Unix script, which must start with "#!" and not with a BOM. A BOM makes the script unrecognisable to the kernel, leading to bizarre error messages.
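
      A quick way to check for this problem is to look at the first bytes of the script. The sketch below (Python, with hypothetical function names) detects a UTF-8 BOM sitting in front of "#!" and strips it. It only handles the UTF-8 form of the BOM, not UTF-16 or UTF-32.

```python
import codecs


def shebang_hidden_by_bom(data: bytes) -> bool:
    """True if a UTF-8 BOM sits directly in front of a '#!' line."""
    return data.startswith(codecs.BOM_UTF8 + b"#!")


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if there is one."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


good = b"#!/bin/sh\necho hello\n"
bad = codecs.BOM_UTF8 + good  # what an editor may save when asked for "UTF-8 with BOM"

print(shebang_hidden_by_bom(good))  # False
print(shebang_hidden_by_bom(bad))   # True
print(strip_utf8_bom(bad) == good)  # True
```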

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
