Another Encoding decode query

dominic01 has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Another Encoding decode query by soundX (Acolyte) on Feb 08, 2015 at 08:25 UTC
I'm relatively new to Perl, so one of the monks may have a better idea, but have you tried Encoding::FixLatin? I recently used it to fix similar encoding issues.
Re: Another Encoding decode query by ikegami (Patriarch) on Feb 08, 2015 at 15:59 UTC
"+Ãƒâ€šÃ‚Â¦" is supposed to represent "ö"? So at least 10 bytes? No single encoding would produce that. Even in UTF-8, ö takes only two bytes. Repeatedly encoding with UTF-8 doesn't produce anything like what you have either. (There would be repetition.)
Re^2: Another Encoding decode query by dominic01 (Sexton) on Feb 10, 2015 at 03:10 UTC
Between + and | there were actually 4 characters. When I pasted the characters into PerlMonks, it added 4 more. Hence you counted 10 characters in total.
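The point about repetition can be checked directly. The following is a minimal sketch, not code from the thread: the helper names `hex_bytes` and `encode_repeatedly` are made up for illustration. It encodes "ö" as UTF-8 several times, treating the output bytes as Latin-1 characters before each new round, and prints the bytes after every round.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hex-dump a byte string as space-separated pairs, e.g. "c3 b6".
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', unpack '(H2)*', $bytes;
}

# Encode $text as UTF-8 $rounds times. After each round the resulting
# bytes are treated as Latin-1 characters again, which is what an
# accidental re-encode does. Returns one hex dump per round.
sub encode_repeatedly {
    my ($text, $rounds) = @_;
    my @dumps;
    for (1 .. $rounds) {
        $text = encode('UTF-8', $text);
        push @dumps, hex_bytes($text);
    }
    return @dumps;
}

my @dumps = encode_repeatedly("\x{F6}", 3);    # U+00F6 is "ö"
print "round $_: $dumps[$_ - 1]\n" for 1 .. @dumps;
# round 1: c3 b6
# round 2: c3 83 c2 b6
# round 3: c3 83 c2 83 c3 82 c2 b6
```

Each extra round turns every non-ASCII byte into a pair starting with c3 or c2, so the c3 83 / c2 xx pattern keeps repeating.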
Re: Another Encoding decode query by i5513 (Pilgrim) on Feb 08, 2015 at 10:56 UTC
Hello. Regarding your question, "These logs are written by different systems and I do not know what kind of encoding this is?": how did you get such a file? It looks double encoded. You must ensure that every system that writes to the file uses the same encoding (UTF-8 may be fine in your case). Please tell us if you get it working. I think a double-encoded file is very difficult to decode correctly without manual conversions. I would start by filtering for lines that contain unexpected characters (add all the letters you expect in your file): `chomp; print if (/[^a-zA-Z0-9ÄÖÜäöüß,_.\s-]/);` and then fix specific words with the s command. It is always interesting to know how such a file came to life! I have had some of them because of botched substitutions (wrong iconv calls). Regards
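The filtering step could be sketched as a small function. This is an illustration under assumptions rather than i5513's actual script: the name `suspicious_lines` and the sample lines are invented, the input is assumed to be UTF-8, and the hyphen sits at the end of the character class so that it is a literal hyphen rather than a range.

```perl
use strict;
use warnings;
use utf8;                  # this source file contains literal ä, ö, ü ...
use Encode qw(decode);

# Return the raw lines that contain anything outside the expected set.
# Extend the character class with every letter you expect in the logs.
sub suspicious_lines {
    my @raw_lines = @_;
    my @hits;
    for my $raw (@raw_lines) {
        # Invalid UTF-8 decodes to U+FFFD, which is flagged as well.
        my $line = decode('UTF-8', $raw);
        push @hits, $raw if $line =~ /[^a-zA-Z0-9ÄÖÜäöüß,_.\s-]/;
    }
    return @hits;
}

# Hypothetical sample: one intact line and one damaged line.
my @sample = (
    "K\xc3\xb6hler, A.\n",
    "K+\xc3\x82\xc2\xa6hler, A.\n",
);
print for suspicious_lines(@sample);    # prints only the damaged line
```

Save the script itself as UTF-8, otherwise the letters in the character class will not mean what they appear to mean.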
Re: Another Encoding decode query by Anonymous Monk on Feb 08, 2015 at 09:12 UTC
It's very, very easy: get the raw binary data and Data::Dump::dd it. Then do the same to "Köhler" as you encode it with every available encoding from Encode->encodings(q{:all}). When you find two dumps that match, you've found your encoding; you're an Encode::Detectiv.....
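A rough sketch of that brute-force search follows. The function name `matching_encodings` is hypothetical, and a plain string comparison stands in for comparing Data::Dump output by eye. It is shown against the intact bytes of "Köhler" posted later in the thread; the garbled bytes can be passed as the target in the same way.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Return the names of all encodings known to Encode that turn $text
# into exactly the byte string $target.
sub matching_encodings {
    my ($text, $target) = @_;
    my @matches;
    local $SIG{__WARN__} = sub { };    # some encodings warn; ignore that
    for my $name (Encode->encodings(':all')) {
        # FB_CROAK makes unmappable characters die instead of being
        # silently replaced; eval turns that into "no match".
        my $bytes = eval {
            encode($name, $text, Encode::FB_CROAK() | Encode::LEAVE_SRC());
        };
        next unless defined $bytes;
        push @matches, $name if $bytes eq $target;
    }
    return @matches;
}

my $intact = pack 'H*', '4bc3b6686c6572';    # "Köhler" in the hex view
print "$_\n" for matching_encodings("K\x{F6}hler", $intact);
```

An empty result means that no single encoding known to Encode produces the target bytes from the expected text.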
Re: Another Encoding decode query by pme (Monsignor) on Feb 08, 2015 at 10:08 UTC
What is your font setting in Notepad++? Have you tried 'Arial Unicode MS'?
Re: Another Encoding decode query by Anonymous Monk on Feb 08, 2015 at 09:15 UTC
(shrugs) None of the encoders known to Perl can decode it to anything meaningful. This thing was probably double encoded (perhaps by Notepad++). What does it look like in binary?
Re: Another Encoding decode query by dominic01 (Sexton) on Feb 10, 2015 at 03:08 UTC
I agree with many of you. The file is probably double encoded. From my analysis, I also noted that the "cat" or "more" redirection might have changed the encoding further. I am trying different options and will come back with my findings. Note: when I typed the encoding of the character "ö", the PerlMonks interface added 4 more characters, hence it looks odd in my original post.
Re: Another Encoding decode query by dominic01 (Sexton) on Feb 10, 2015 at 05:12 UTC
Here is the hex view: "Köhler": "4b c3 b6 68 6c 65 72" "K+Ãƒâ€šÃ‚Â¦hler": "4b 2b c3 83 c2 83 c3 a2 c2 80 c2 9a c3 83 c2 82 c3 82 c2 a6 68 6c 65 72" In another case I noted the following. How do I decode it to "é"? JosÃƒÂ©: c3 83 c2 83 c3 82 c2 a9 José: c3 a9
Re^2: Another Encoding decode query by dominic01 (Sexton) on Feb 10, 2015 at 05:14 UTC
Here it is with proper formatting:
"Köhler": "4b c3 b6 68 6c 65 72"
"K+Ãƒâ€šÃ‚Â¦hler": "4b 2b c3 83 c2 83 c3 a2 c2 80 c2 9a c3 83 c2 82 c3 82 c2 a6 68 6c 65 72"
JosÃƒÂ©: c3 83 c2 83 c3 82 c2 a9
José: c3 a9
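For the "é" case, the posted bytes look like UTF-8 that was misread as Latin-1 and encoded again more than once. A minimal sketch of undoing that, assuming each extra layer really is a Latin-1 misreading, is given below. The helpers `try_utf8` and `unwind_utf8` are hypothetical names, not part of Encode.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Strictly decode $bytes as UTF-8; returns undef if they are not valid.
sub try_utf8 {
    my ($bytes) = @_;
    return eval {
        decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
}

# Peel off layers of "UTF-8 bytes misread as Latin-1 and encoded again".
# Stops as soon as one more round would no longer yield valid UTF-8.
sub unwind_utf8 {
    my ($bytes) = @_;
    while (1) {
        my $chars = try_utf8($bytes);
        last unless defined $chars;          # not UTF-8 at all
        last if $chars =~ /[^\x00-\xff]/;    # does not fit into Latin-1
        my $candidate = encode('ISO-8859-1', $chars);
        last if $candidate eq $bytes;        # pure ASCII, nothing to undo
        last unless defined try_utf8($candidate);
        $bytes = $candidate;
    }
    return $bytes;
}

my $garbled = pack 'H*', 'c383c283c382c2a9';    # bytes posted for "é"
print join(' ', unpack '(H2)*', unwind_utf8($garbled)), "\n";
# c3 a9
```

This is a heuristic: text that legitimately contains a sequence such as "Ã©" would be "repaired" by mistake, so it should only be applied to lines already known to be damaged. The "Köhler" bytes lose only one layer this way, because the next layer contains U+201A, which is not a Latin-1 character, and the "+" is not explained by repeated UTF-8 encoding at all. That sample presumably went through other conversions and may need the manual fixes suggested in the replies.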