golgiapparatus has asked for the wisdom of the Perl Monks concerning the following question:

Hello all, An old developer encoded log files using xor ^ 255. I am running into lots of trouble reading the string into a variable (from a file) using utf-8 encoding. Some of the characters cause problems in the output because of encoding differences. I first open the txt file with a Perl file handle, loop through the string, and then xor each character. Here is a snapshot of my code:
my $pLog = $_;
$pLog = encode("utf-8",$pLog);
my @strUTF = split(//,$pLog);
for(my $i = 0; $i < scalar(@strUTF); $i++){
    print chr(ord($strUTF[$i]) ^ 255);
}
What can I do better here to modify the char ^ 255 for outputting to a new text file? Thank you for the help.

Replies are listed 'Best First'.
Re: un obfuscate logfiles with xor
by LanX (Saint) on Jun 25, 2013 at 22:18 UTC
    Basically, if your data was encoded on a per-byte basis, then it doesn't make much sense to treat it as Unicode before it has been decoded.

    You should transform it back per byte and then treat it as unicode (if necessary)
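
    A minimal sketch of that order of operations, assuming the old tool XORed the UTF-8 bytes of each line with 255. The sample text and the variable names are made up for illustration; real input would come from the log file, read as raw bytes.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample: a UTF-8 log line obfuscated byte by byte with XOR 255.
my $plain_text  = "caf\x{e9} log entry";            # character string
my $plain_bytes = encode('UTF-8', $plain_text);     # bytes as the old tool would have seen them
my $obfuscated  = $plain_bytes ^ ("\xFF" x length $plain_bytes);

# Step 1: undo the XOR on the raw bytes (no encode/decode yet).
my $bytes = $obfuscated ^ ("\xFF" x length $obfuscated);

# Step 2: only now interpret the bytes as UTF-8, if the original text was UTF-8.
my $text = decode('UTF-8', $bytes);

binmode STDOUT, ':encoding(UTF-8)';
print "$text\n";
```

    The XOR is applied to the whole byte string at once with Perl's string-wise ^ operator; decode is called only afterwards, and only if the original logs really were UTF-8.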

    And posting your data into HTML doesn't really help!

    try something like

    DB<200> $str=join "","äöüÄÖÜß"
    DB<201> printf '\\x%X,',$_ for unpack 'C*',$str
     => ""
    \xC3,\xA4,\xC3,\xB6,\xC3,\xBC,\xC3,\x84,\xC3,\x96,\xC3,\x9C,\xC3,\x9F,

    or even

    DB<202> unpack 'H*',$str
     => "c3a4c3b6c3bcc384c396c39cc39f"

    this should already be a strong hint on how to solve your problem ...
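
    To spell the hint out, the following sketch shows what can go wrong when a string is run through encode before the XOR. It uses a single made-up byte (0xBE, which is 'A' XOR 255) rather than real log data, and inspects the results with unpack 'H*' as above.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One obfuscated byte: 'A' (0x41) XOR 255 is 0xBE.
my $raw = "\xBE";

# Byte-wise: XOR the byte exactly as it was read.
my $good = chr(ord($raw) ^ 255);

# Encode first: 0xBE becomes the two bytes 0xC2 0xBE, and both get XORed.
my $encoded = encode('UTF-8', $raw);
my $bad     = join '', map { chr(ord($_) ^ 255) } split //, $encoded;

printf "good: %s\n", unpack 'H*', $good;   # 41
printf "bad:  %s\n", unpack 'H*', $bad;    # 3d41
```

    Every byte above 0x7F grows into two bytes when encoded, so the output picks up a stray extra character ('=' in this case) for each of them.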

    Thank you!

    Cheers Rolf

    ( addicted to the Perl Programming Language)

Re: un obfuscate logfiles with xor
by Laurent_R (Canon) on Jun 25, 2013 at 21:16 UTC

    Hmm, I don't quite understand your problem. If your file is encoded with a xor 255 (or a xor with anything, for that matter), and if I understood your problem correctly, you should just need to apply the same xor to retrieve the original value.
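
    A small sketch of that round trip, using an invented sample line (not taken from the real logs): applying the same XOR mask twice gives back the original bytes.

```perl
use strict;
use warnings;

# Hypothetical sample line, not taken from the real logs.
my $original = "<134> sample log line";

# String-wise XOR against a mask of 0xFF bytes of the same length.
my $mask      = "\xFF" x length $original;
my $scrambled = $original ^ $mask;
my $restored  = $scrambled ^ $mask;

print $restored eq $original ? "round trip ok\n" : "mismatch\n";
```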

    Or did I understand you wrongly?

      Here is an example ÃÎÌËÁßÍÏÎÌÐÏÊÐÎÌßÏËÒÏÇÒÊÆßÓœž“ÏÑ–‘™ÁߑЯÌÇÌÑ‘ŠŒ–šÑœ’ß™ Thank you

        use strict;
        use warnings;
        my $s = q<ÃÎÌËÁßÍÏÎÌÐÏÊÐÎÌßÏËÒÏÇÒÊÆßÓœž“ÏÑ–‘™ÁߑЯÌÇÌÑ‘ŠŒ–šÑœ’ß™>;
        print pack "C*", map { $_ ^ 255 } unpack "C*", $s;
        <134> 2013/05/13 04-08-59 <local0.info> nu9383.nuspire.com pf

        This only works if I run the code from an ASCII encoded file (that is, one not saved as utf-8); there doesn't seem to be any utf-8 involved here.

        unpack "C*", $s; returns a list consisting of the value of each byte in $s. pack with "C*" takes a list of bytes and turns it into a string. Read perlpacktut for more information on that.
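
        As a small illustration of those two calls, here are the first five characters of the sample written as hex escapes, on the assumption that they are the single bytes C3 CE CC CB C1:

```perl
use strict;
use warnings;

# First five bytes of the obfuscated sample, written as escapes
# (assumed to be single bytes, as in a Latin-1 style encoding).
my $s = "\xC3\xCE\xCC\xCB\xC1";

my @values  = unpack 'C*', $s;            # (195, 206, 204, 203, 193)
my @flipped = map { $_ ^ 255 } @values;   # (60, 49, 51, 52, 62)
my $decoded = pack 'C*', @flipped;        # the string "<134>"

print "$decoded\n";
```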

        Is that an example of raw input, or the output from the current iteration of your script? It would be helpful to know whether we have a clean fragment of raw logfile data.

Re: un obfuscate logfiles with xor
by Eily (Monsignor) on Jun 25, 2013 at 21:24 UTC

    I'm not really sure about that one; maybe an input example would help (if you have the other dev's code, you could make a data/encoded pair for us to try). But if he already uses utf-8 encoding, you either don't need to do that again, or you need to use utf-8 decoding instead.

    You should read about foreach and map to see how to work on an array without indexes (what you have is a C-style for loop, with the idea of pointers not far behind). For example: print join "", map { chr((ord $_) ^ 255) } split //, $_; would do pretty much the same as your code, except for the encode part.
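
    Putting the map version together with file handling, a sketch might look like the following. The file names are hypothetical, and it assumes the whole file (newlines included) was XORed with 255; if only the contents of each line were obfuscated, it would need to read line by line and leave the newline alone. Both handles use the :raw layer so that no encoding translation happens on the way in or out.

```perl
use strict;
use warnings;

# Hypothetical file names; adjust to the real paths.
my ($in_name, $out_name) = ('encoded.log', 'decoded.log');

# Create a small obfuscated input so the sketch can run on its own.
open my $seed, '>:raw', $in_name or die "Cannot write $in_name: $!";
print {$seed} join '', map { chr(ord($_) ^ 255) } split //, "first line\nsecond line\n";
close $seed or die "Cannot close $in_name: $!";

open my $in,  '<:raw', $in_name  or die "Cannot read $in_name: $!";
open my $out, '>:raw', $out_name or die "Cannot write $out_name: $!";

# Assumption: newlines are obfuscated too, so read fixed-size chunks, not lines.
while (read $in, my $chunk, 4096) {
    print {$out} join '', map { chr(ord($_) ^ 255) } split //, $chunk;
}

close $in;
close $out or die "Cannot close $out_name: $!";
```

    If the decoded text turns out to be UTF-8, it can be decoded after this step, once the bytes are back in their original form.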