j0e has asked for the wisdom of the Perl Monks concerning the following question:

Before I begin please let me state that I am doing this for learning purposes, and am well aware of the weakness of XOR encryption. That said, I am having a hell of a time getting a simple XOR routine to work. Here is the code that I have written so far:
    #!/usr/bin/perl
    use strict;

    if(@ARGV < 3) {
        print "Usage: $0 <key> <input file> <output file>\n";
        exit(0);
    }

    my $key = $ARGV[0];
    open(IN, $ARGV[1]) or die "Can't open infile";
    open(OUT, ">$ARGV[2]") or die "Can't open outfile";

    my $kp = 0;
    while(<IN>) {
        for my $i (0..length($_)) {
            if($kp > length($key)) { $kp = 0; }
            my $kc = substr($key, $kp++, 1);
            my $char = substr($_, $i, 1);
            $char ^= $kc;
            print OUT $char;
        }
    }
    close(IN);
    close(OUT);
Seems simple enough, but when I encrypt a file and then attempt to decrypt it with the same key, *some* of the file decrypts correctly, but most of it is garbage. A relatively simple task in C, this has had me beating my head against a wall for some time. I'm not exactly a Perl kung-fu master, so if anyone has any insight as to what I might be doing wrong, I'd greatly appreciate hearing it.

Replies are listed 'Best First'.
(tye)Re: xor encrypt-decrypt routine
by tye (Sage) on Jan 21, 2001 at 07:48 UTC

    You should use binmode() on your input and output. This might explain your problem. I'd also not read "a line at a time" since during decryption the line endings will be encrypted. No other errors pop out at me so give that a try.

    You can tell you converted this from C code. Perl will let you xor entire strings so you could make this faster and simpler, something like:

    #!/usr/bin/perl -w
    use strict;

    if(@ARGV < 3) {
        print "Usage: $0 <key> <input file> <output file>\n";
        exit(0);
    }

    my $key = $ARGV[0];
    open(IN, $ARGV[1]) or die "Can't read $ARGV[1]: $!\n";
    open(OUT, ">$ARGV[2]") or die "Can't write $ARGV[2]: $!\n";
    binmode(IN);
    binmode(OUT);

    my $in;
    # Read one key-length chunk at a time and xor the whole chunk at once.
    while( sysread(IN,$in,length($key)) ) {
        # The last chunk may be shorter than the key, so trim the key to match.
        print OUT $in^substr($key,0,length($in));
    }
    close(IN);
    close(OUT);

    I tested this and it works for me.
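A quick way to check the whole-string xor without touching any files is a round trip in memory. This is only a sketch: the helper name xor_with_key and the sample strings are made up for illustration and are not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: xor $data against $key repeated to the same length.
sub xor_with_key {
    my ($data, $key) = @_;
    die "empty key\n" unless length $key;
    # Repeat the key often enough, then trim it to the length of the data.
    my $pad = substr($key x (int(length($data) / length($key)) + 1), 0, length $data);
    # Both operands are strings, so ^ works bytewise on the whole string.
    return $data ^ $pad;
}

my $plain  = "line one\nline two\n";
my $cipher = xor_with_key($plain, 'secret');
my $back   = xor_with_key($cipher, 'secret');
print $back eq $plain ? "round trip ok\n" : "round trip FAILED\n";
```

Applying the same function twice with the same key should give back the original string, newlines included.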

    Update: I tested your code under Win98 and the first part of the file decrypted properly and then it fell apart. Looking at the encrypted file I find that the decryption fell apart at the first character that got encrypted into a newline. So I may have actually diagnosed the problem correctly.

    Also, interestingly enough, when I was in college and worked for their computer department, I helped investigate some cracking that a student was doing. The student had encrypted the source code to their tools using something very much like this. They didn't keep the encrypt/decrypt source code on their account. So working with just the contents of some files, I was able to decrypt their files using fairly simple techniques. So treat this simple xoring as nothing more than a very insecure obscuring, not real encryption.

            - tye (but my friends call me "Tye")
      Could you explain this one to me? I think I am missing something.
      The way I see it, you have a target file which is unknown to you, and a random key which are XOR'd into the encrypted version. Wouldn't Random plus an unknown content (despite being ordered/logical) still be random?
      Did you go by the fact that you knew it was source code you were decrypting, so you knew to look for obvious text such as #define, #include, or main?

      If I took a file, and did some alphabet substitutions on it so that it no longer had a language form, and then XOR'd it to random data, would you still be able to decipher/unobscure it?
      thanks!

        Two main problems: First, the "key" is too short and so repeats a ton of times on a moderately large file so you have lots of opportunity to figure out parts of the key one place and use that knowledge a ton of other places. Second, you are xoring ASCII characters so it isn't that hard to recognize patterns.
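A minimal sketch of the first point, assuming the attacker can guess a few plaintext bytes (here the string '#include' at the start of a C file). The key 'k3y!', the sample text, and the helper name recover_key_fragment are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: xor a guessed plaintext against the ciphertext at a
# given offset. If the guess is right, the result is the key bytes used there.
sub recover_key_fragment {
    my ($cipher, $guess, $offset) = @_;
    return substr($cipher, $offset, length $guess) ^ $guess;
}

my $key    = 'k3y!';   # illustrative short key
my $plain  = "#include <stdio.h>\nint main(void) { return 0; }\n";
my $pad    = substr($key x (int(length($plain) / length($key)) + 1), 0, length $plain);
my $cipher = $plain ^ $pad;

# Guessing eight plaintext bytes exposes eight bytes of the repeating key.
print recover_key_fragment($cipher, '#include', 0), "\n";   # prints k3y!k3y!
```

Because the key repeats, the fragment recovered at one offset can then be tried at every other offset that lines up with the same key position.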

        If I wanted to do something like this I would:

        • Compress the clear text first.
        • Add a random pre-amble of random length since the compressed clear text will start with a predictable signature (see other threads here on how to get enough randomness).
        • Don't use the key directly. Use an MD5 hash of the key, for example.
        • Use the compressed clear text to modify what you are xoring with as you go.
        But I'm not a professional cryptographer and I strongly suspect that a professional cryptographer would be able to break such a scheme. You are really better off going with a recognized encryption algorithm.
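A rough sketch of just the last two points in the list (hash the key, and let the clear text feed back into what gets xored), leaving out compression and the random preamble. The function name chained_xor, the 16-byte block size, and the way each pad is derived from the previous one are assumptions made for illustration; this is a toy, not a vetted cipher.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5);

# Hypothetical sketch: xor 16-byte blocks against a pad that starts as the
# MD5 of the passphrase and is re-derived from each clear-text block.
sub chained_xor {
    my ($data, $passphrase, $decrypt) = @_;
    my $pad = md5($passphrase);            # 16 raw bytes, never the key itself
    my $out = '';
    for (my $pos = 0; $pos < length $data; $pos += 16) {
        my $block  = substr($data, $pos, 16);
        my $result = $block ^ substr($pad, 0, length $block);
        $out .= $result;
        # The clear text is the input when encrypting, the output when decrypting.
        my $clear = $decrypt ? $result : $block;
        $pad = md5($pad . $clear);         # next pad depends on the clear text so far
    }
    return $out;
}

my $plain  = "some text that is longer than one sixteen-byte block\n";
my $cipher = chained_xor($plain, 'passphrase', 0);
print chained_xor($cipher, 'passphrase', 1) eq $plain ? "ok\n" : "FAILED\n";
```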

                - tye (but my friends call me "Tye")
Re: xor encrypt-decrypt routine
by chromatic (Archbishop) on Jan 21, 2001 at 09:08 UTC
    Loop to length($_) - 1. The second argument to substr() is an *offset*. If there are 32 characters in the string, the offsets run from 0 to 31.

    That may not solve the entire issue, because I'm not sure if you get carriage returns when reading in a file on Windows. If you're not on Windows, it doesn't matter.

    Either way, you'll get one good line, then an extra character that throws off your key, and everything else will be garbage after that.
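To see the extra character, here is a small re-creation of the per-character loop with the loop bound as a switch. The function name xor_line and its third argument are made up for this sketch, and the key wrap uses >= so that only the loop bound differs between the two runs.

```perl
use strict;
use warnings;

# Hypothetical re-creation of the inner loop for a single line.
# $inclusive true  => loop over 0..length($line)      (one step too many)
# $inclusive false => loop over 0..length($line) - 1
sub xor_line {
    my ($line, $key, $inclusive) = @_;
    my $last = $inclusive ? length($line) : length($line) - 1;
    my ($out, $kp) = ('', 0);
    for my $i (0 .. $last) {
        $kp = 0 if $kp >= length $key;
        my $kc   = substr($key, $kp++, 1);
        my $char = substr($line, $i, 1);   # empty string when $i == length($line)
        $out .= $char ^ $kc;               # '' ^ $kc yields $kc: an extra byte
    }
    return $out;
}

my $line = "abc\n";
printf "0..length:     %d bytes out\n", length(xor_line($line, 'key', 1));   # 5
printf "0..length - 1: %d bytes out\n", length(xor_line($line, 'key', 0));   # 4
```

With the inclusive bound every line gains one byte and the key position advances one step further than the data, which matches the "one good line, then garbage" symptom.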

      Ahh, that did the trick. I simply changed the loop to:
      for my $i (0..(length($_) -1)) { ...
      and it worked like a charm. I'm not running Windows so I didn't get bitten by any carriage returns. Thanks!
Re: xor encrypt-decrypt routine
by adamsj (Hermit) on Jan 21, 2001 at 07:23 UTC
    What's happening to your newlines? Do you need a chomp? Or possibly an explicit newline at the end of the loop?

    On second thought, is your array subscript right? Or maybe you should have >= instead of > in the if statement?

    I think I've just disproved Jack Kerouac's old saying, "First thought, best thought."

    Please note: I've diddled this node about five times in the last three minutes.

      The newline chars in the file should be encrypted/decrypted with everything else. Using the >= operator instead of > is probably a good idea, as whilst doing a little "print debugging" I found that there was a NULL or something at the end of the $key variable. That didn't seem to fix it, but oddly enough I seem to get much better results when I use the following loop:
      while(my $char = getc(IN)) {
          if($kp >= length($key)) { $kp = 0; }
          my $kc = substr($key, $kp++, 0);
          $char ^= $kc;
          print OUT $char;
      }
      That encrypts/decrypts perfectly, but seems to end abruptly after running through 1k of data. Now I'm even more confused. What's the difference between this code and the code I posted before?

        You probably finally hit a point that encrypts to a CTRL-Z which is end-of-file under Windows unless you use binmode as I suggested below.
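One more thing that might be worth checking, offered as a possibility rather than a diagnosis: a loop of the form while(my $char = getc(IN)) tests the character for truth, and the character "0" is false in Perl, so the loop would also stop at the first "0" in the input. Note too that substr() with a length of 0 returns an empty string, and xoring with an empty string leaves the character unchanged. The sketch below compares a truth test with a defined() test on an in-memory filehandle; the function name count_chars and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many characters a getc loop sees before it stops.
sub count_chars {
    my ($data, $check_defined) = @_;
    open(my $fh, '<', \$data) or die "Can't open in-memory handle: $!\n";
    binmode($fh);
    my $n = 0;
    if ($check_defined) {
        # Stops only at end of file, where getc returns undef.
        while (defined(my $c = getc($fh))) { $n++ }
    } else {
        # Stops at end of file or at the first character "0".
        while (my $c = getc($fh)) { $n++ }
    }
    close($fh);
    return $n;
}

my $data = "ab0cd";
printf "truth test:   %d\n", count_chars($data, 0);   # 2
printf "defined test: %d\n", count_chars($data, 1);   # 5
```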

                - tye (but my friends call me "Tye")