jaiello has asked for the wisdom of the Perl Monks concerning the following question:

This program brute force decodes an RC4 encrypted message. On my Core2Quad 2.4ghz, it processed about 2000 iterations a second. A C programmer has his own similar brute force code that processes about 300,000 iterations a second. He advises that this is a clear example of compiled code over interpreted code. Since I learned most my perl from perlmonks, web sites and books, this is the most optimized code I could manage. It could be that the RC4 algorithm is much slower than the C lib or maybe there is some glaring element of my code that can be optimized. I would appreciate any suggestions and advice and would be really happy to prove my code is the problem and not perl. Thanks!
#!/perl/bin/perl use Crypt::RC4; $data1="C5C444220AD6FB08972792318300CC703814513EE013AF97B94FF9ACF23F9C +8F0E748C04B2E18BB5A1B491BB73E23EC9A4233B9BCDE4854FD03DCBAC4B3E9EA00F5 +BF7A3A119B0FF2E66C9DD96E7F4F0972959082601AA5DD202DFB0 5039CA4FDC280517E244353690C0DE1A"; $foo1=pack('H*',"$data1"); $dt=`/cygwin/bin/date`; chomp $dt; print "$dt Start \n"; $key=A; while ( 1 ){ $SKIP=0; $ct++; $ucrypt=RC4($key,$foo1); $clen = length($ucrypt); @uchar = split(//,$ucrypt); for($i=0; $i<($clen); $i++) { $val = ord($uchar[$i]); if($val>127) { $SKIP=1; last; } } if ( $ct == 10000 ){ $dt=`/cygwin/bin/date`; chomp $dt; print "$dt count $ct key: $key \n"; $ct=0; } if ($SKIP == 0 ){ print "matched using key:$key \n"; print "$ucrypt \n"; } $key++; } exit;

Replies are listed 'Best First'.
Re: RC4 cipher performance
by ikegami (Patriarch) on Nov 15, 2008 at 19:06 UTC

    You shell out to get the date? Change

    $dt=`/cygwin/bin/date`; chomp $dt;
    to
    $dt=localtime;

    Searching for a hi-bit character

    $clen = length($ucrypt); @uchar = split(//,$ucrypt); for($i=0; $i<($clen); $i++) { $val = ord($uchar[$i]); if($val>127) { $SKIP=1; last; } }

    can be done much cleaner.

    while ($ucrypt =~ /(.)/g) { if (ord($1)>127) { $SKIP=1; last; } }

    But it's probably faster with a regexp.

    $SKIP = $ucrypt =~ /[\x80-\xFF]/;

    And then there's the issue of checking the same thing ($val > 127 and $SKIP == 0) in two different locations.

    my $plaintext; my $ct; my $key = 'A'; while ( 1 ) { $plaintext = RC4($key, $ciphertext); if ( $plaintext !~ /[\x80-\xFF]/ ) { print "matched using key:$key \n"; print "$plaintext \n"; } if ( ++$ct == 10000 ) { $dt = localtime; print "[$dt] count: $ct key: $key \n"; $ct = 0; } $key++; }

    I don't expect any of these changes to have a significant on the speed. (Explanation follows.)

    I would appreciate any suggestions and advice and would be really happy to prove my code is the problem and not perl.

    The fact that a scalar can hold a number of data types is a great strength, but it has some effect on performance. Unfortunately, Crypt::RC4 does heavy number crunching, which means a lot of operations and a lot of variable usage, which means a lot of overhead. Another performance cost is the conversion from the C array of char you pass to RC4 to a Perl array of integers, and back to a C array of char.

    Unless PDL is used, heavy number crunching is not where Perl will be at its best. That's why most of the Crypt:: modules are actually written in C. Crypt::RC4 is Perl all the way.

    He advises that this is a clear example of compiled code over interpreted code.

    Perl is compiled, just not into machine code. Even if it was, it wouldn't be any faster. It's not "interpreted vs compiled" that affects the speed, it's the extra baggage needed to support Perl's features.

      Every attempt to use 'last' in the loop failed to achieve my goal of dumping out of the current iteration and going onto the next. So, I went back to an old shell script method that worked. I am evaluating your suggestion in the while loop to see if this would have worked using my method in the for loop.

      I used the for loop to evaluate one character at a time from the decrypted message to I can bomb out sooner and not have to evaluate the entire message. Since I have not yet groc'd your use of the regex, I am not sure it does the same thing though it looks like it.

      Thanks

        Every attempt to use 'last' in the loop failed to achieve my goal of dumping out of the current iteration and going onto the next.

        Really? I don't see why last wouldn't work in the code you posted.

Re: RC4 cipher performance
by zentara (Cardinal) on Nov 15, 2008 at 19:05 UTC
    You might be interested in perl encryptions keys vs. c. C will always be faster than Perl, in computations, because of the extra baggage Perl places on scalars. Probably Assembly will be faster than C. You might want to try an Inline-C approach.

    I'm not really a human, but I play one on earth Remember How Lucky You Are

      I thought your suggestion for Inline C was good and I used it as an excuse to try it for the first time.

      Using the openssl library that came with my linux this solution gets around 190k calcs/sec on my Celeron M 1.3Ghz and finds the key in 6 seconds! I'm guessing cygwin has a similar library installed but I'm not sure what different Config => LIBS or #include would be needed.

      In your face, C! I mean, err... thanks for the help.

        I tried to make this work under windows but failed! Can't get it to work with Active Perl or cygwin.. yet Thanks
Re: RC4 cipher performance
by jwkrahn (Abbot) on Nov 15, 2008 at 19:28 UTC

    1. You don't need a loop to do character matching, and
    2. You don't need to call an external program to get the current date.

    Try something like this:

    #!/perl/bin/perl use warnings; use strict; use Crypt::RC4; my $foo1 = pack 'H*', 'C5C444220AD6FB08972792318300CC703814513EE013AF9 +7B94FF9ACF23F9C8F0E748C04B2E18BB5A1B491BB73E23EC9A4233B9BCDE4854FD03D +CBAC4B3E9EA00F5BF7A3A119B0FF2E66C9DD96E7F4F0972959082601AA5DD202DFB05 +039CA4FDC280517E244353690C0DE1A'; my $dt = localtime; print "$dt Start\n"; my $key = 'A'; my $ct; while ( 1 ) { my $ucrypt = RC4( $key, $foo1 ); if ( ++$ct == 10_000 ) { $dt = localtime; print "$dt count $ct key: $key\n"; $ct = 0; } if ( $ucrypt !~ /[\x80-\xFF]/ ) { print "matched using key:$key\n$ucrypt\n"; } $key++; }
      I realize I did not need to shell for date. I know that I should have used localtime. I was an artifact of the old program I started with.

      I am evaluating you method of using a regex to look at the decrypted string. I used a loop to bomb out as soon as I found a character with the MSB bit set. I am not sure yet if this regex does the same. -- Actually, I just verified that is does work and is slightly faster by 10%.

      This code was about the same performance as mine. I have believed from the start that the RC4 mod will be the bottleneck even thou my code was a bit simple.

      Thanks.
        I used a loop to bomb out as soon as I found a character with the MSB bit set. I am not sure yet if this regex does the same.

        The regular expression matches a single character so as soon as that character is found it will stop searching.

Re: RC4 cipher performance
by Corion (Patriarch) on Nov 15, 2008 at 19:29 UTC

    Not minding that Perl just isn't suited to bit-fiddling tasks like this, I wanted to see the CPU-meter of my new machine go to 100%. I implemented a threaded version of your program, which runs about 3 to 4 times as fast on my 4-CPU machine by processing sets of 10000 keys in every thread.

    The conversion to threaded code was fairly easy by using a queue through which all batches of keys to be tested go through. The main thread produces batches of 10000 keys and stuffs them into the queue, and goes to sleep once it finds that the queue contains more than 4 batches. Each worker thread retrieves a batch and crunches on it.

    In a real program, there would be a second queue for communicating the status/success of every thread back to the main thread.

    #!/perl/bin/perl use strict; use Crypt::RC4; use Time::HiRes; use threads; use Thread::Queue; my $max_threads = 4; my $data1="C5C444220AD6FB08972792318300CC703814513EE013AF97B94FF9ACF23 +F9C8F0E748C04B2E18BB5A1B491BB73E23EC9A4233B9BCDE4854FD03DCBAC4B3E9EA0 +0F5BF7A3A119B0FF2E66C9DD96E7F4F0972959082601AA5DD202DFB0 5039CA4FDC280517E244353690C0DE1A"; my $foo1=pack('H*',$data1); my $dt=time; print "$dt Start \n"; my $batchsize = 10000; my $key="A"; # The message queue through w my $jobs = Thread::Queue->new(); # Create the worker threads for (1..$max_threads) { async { my $ct; while ( 1 ){ my $start = time; my $items = $jobs->dequeue; my $SKIP=0; for my $key (@$items) { my $ucrypt=RC4($key,$foo1); my $clen = length($ucrypt); my @uchar = split(//,$ucrypt); for my $i (0..$#uchar) { my $val = ord($uchar[$i]); if($val>127) { $SKIP=1; last; } } if ($SKIP == 0 ){ print "matched using key:$key \n"; print "$ucrypt \n"; } }; my $end = time; print $end - $start," seconds\n"; printf "%0.2f items per second\n", @$items / ($end - $star +t); } }; } while (1) { my @batch = map { $key++ } 1 .. $batchsize; $jobs->enqueue(\@batch); my $sleep = 1; while ($jobs->pending() > $max_threads) { sleep $sleep; $sleep *= 2; }; }
      This is nice. I made some small modifications. I always wanted to use threads in perl. I have a great application and might reuse this code :)

      One mod was to print the key when the batch was finished. Since L already know the key I an tell if the code works or not. In this case, the key is never incremented and is always A. The map function does not seem to work.

      It was nice to see that each thread processes about 1666 iterations per second. It was not a good idea to shell for date. I actually know better than that. Thanks
Re: RC4 cipher performance
by eye (Chaplain) on Nov 16, 2008 at 06:55 UTC
    Rather than split $ucrypt into bytes, you could unpack it into an array of integers. This would avoid allocating length($ucrypt) strings. If you use an unsigned character template (C), you can continue to compare it to 127.

    You might get even better performance if you use a multibyte template character appropriate to your architecture (perhaps, N or Q). With an appropriate bitmask, you could use a bitwise AND (&) on each value and compare it to 0. This would test 4 or 8 bytes at once.

    Update:

    The basic idea was to replace:

    @uchar = split(//,$ucrypt);
    with:
    my @vals = unpack('C*', $ucrypt);
    The list @vals gets the values of each "ord($uchar$i)". You then test for an element that is greater than 127.

    You potentially get better performance by bundling the tests. Since I have a 32 bit architecture, I'd create a 32 bit bitmask to select the high bits of each byte, unpack 4 bytes at a time, and logically AND the result with the mask:

    my $mask = 0x80808080; ... my @vals = map { $_ & $mask } unpack('N*', $ucrypt);
    Now, the test fails if any element of @vals is not zero. The potential improvement comes from doing one logical AND and one integer comparison per 4 bytes rather than one integer comparison per 1 byte.
      Can you give me an example in my code?

      Thanks