bkiahg has asked for the wisdom of the Perl Monks concerning the following question:

Hello Wise Monks,

I once again come in search of your infinite knowledge. I am completely lost; this is a little beyond my expertise in encryption. I am trying to rewrite a small C# encryption function in Perl and am having a heck of a time with it.

Here is the C# code that I am trying to emulate:

private static string Encrypt(string s)
{
    HashAlgorithm provider = null;
    try
    {
        provider = new SHA1Managed();
        byte[] bytes = Encoding.Unicode.GetBytes(s);
        provider.Initialize();
        bytes = provider.ComputeHash(bytes);
        return Convert.ToBase64String(bytes);
    }
    finally
    {
        if (provider != null)
            provider.Clear();
    }
}

I've tried hundreds of variations on the below code trying to get the same results:

use strict;
use MIME::Base64;
use Digest::SHA1 qw/sha1 sha1_hex sha1_base64/;
use Encode qw/encode decode/;

print 'Hello World' . "\n";
print encode_base64(sha1('1234')) . "\n";
print decode("UCS-2BE", decode_base64('E59pyTwEJJao6VjsWTBmLGzMr78='));

To say the least I'm a bit lost and would love a point in the right direction or a friendly bit of advice.
Here is a trace of the data as it comes through the C# code:

// s = '1234'
private static string Encrypt(string s)
{
    HashAlgorithm provider = null;
    try
    {
        provider = new SHA1Managed();
        byte[] bytes = Encoding.Unicode.GetBytes(s);
        // bytes = 49, 0, 50, 0, 51, 0, 52, 0
        provider.Initialize();
        bytes = provider.ComputeHash(bytes);
        // bytes = 19, 159, 105, 201, 60, 4, 36, 150, 168, 233,
        //         88, 236, 89, 48, 102, 44, 108, 204, 175, 191
        return Convert.ToBase64String(bytes);
        // returns E59pyTwEJJao6VjsWTBmLGzMr78=
    }
    finally
    {
        if (provider != null)
            provider.Clear();
    }
}

Thank you in advance and if I forgot to add any crucial piece of knowledge that would be helpful, let me know and I'll happily provide anything that would help this bit of a puzzle I have. Really am stumped this time...

UPDATE: Realized that, after all my hacking at this to make it work, my example didn't make a lot of sense. Updated to remove silliness.

Re: Rewriting a C# Encryption Function
by jethro (Monsignor) on Aug 22, 2008 at 21:05 UTC
    my $data= pack('cccccccc',49,0,50,0,51,0,52,0);
    print sha1_hex($data);
    print sha1_base64($data);

    prints out 139f69c93c042496a8e958ec5930662c6cccafbf, which is the hex form of your result and E59pyTwEJJao6VjsWTBmLGzMr78
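Note that the Base64 string above lacks the trailing '=' that the C# output carries: sha1_base64 does not pad its result. A minimal sketch of adding the padding by hand follows. It assumes the core Digest::SHA module as a stand-in for Digest::SHA1 (both export sha1_base64), so treat it as an illustration rather than the thread's exact setup.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_base64);   # assumption: core stand-in for Digest::SHA1

# The eight UTF-16LE bytes of '1234', as in the C# trace.
my $data = pack('C*', 49, 0, 50, 0, 51, 0, 52, 0);

# sha1_base64 returns unpadded Base64, so pad to a multiple of 4 characters.
my $b64 = sha1_base64($data);
$b64 .= '=' while length($b64) % 4;

print "$b64\n";   # prints E59pyTwEJJao6VjsWTBmLGzMr78=
```

With the padding in place the string should compare equal to what Convert.ToBase64String returns.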

    So your C# routine converts your string to Unicode (UTF-16), rather pointlessly IMHO, and then hashes it. The pack method above has obvious limits: as soon as a char's byte value differs from its code point on page 0 of the Unicode charset (or the char lies outside page 0 altogether), your hash will be different again.

    Do you need the new hash to conform to the old one, or was it just the correctness of the code that was important?

    UPDATE: http://perldoc.perl.org/Encode.html might provide methods to get at the underlying bytes of the unicode

      Getting closer to this. I think the problem I'm facing right now is in the Encoding to Unicode:
      use strict;
      use MIME::Base64;
      use Digest::SHA1 qw/sha1 sha1_hex sha1_base64/;
      use Encode qw/encode decode/;

      my $data = encode("UTF-16", 1234);
      $data = sha1($data);
      print encode_base64($data) . "\n";

        I note that the bytes which the C# is taking the SHA1 of are little-endian UTF-16:

        // s = '1234'
        ....
        byte[] bytes = Encoding.Unicode.GetBytes(s); // bytes = 49, 0, 50, 0, 51, 0, 52, 0
        so you're probably best off:
        my $data = encode("UTF-16LE", .....);
        On my x86 (little-endian) machine
        my $data = encode("UTF-16", '1234');
        gives $data bytes:
        0xFE 0xFF 0x00 0x31 0x00 0x32 0x00 0x33 0x00 0x34
        which is big-endian complete with BOM. Whereas:
        my $data = encode("UTF-16LE", '1234');
        gives $data bytes:
        0x31 0x00 0x32 0x00 0x33 0x00 0x34 0x00
        which is little-endian sans BOM; which appears to be what's required.
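Putting those pieces together, a minimal sketch of the whole C# routine in Perl might look like the following. The helper name csharp_sha1_base64 is hypothetical, and the core Digest::SHA module is assumed here in place of Digest::SHA1 (both export sha1).

```perl
use strict;
use warnings;
use Encode qw(encode);
use Digest::SHA qw(sha1);            # assumption: core stand-in for Digest::SHA1
use MIME::Base64 qw(encode_base64);

# Hypothetical helper mirroring the C# Encrypt():
# UTF-16LE bytes -> SHA-1 digest -> padded Base64.
sub csharp_sha1_base64 {
    my ($text) = @_;
    my $bytes = encode('UTF-16LE', $text);    # little-endian, no BOM
    return encode_base64(sha1($bytes), '');   # '' suppresses the trailing newline
}

print csharp_sha1_base64('1234'), "\n";   # prints E59pyTwEJJao6VjsWTBmLGzMr78=
```

encode_base64 pads its output with '=', so the result should match Convert.ToBase64String without any further fix-up.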

      The hash has to match the old one, i.e. if the data submitted is 1234 then the encrypted data/base64 needs to equal E59pyTwEJJao6VjsWTBmLGzMr78. How it gets there isn't as important.

      The Perl function doesn't have to convert the data to Unicode and then hash it; if it is simpler to skip certain parts, then I'm more than happy to do that. It's the end result that has to match.

        Sadly, you can skip the Unicode madness only if you know that the strings you have to convert all come from a very limited set of chars (a-z, A-Z, 0-9 and a few special chars). In that case the pack method alone would be enough.

        But the link I provided seems to have the solution on a platter:

        my $octets= encode("utf16", '1234');
        $octets= substr($octets,2);
        for my $i (0..(length($octets)/2-1)) {
            my $n= substr($octets,$i*2,1);
            substr($octets,$i*2,1)= substr($octets,$i*2+1,1);
            substr($octets,$i*2+1,1)=$n;
        }
        print sha1_base64($octets);

        prints out 'E59py...'. The substr drops the 2-byte BOM, and the really ugly for loop swaps the two bytes of each 16-bit value, since the encoding came out big-endian while the C# code hashes little-endian bytes. Encode documents that plain UTF-16 is always encoded big-endian with a BOM, so the swap should be needed on any machine, not just mine.
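For what it's worth, the byte swap in that loop can also be written with pack and unpack. The sketch below swaps the byte order that way and compares the result with encoding straight to UTF-16LE; the variable names are only for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Plain UTF-16 from Encode: 2-byte BOM followed by big-endian code units.
my $octets = encode('UTF-16', '1234');
$octets = substr($octets, 2);                      # drop the BOM

# Read big-endian 16-bit values ('n') and write them back little-endian ('v').
my $swapped = pack('v*', unpack('n*', $octets));

# Encoding directly as UTF-16LE should give the same bytes without any swapping.
my $direct = encode('UTF-16LE', '1234');

print $swapped eq $direct ? "same bytes\n" : "different bytes\n";   # prints same bytes
```

If the two byte strings agree, the BOM stripping and the swap loop can be dropped in favour of the single UTF-16LE call.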