in reply to Re: Would Like Recommendation for an SHA256 module
in thread Would Like Recommendation for an SHA256 module

You might be interested in the technique used in my challenge

Unfortunately, it seems that satisfying the terms of your conditions might be about as difficult as finding a collision. Here's the full code I ran (based on 2 different text strings supplied earlier by ikegami):
use warnings;
use strict;
use Digest::MD5 qw( md5_hex );

my $text1 = "\xA6\x64\xEA\xB8\x89\x04\xC2\xAC" .
            "\x48\x43\x41\x0E\x0A\x63\x42\x54" .
            "\x16\x60\x6C\x81\x44\x2D\xD6\x8D" .
            "\x40\x04\x58\x3E\xB8\xFB\x7F\x89" .
            "\x55\xAD\x34\x06\x09\xF4\xB3\x02" .
            "\x83\xE4\x88\x83\x25\x71\x41\x5A" .
            "\x08\x51\x25\xE8\xF7\xCD\xC9\x9F" .
            "\xD9\x1D\xBD\xF2\x80\x37\x3C\x5B" .
            "\x97\x9E\xBD\xB4\x0E\x2A\x6E\x17" .
            "\xA6\x23\x57\x24\xD1\xDF\x41\xB4" .
            "\x46\x73\xF9\x96\xF1\x62\x4A\xDD" .
            "\x10\x29\x31\x67\xD0\x09\xB1\x8F" .
            "\x75\xA7\x7F\x79\x30\xD9\x5C\xEB" .
            "\x02\xE8\xAD\xBA\x7A\xC8\x55\x5C" .
            "\xED\x74\xCA\xDD\x5F\xC9\x93\x6D" .
            "\xB1\x9B\x4A\xD8\x35\xCC\x67\xE3";

my $text2 = "\xA6\x64\xEA\xB8\x89\x04\xC2\xAC" .
            "\x48\x43\x41\x0E\x0A\x63\x42\x54" .
            "\x16\x60\x6C\x01\x44\x2D\xD6\x8D" .
            "\x40\x04\x58\x3E\xB8\xFB\x7F\x89" .
            "\x55\xAD\x34\x06\x09\xF4\xB3\x02" .
            "\x83\xE4\x88\x83\x25\xF1\x41\x5A" .
            "\x08\x51\x25\xE8\xF7\xCD\xC9\x9F" .
            "\xD9\x1D\xBD\x72\x80\x37\x3C\x5B" .
            "\x97\x9E\xBD\xB4\x0E\x2A\x6E\x17" .
            "\xA6\x23\x57\x24\xD1\xDF\x41\xB4" .
            "\x46\x73\xF9\x16\xF1\x62\x4A\xDD" .
            "\x10\x29\x31\x67\xD0\x09\xB1\x8F" .
            "\x75\xA7\x7F\x79\x30\xD9\x5C\xEB" .
            "\x02\xE8\xAD\xBA\x7A\x48\x55\x5C" .
            "\xED\x74\xCA\xDD\x5F\xC9\x93\x6D" .
            "\xB1\x9B\x4A\x58\x35\xCC\x67\xE3";

if ($text1 ne $text2) { print "Texts are different\n" }
else                  { print "BOZO ... the texts are the same\n" }

print length($text1), " ", length($text2), " ", md5_hex($text1), "\n";

if (md5_hex($text1) eq md5_hex($text2)) { print "The 2 hashes are the same\n" }
else                                    { print "BOZO ... the hashes are different\n" }

$text1 .= ' ' . md5_hex($text1);
$text2 .= ' ' . md5_hex($text2);

print length($text1), " ", length($text2), " ", md5_hex($text1), "\n";

if ($text1 ne $text2) { print "Texts are still different\n" }
else                  { print "BOZO ... the texts are the same\n" }

if (md5_hex($text1) eq md5_hex($text2)) { print "The 2 hashes are still the same\n" }
else                                    { print "BOZO ... the hashes are different\n" }
$text1 and $text2 are different but have the same hash (let's call it $hash). The hash of $text1 . $hash is still the same as the hash of $text2 . $hash.
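
A minimal sketch of why appending the same string to both texts need not disturb the match. It relies only on the documented Digest::MD5 object interface (add, clone, hexdigest). The helper name digest_via_saved_state and the sample 128-byte text are made up for illustration.

```perl
use strict;
use warnings;
use Digest::MD5 qw( md5_hex );

# Hypothetical helper: hash $prefix . $suffix by first absorbing $prefix,
# taking a snapshot of the context, and resuming from the snapshot.
sub digest_via_saved_state {
    my ($prefix, $suffix) = @_;
    my $ctx = Digest::MD5->new;
    $ctx->add($prefix);
    my $resumed = $ctx->clone;    # copy of the state left by $prefix
    $resumed->add($suffix);
    return $resumed->hexdigest;
}

my $prefix = 'x' x 128;                  # illustrative text, two full 64-byte blocks
my $suffix = ' ' . md5_hex($prefix);     # same kind of suffix as in the code above

print digest_via_saved_state($prefix, $suffix) eq md5_hex($prefix . $suffix)
    ? "resumed digest matches one-shot digest\n"
    : "digests differ\n";
```

The digest of $prefix . $suffix is reached purely from the state that $prefix leaves behind. So if two 128-byte texts leave the context in the same state, I would expect them to stay matched whatever common suffix follows.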

I note that the original strings in the code above have a length of 128 - and perhaps that's critical to the existence of such a solution. I suspect that the same solution does not apply to the particular phrase that you chose for your challenge, because its length is not 128 (or a multiple thereof). But if you'd like to generalise your challenge a little, I reckon I could submit the above code and get a free lunch.

Cheers,
Rob

Re^3: Would Like Recommendation for an SHA256 module
by BrowserUk (Patriarch) on Aug 02, 2006 at 13:17 UTC

    The possibility of finding two lumps of random garbage, even of equal length, that share a digest has *always* been a given with any mechanism that maps a larger range of possible inputs onto a smaller range of possible outputs.

    With an input space of 128^256 = 2.790951116e539

    And an output space of 2^128 = 3.4e38, it could not be otherwise. There have to be at least 8 inputs for every output.
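
    The two figures above can be checked exactly with Math::BigInt (a core module). This is only a sketch and the variable names are mine. It also prints the average number of inputs per output for this input space, which comes out far above the lower bound of 8.

```perl
use strict;
use warnings;
use Math::BigInt;

# Input space as written above: 128^256.
my $inputs  = Math::BigInt->new(128)->bpow(256);

# Output space of a 128-bit digest: 2^128.
my $outputs = Math::BigInt->new(2)->bpow(128);

# Average number of inputs per output. The division is exact because
# both numbers are powers of two (scalar context returns the quotient).
my $ratio = $inputs->copy->bdiv($outputs);

printf "inputs  has %d decimal digits\n", length( $inputs->bstr );
printf "outputs has %d decimal digits\n", length( $outputs->bstr );

# as_bin gives "0b1" followed by the zeros of a power of two,
# so the exponent is the string length minus 3.
printf "inputs per output, on average: 2^%d\n", length( $ratio->as_bin ) - 3;
```

    The pigeonhole conclusion holds either way: with more inputs than outputs, collisions must exist.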

    Maybe you missed the emphasis I placed in my post?

    The effect is to considerably increase the difficulty of finding an alternative text that matches both the outer and inner md5 and renders a useful (to the bad guy), alternative text.

    To generalise my challenge in the way you suggest would be to ignore the point I was trying to make, and that ikegami partially made subsequent to his first post--

    • It's one thing to brute force two texts with the same md5.
    • It is an entirely different scale of problem to find a second text that has the same md5 as some existing text.
    • It is much harder still to generate a second text that says something meaningful and useful, and that has the same md5 as an original text (derived from a 3rd party source).
    • And finally, it is harder again to generate a second text that is meaningful and useful to your nefarious purpose, and that still has the same md5 as the original 3rd party text when its own md5_hex digest is concatenated to it.
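
    The last requirement can be sketched as a check. The exact form of the challenge is an assumption here: I take the "outer" digest to be the md5 of the text with its own md5_hex digest appended directly. The names outer_digest and meets_challenge are hypothetical.

```perl
use strict;
use warnings;
use Digest::MD5 qw( md5_hex );

# Assumed form of the "outer" digest: md5 of the text followed by its
# own md5_hex digest, with no separator.
sub outer_digest {
    my ($text) = @_;
    return md5_hex( $text . md5_hex($text) );
}

# Hypothetical check: a candidate only counts if it differs from the
# original yet matches both the inner and the outer digest.
sub meets_challenge {
    my ($original, $candidate) = @_;
    return 0 if $candidate eq $original;
    return 0 if md5_hex($candidate) ne md5_hex($original);
    return outer_digest($candidate) eq outer_digest($original) ? 1 : 0;
}

my $original = 'the quick brown fox jumps over the lazy dog';
print meets_challenge( $original, $original . '!' ) ? "pass\n" : "fail\n";
```

    Whether the candidate is also meaningful and useful is not something code like this can decide.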

    Each of those requirements has a multiplier effect upon the difficulty of the task at hand for the bad guy. It is this same multiplier effect that stuff like double-DES and triple-DES exploit for their greater security.

    So sorry, but you're gonna have to work a little for that free lunch :)


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      It's one thing to brute force two texts with the same md5

      Indeed it is - and there's a Win32 executable at http://cryptography.hyperlink.cz/2006/program_v1_pd.zip that creates collisions in approximately 30 seconds - though I doubt it does this by using brute force.

      I see the shortcoming of your challenge as follows:
      You allow padding of the string - so, instead of having:
      $text = 'the quick brown fox jumps over the lazy dog';
      let's have it so that:
      $text1 = 'the quick brown fox jumps over the lazy dog' . ' ' x 85;
      $text1 now has a length of 128 bytes, and a hex digest (let's call it $hash) of 8ba2a86e374afd2aefc8e5378f9149a2. I can now get a free lunch if I can find another 128-byte string (let's call it $text2) that hashes to 8ba2a86e374afd2aefc8e5378f9149a2. That's not straightforward (for me, anyway), and even less straightforward if $text2 has to be "meaningful" - but the thing is that your requirement of having to hash $text1 . $hash has not made the task any more difficult.

      If both $text2 and $text1 hash to the same value (ie, to $hash), then $text1.$hash and $text2.$hash both hash to the same value. And the code I posted demonstrates that. If the string is 128 bytes long I don't believe the "multiplier effect" of which you speak exists. (If the string is, say, 119 bytes long, then there quite possibly is a "multiplier effect".)
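
      A rough sketch of why 128 and 119 bytes might behave differently, assuming the standard MD5 padding rule (one 0x80 byte, zero fill, then an 8-byte length, rounded up to whole 64-byte blocks). The sub names are mine.

```perl
use strict;
use warnings;

# Bytes actually fed to the compression function for a message of $len
# bytes: the message, one 0x80 byte, zero fill, and an 8-byte length,
# rounded up to a whole number of 64-byte blocks.
sub md5_padded_length {
    my ($len) = @_;
    return ( int( ( $len + 8 ) / 64 ) + 1 ) * 64;
}

# True when the message ends exactly on a block boundary, so that the
# padding (or any appended text) starts a block of its own.
sub ends_on_block_boundary {
    my ($len) = @_;
    return $len % 64 == 0 ? 1 : 0;
}

for my $len ( 119, 128 ) {
    printf "%3d bytes -> %3d padded, boundary: %s\n",
        $len, md5_padded_length($len),
        ends_on_block_boundary($len) ? 'yes' : 'no';
}
```

      A 128-byte text fills two blocks exactly, so anything appended starts a fresh third block. A 119-byte text stops part way through the second block, so appended text changes the contents of that block.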

      Maybe a sandwich and a cup of tea ? ... the sandwich is optional ... so is the cup of tea :-)

      Cheers,
      Rob

        Let's deal with strings of 128 8-bit chars.

        For any given 16-byte md5 hash, there are (on average) eight 128-byte strings that will render that md5 digest.

        One of these 128-byte strings is the original text.

        Therefore the task is to find one of the other seven 128-byte strings that also generates the original md5, and also happens to be meaningful for your nefarious purposes.

        My criterion states that the last 32 bytes of the 128-byte alternative text must also be hex digits.

        That reduces the possible alternative texts by 32^256 / 32^22 by virtue of the fact that the last 32 characters have to be hex digits (0-9, a-f, A-F = 22).

        Further, the fact that the remaining 96 bytes of the alternative text have to have an md5 that matches the 32 hex digits, and be meaningful for your nefarious purposes, again severely restricts the possibility that such a text exists, regardless of how hard it is for you to find it.
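
        One way to put numbers on the hex-digit restriction is to count a 32-character suffix as (alphabet size)^32 and express that in bits. This is a sketch under that counting assumption; bits_of_choice is a made-up name. The 16-symbol row is there because md5_hex returns lower-case hex.

```perl
use strict;
use warnings;

# Bits needed to index every string of $length symbols drawn from an
# alphabet of $alphabet_size symbols: length * log2(alphabet size).
sub bits_of_choice {
    my ($alphabet_size, $length) = @_;
    return $length * log($alphabet_size) / log(2);
}

printf "any byte      : %.1f bits\n", bits_of_choice( 256, 32 );
printf "0-9, a-f, A-F : %.1f bits\n", bits_of_choice(  22, 32 );
printf "0-9, a-f      : %.1f bits\n", bits_of_choice(  16, 32 );
```

        On this counting, a free 32-byte suffix carries 256 bits of choice, and one limited to lower-case hex digits carries 128 bits, the same as the size of an md5 digest.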

        The thing that is being missed is that there are very few texts of any given length that will produce a given md5.

        Even if you can use brute force to find them all, the probability that any one of them will actually be readable English (or executable code), never mind useful for your purposes, is vanishingly small.

        Ignoring the criteria of the challenge is a little like entering a 9x9 sudoku game, but offering to only complete the center 3x3 part of the grid. It simplifies things a lot.


        but the thing is that your requirement of having to hash $text1 . $hash has not made the task any more difficult.

        Sorry, but you are wrong. You'll note that in my challenge, the option to pad or truncate the message comes with the additional criterion that you correct my typo. For achieving this additional goal you get to eat at my favorite restaurant.

