Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options

hash collision DOS

by kschwab (Vicar)
on Jun 01, 2003 at 12:29 UTC ( #262191=perlquestion: print w/replies, xml ) Need Help??

kschwab has asked for the wisdom of the Perl Monks concerning the following question:

There's an interesting article on /. today about using hash collisions to create a denial of service. The white paper referred to in the article is a bit light on details, but I did find the premise interesting.

They specifically mention attacking Perl's hash implementation, including specific attacks for 5.6.1 and 5.8.0.

An obvious defense is to avoid putting untrusted data into a hash. Sounds easy, but associative arrays are probably already being used this way all over the place. Any ideas on workarounds and fixes to reduce the risk of being DOS'ed ?

Replies are listed 'Best First'.
Re: hash collision DOS (patch)
by tye (Sage) on Jun 02, 2003 at 04:53 UTC

    Interesting timing. Just a couple of weeks back p5p accepted my resubmission of a patch I originally submitted years ago that never went anywhere.

    So in Perl 5.8.0 and earlier (and perhaps a version of two after 5.8.0), it isn't hard at all to come up with a space of keys that will all get stuck into the same bucket when you put them in a hash. In fact, you can use 1/8th of all possible strings and end up with a hash that contains 8 buckets, 7 of which are empty, and the 8th of which contains as large of a linear linked list as you want.

    In versions of Perl that have my patch applied, the more keys you add, the harder it is to find keys that fall into the same bucket. If you want 1,000 keys to fall into the same bucket, then you have to be very picky, only using 0.1% of possible strings. So it is like picking out 1,000 keys out of 1,000,000 keys. If you want 1,000,000 keys in the same bucket, then you can only use 1/1,000,000 of strings so it is like picking out your 1M keys from a space of 1,000,000,000,000 strings.

    It looks like the slashdot article isn't even aware of this problem, however. They actually go to the work of finding keys that will get thrown into the same bucket no matter how many buckets get allocated. So they could greatly simplify their work.

                    - tye
Re: hash collision DOS
by krisahoch (Deacon) on Jun 01, 2003 at 13:44 UTC


    Please don't forget to consider the 'victim program' source

    $start=time; while (<>) { chomp; $testhash{$_}="x"; $i++; if (0==($i % 100)) { print "$i "; print (time-$start); print " "; $_ = scalar(%testhash); ($used,$total) = split '/'; print "$used $total\n"; } }
    If there is anyone out there who write programs like this, then be wary for DoS attacks from Rice university;)

    Update: Changed grammar of two sentences to sound less like a neanderthal

    Kristofer Hoch

    Si vos can lego is, vos es super erudio

Re: hash collision DOS
by crazyinsomniac (Prior) on Jun 01, 2003 at 13:07 UTC
      It's not just dumping a hash structure that causes it. Solutions would including things like limiting the total number of hash elements, or perturbing the input data in a less predictable way.

      The white paper is a bit short on details, but I'm not sure I'd characterize it as a "non-issue".

      Update:See this for more detail and example exploits.

      But doesn't CGI store fields internally as a hash?

      From CGI 2.81:

      sub param { my($self,@p) = self_or_default(@_); return $self->all_parameters unless @p; my($name,$value,@other); #~~~~~Snip~~~~~ ($name,$value,@other) = rearrange([NAME,[DEFAULT,VALUE,VALUES]],@p +); #~~~~~Snip~~~~~ return wantarray ? @{$self->{$name}} : $self->{$name}->[0]; }

      It also looks like CGI::Simple does the same:

      From CGI::Simple 0.06

      sub param { my ( $self, $param, @p ) = @_; #~~~~~Snip~~~~~ return wantarray ? @{$self->{$param}} : $self->{$param}->[0]; }

      How do I love -d? Let me count the ways...
Re: hash collision DOS
by astaines (Curate) on Jun 01, 2003 at 23:01 UTC

    As far as I can tell from the article, what they have shown is that it is possible to exploit worst case behaviour of hashing algortihms to consume resources and that this exploit can be used against the hashing algortihms in both Perl 5.8.0 and 5.6.1. (And in many other programs)

    What happpens when such an attack succeeds is that hash performance goes from O(n) to O(n^2). This matters a lot if n is several tens of thousands, but not at all if n is around 10 or 100. You can exploit this if (and only if) you can send large amounts of data to a program which then gets stored in a hash in the program.

    To fix this you just need to patch perl to use a different hashing algorithm. I leave this as an exercise for you...

    Anthony Staines

Re: hash collision DOS
by Jenda (Abbot) on Jun 02, 2003 at 10:52 UTC

    Well maybe and/or CGI::Lite could let us restrict the CGI parameters that are accepted and stored and throw away all others. Why should the $CGI object remember the parameters we are not interested in anyway?

    Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
       -- Rick Osborne

    Edit by castaway: Closed small tag in signature

      1. See the fine manual: you can already ->delete() parameters, so just grep unrequested parameters out of ->param() and dump them in the bit bucket.
      2. All webservers have a relatively tight maximum size for GET requests. (I think the default is something like 4kb for Apache.) You can set $CGI::POST_MAX for POST requests.
      Use those well and it shouldn't be possible to dump enough data on a script to slow it down significantly.

      Makeshifts last the longest.

        Calling delete would happen after the problem has already occured. I concur, if $ENV{QUERY_STRING} length bothers you, simply cut it down (same goes for POST_MAX).

        I do feel a nice addition would be a something like

        acceptOnly( thesekeys => qw[ these keys ] ); acceptOnly( thismanykeys => 44 );
        This would be trivial to add ... just a thought

        MJD says you can't just make shit up and expect the computer to know what you mean, retardo!
        I run a Win32 PPM repository for perl 5.6x+5.8x. I take requests.
        ** The Third rule of perl club is a statement of fact: pod is sexy.

        PodMaster is right. ->delete() comes too late. And even the $CGI::POST_MAX doesn't help much.

        Imagine you have a file upload script. There you need to keep the $CGI::POST_MAX rather high so they may be able to post quite a few CGI parameters. And then even the creation of the hash that uses to store data may take a lot of time. And the grep and delete would only make the issue worse.

        Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
           -- Rick Osborne

        Edit by castaway: Closed small tag in signature

Re: hash collision DOS
by tilly (Archbishop) on Jun 03, 2003 at 03:35 UTC
    It is quite easy to solve the problem permanently. Just use another fast structure, like a BTREE, and the possibility of pathological cases becomes impossible. Alternately one can use a hash, and then add logic to detect when a hash bucket has an abnormally long linked list, then switch it on the fly to a more robust structure.

    The problem is that the average performance of hashes are very, very hard to beat, and any kind of paranoid safety logic will be hit all of the time, slowing Perl down a bit on virtually every step. Odds are that nobody in the wild has actually been hurt by Perl's hashing behaviour (yet). I guarantee that people will wonder why Perl slowed down overall. When do you make the engineering decision that average case performance matters more this time than worst case performance?

      I commented on a possible 'fix' for the potential problem with perls current hashing function at Re: Hash Clash on purpose and would be interested on your assessment of the notion.

      If that isn't a fix, or cannot be implemented without undue penalty, I wonder if there isn't at least some mileage in having a new warning, along the lines of the 'Deep recursion' warning, that simply notes that a hash is exhibiting anomolous or 'worst case' behaviour at runtime. It wouldn't fix anything, but its presence in the logs could go a long way in trying to track down such behaviour down?

      Maybe a corresponding pragma that allowed the 'hash quality' threshold that would trigger the warning to be specified?

      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

        I would have to examine the hash function in some detail to tell whether or not that would work as a fix (at a minimal performance loss and the space overhead of having to store the initial value for each hashing function in the hash).

        Deep recursion checks in the list traversal logic is exactly the solution that I thought was likely to slow things down. Perl spends a lot of its time doing hash lookups, and any overhead (eg having to use up an extra register to track how deep the list went) is likely to show performance problems.

        But thinking about the issue, I am not sure that a clever person couldn't manage to add such a check in some way that didn't cause significant performance problems. I simply have not worked close enough to the metal enough to have a sense of that. If you could add the performance check, though, there is no reason that you couldn't just fall back into a more complex and robust data structure instead of a linked list. The immediate algorithmic overhead is paid for by the fact that you only use it when you really need it.

Re: hash collision DOS
by comand (Acolyte) on Jun 01, 2003 at 18:52 UTC
    How about -T?
      And that's going to help how?

      Makeshifts last the longest.

Log In?

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://262191]
Front-paged by krisahoch
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (3)
As of 2023-03-28 15:35 GMT
Find Nodes?
    Voting Booth?
    Which type of climate do you prefer to live in?

    Results (67 votes). Check out past polls.