Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?
 
PerlMonks  

Re: Re: Re: Re: hash collision DOS

by iburrell (Chaplain)
on Jun 02, 2003 at 21:32 UTC ( [id://262500]=note: print w/replies, xml ) Need Help??


in reply to Re: Re: Re: hash collision DOS
in thread hash collision DOS

It doesn't have anything to do with the Vars() method. The worst-case performance of the hash slows down insertion of elements. CGI.pm inserts parameter names as keys when parsing the query string.

Replies are listed 'Best First'.
Re: Re: Re: Re: Re: hash collision DOS
by crazyinsomniac (Prior) on Jun 02, 2003 at 23:22 UTC
    Look at this
    C:\>perl -MData::Dumper -MCGI -e"die Dumper( CGI->new( { 1..10 }) ) " $VAR1 = bless( { '.charset' => 'ISO-8859-1', '1' => [ 2 ], '3' => [ 4 ], 'escape' => 1, '5' => [ 6 ], '7' => [ 8 ], '.parameters' => [ '7', '9', '1', '3', '5' ], '9' => [ 10 ], '.fieldnames' => {} }, 'CGI' ); C:\>
    Do all the keys map to one bucket? I'm pretty sure they don't.
    C:\>perl -MData::Dumper -MCGI -e"warn Dumper( $a=CGI->new );die Dumper +(scalar $a->Vars() ) " a= b= c= d= e= f= g= $VAR1 = bless( { '.charset' => 'ISO-8859-1', 'a' => [ '' ], 'escape' => 1, 'b' => [ '' ], 'c' => [ '' ], 'd' => [ '' ], 'e' => [ '' ], 'f' => [ '' ], 'g' => [ '' ], '.parameters' => [ 'a', 'b', 'c', 'd', 'e', 'f', 'g' ], '.fieldnames' => {} }, 'CGI' ); $VAR1 = { 'a' => '', 'b' => '', 'c' => '', 'd' => '', 'e' => '', 'f' => '', 'g' => '' };
    The only way all the various keys would map to one bucket would be when you construct a hash by calling Vars. Do you see what I'm saying, or did I miss something crucial.
    update: I improved the 2nd snippet. Now it shows all the various keys going to a single bucket.

    update: Why not just post the example?

     
    ______crazyinsomniac_____________________________
    Of all the things I've lost, I miss my mind the most.
    perl -e "$q=$_;map({chr unpack qq;H*;,$_}split(q;;,q*H*));print;$q/$q;"

      The keys aren't going into a single bucket above. The Vars() is just stripping out the non-parameter keys that CGI.pm stores inside of itself. The 'a', 'b' parameters are there in the object.

      The way to determine what is happening inside a hash is evaluating it in scalar context. That gives you the number of buckets being used. tilly wrote a program that uses this feature to generate a list of colliding keys. This algorithm is fast and doesn't depend on reverse engineering the Perl hash algorithm.

      I ran some tests on a 10,000 keys generated by tilly's method. Both inserting them into a hash and parsing the query string with CGI. It takes over 20 seconds to parse the query string in the pathological case versus less than a second for 10,000 normal strings. I haven't been willing to wait long enough to let 100,000 strings run. For a sample, here are the first 10 integers that collide and the scalar hash value showing they all go in one bucket.

      8 14 22 30 38 46 54 62 70 78 86 1/8

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://262500]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others cooling their heels in the Monastery: (2)
As of 2024-04-24 17:47 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found