comment on

I was recently enjoying(sic) seeing 'script kiddies' trying to scan the ports on my home box. I wrote a script to parse out the packet DENY entries from my messages log file and realized that I should sort the list of IP's to make them easier to read.

Thoughts of complex sorting started to form in my head: split up the quad's and start comparing. I started to look for some elegant ways of doing this, when I realized how much more simple (and potentially more efficient) it would be to pack each IP into it's hex form. "This has to be a much, much faster way", I thought to myself.

So I did a search at Google and found this page written by Uri Guttman and Larry Rosler. They discuss this very thing - giving one example that sorts by breaking up the IP bytes and comparing them, and another example that packs the IP's into hex.

Here are their two examples, slightly paraphrased:

#naive: O(N*logN)
my @sorted = sort {

    my @a = $a =~ /(\d+)\.(\d+)\.(\d+)\.(\d+)/;
    my @b = $b =~ /(\d+)\.(\d+)\.(\d+)\.(\d+)/;
    
    $a[0] <=> $b[0] || 
    $a[1] <=> $b[1] || 
    $a[2] <=> $b[2] || 
    $a[3] <=> $b[3]

} @unsorted;

#packed: O(N*logN)
my @sorted = sort {

    pack('C4' => $a =~ /(\d+)\.(\d+)\.(\d+)\.(\d+)/)
    cmp
    pack('C4' => $b =~ /(\d+)\.(\d+)\.(\d+)\.(\d+)/)

} @unsorted;
[download]

What gets me is the fact that they have the same Big(O), but when benchmarked:

Benchmark: timing 5000 iterations of naive, packed...
     naive: 11 wallclock secs ( 9.91 usr +  0.32 sys = 10.23 CPU)
    packed:  8 wallclock secs ( 7.95 usr +  0.08 sys =  8.03 CPU)
[download]

there is an obvious difference.

To be honest, I really hated Big(O)analysis in college, and I think this is the very reason why. There is an obvious amount of better efficiency in the packed version, but in a constant way - this of course is eliminated when doing Big(O), making both algorithms 'equal'.

I am posting this mainly because I couldn't find any good IP sorting algorithms on PM, but I was also wondering how other Monks out there feel about the Big(O).

Jeff Anderson
tyring to be a better programmer every day . . .

UPDATE: Fri Aug 4 09:31:30 CDT 2000
Thanks to everyone for helping me understand the error in my ways - this whole time I was trying to compare apples to oranges, no wonder I had such a hard time understanding O(n). Big thanks to gryng for such a wonderful and heart-felt explaination. You rule.

In reply to Sorting a list of IP addresses (aka Why I hate Big O) by jeffa

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


more useful options
	PerlMonks