Re^3: *Fastest* way to print a hash sorted by value (less Perl)

You might try using external programs that are optimized to deal with such situations efficiently. I wish standard 'sort' knew how to include a count when doing 'sort -u' (like 'uniq -c' does). But it might be worth trying to have your Perl script just extract the IP addresses and output lines like "10.0.0.10 192.168.0.10\n" via a pipe to "sort | uniq -c | sort -rn | head -n $N".
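A minimal sketch of that pipeline, wrapped in a shell function so the pieces are easy to see. The function name `top_pairs` and the sample addresses are made up for illustration; the only assumption about the Perl side is that it prints one "srcip dstip" pair per line on stdout.

```shell
# top_pairs N: read "srcip dstip" lines on stdin and print the N most
# frequent pairs, each prefixed with its count (hypothetical helper name).
top_pairs() {
    sort |              # group identical lines together
        uniq -c |       # collapse each group to "count line"
        sort -rn |      # order by count, largest first
        head -n "$1"    # keep only the top N
}

# Sample input standing in for the Perl script's output.
printf '%s\n' \
    '10.0.0.10 192.168.0.10' \
    '10.0.0.11 192.168.0.10' \
    '10.0.0.10 192.168.0.10' | top_pairs 1
```

In real use the printf would be replaced by the extracting script, something like `perl extract.pl firewall.log | top_pairs 10`, where `extract.pl` is a placeholder name. The hash never has to fit in Perl's memory this way, since sort(1) spills to temporary files when it needs to.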

- tye


Replies are listed 'Best First'.

Re: Re^3: *Fastest* way to print a hash sorted by value (less Perl)
by zengargoyle (Deacon) on Aug 09, 2003 at 03:08 UTC

it may be overkill or underkill, but if the statistics you need are contained in the tcp/udp header (srcip, dstip, srcport, dstport, length, flags, ...), say for counting the number of connections/packets/data/protocol between hosts, then use flow-tools and Cflow.

parse the firewall logs with Perl, using the Cflow module to convert them to netflow format; it holds the data you find in the IP headers and some more. then use the flow-tools to process them. there is a learning curve, but it's worth it.

take the simplest Perl version of, say, counting the number of packets and bytes transferred between a given pair of hosts, over enough time/data that the script takes 1 minute to run. the equivalent flow-tools command will do the same in 2 seconds. it does nice text-based reports, scan detection, powerful filtering, and there's excellent support on the mailing list.
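for reference, here is a rough sketch of the kind of per-pair packet and byte count being described. it is plain awk, not flow-tools, and the input layout ("srcip dstip packets bytes", one record per line) and the name `pair_totals` are assumptions made up for the example.

```shell
# pair_totals: read "srcip dstip packets bytes" lines on stdin (assumed
# format) and print "srcip dstip total_packets total_bytes" for each
# pair, largest byte total first.
pair_totals() {
    awk '{ pk[$1 " " $2] += $3; by[$1 " " $2] += $4 }
         END { for (k in pk) print k, pk[k], by[k] }' |
        sort -k4,4nr
}

# Sample records standing in for parsed firewall log entries.
printf '%s\n' \
    '10.0.0.10 192.168.0.10 3 1500' \
    '10.0.0.11 192.168.0.10 1 40' \
    '10.0.0.10 192.168.0.10 2 600' | pair_totals
```

this is the baseline sort of job that gets slow as the logs grow; the point above is that flow-tools does the same aggregation on netflow data much faster.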

if you're counting based on the data in the packet, like specific intrusion attempt types, then flow-tools won't do much good. if you have lots of header-type data, flow-tools rocks.
