I am trying to benchmark some code to try to figure out where I need to optimize it. In the real script, I am using POE and recieving input from libbt's btget and filtering it through several regular expressions and other stuff to collect the information to do different things with it.
What I have done here is split up a single sub from the real script into different subs that contain the individual parts so I could track down what was slow and see as I make improvements. However, it seems that I can run this same test script 3 times and get 3 different results as to which sub is faster and which is slower, but they are always very close.
Am I just using Benchmark wrong, or what have I screwed up?
I have tried this on 2 different systems, one OpenBSD 3.6 with perl 5.8.5 and a Windows 2000 box with perl 5.8.0 (ActivePerl build 806) with similar results.
Benchmarks:
Benchmark: timing 1000000 iterations of i_noop, i_peers, i_sep... i_noop: 13.0096 wallclock secs (10.13 usr + 2.89 sys = 13.02 CPU) + @ 76804.92/s (n=1000000) i_peers: 13.0293 wallclock secs (10.17 usr + 2.85 sys = 13.02 CPU) + @ 76804.92/s (n=1000000) i_sep: 12.6381 wallclock secs ( 9.72 usr + 2.92 sys = 12.64 CPU) + @ 79113.92/s (n=1000000) Rate i_peers i_noop i_sep i_peers 76805/s -- -0% -3% i_noop 76805/s 0% -- -3% i_sep 79114/s 3% 3% -- Benchmark: timing 1000000 iterations of i_noop, i_peers, i_sep... i_noop: 14.0463 wallclock secs (11.13 usr + 2.91 sys = 14.04 CPU) + @ 71225.07/s (n=1000000) i_peers: 14.6951 wallclock secs (11.36 usr + 3.34 sys = 14.70 CPU) + @ 68027.21/s (n=1000000) i_sep: 15.484 wallclock secs (12.20 usr + 3.29 sys = 15.49 CPU) +@ 64557.78/s (n=1000000) Rate i_sep i_peers i_noop i_sep 64558/s -- -5% -9% i_peers 68027/s 5% -- -4% i_noop 71225/s 10% 5% -- This one is from the Windows box: Benchmark: timing 1000000 iterations of i_noop, i_peers, i_sep... i_noop: 26 wallclock secs (15.97 usr + 7.00 sys = 22.97 CPU) @ 43 +536.94/s (n=1000000) i_peers: 26 wallclock secs (16.84 usr + 6.30 sys = 23.14 CPU) @ 43 +215.21/s (n=1000000) i_sep: 23 wallclock secs (16.16 usr + 6.61 sys = 22.76 CPU) @ 43 +927.08/s (n=1000000) Rate i_peers i_noop i_sep i_peers 43215/s -- -1% -2% i_noop 43537/s 1% -- -1% i_sep 43927/s 2% 1% --
#!/usr/bin/perl use strict; use warnings; use Benchmark qw( timethese cmpthese :hireswallclock ); my $count = 1_000_000; my $result = timethese($count, { 'i_sep' => sub { my %heap; while (<DATA>) { i_sep( \%heap, +$_) } }, 'i_peers' => sub { my %heap; while (<DATA>) { i_peers(\%heap, +$_) } }, 'i_noop' => sub { my %heap; while (<DATA>) { i_noop( \%heap, +$_) } }, }); cmpthese($result); sub i_sep { my ($heap, $result) = @_; chomp $result; my $task = $heap->{task}; if ($result eq '----') { # do nothing } else { #print $result, "\n"; } } sub i_peers { my ($heap, $result) = @_; chomp $result; my $task = $heap->{task}; if ($result =~ / ^\s+(\d+) # 6 \s+([\d\.]+) # 192.168.1.156 :(\d+) # :0 \s+\((\w+)\) # (ALL) \[\w+\s+\w+ # [Cibr 1s \(\s*(\w+)\) # (122bps) \^-([\d\+]+) # ^-001+0 \] # ] \[\w+\s+\w+^? # [Cibr 1s \(\s*(\w+)\) # (122bps) _\d+ # _0 \] # ] /x ) { my $peer_id = $1; my %peer = ( Time => $task->{status}->{Time}, peer_id => $peer_id, ip => $2, port => $3, complete => $4, speed => $5, offset => $6, speed_2 => $7, offset_2 => $8, ); $task->{peers}->{$peer_id} = \%peer; } else { #print $result, "\n"; } } sub i_noop { my ($heap, $result) = @_; } __DATA__ Total good pieces 390 (100%) Total archive size 102225004 completed+00000:ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff completed+00024:ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff completed+00048:fc get http://OpenBSD.somedomain.net/announce.php?info_hash=%2b%d9%df%da% +0a%14%d4%2 f%cc%b7%0c%af%39%b5%7d%ac%64%4a%05%d1&peer_id=%33%33%c8%98%29%80%d5%70 +%93%46%75% ae%e2%3f%3b%7d%cb%78%71%4f&key=%ee%58%0f%93%9e%00%11%56&port=6881&uplo +aded=0&dow nloaded=0&left=0&event=started&compact=1 Interval 1800 3: Server ready... 5: New peer connected 192.168.1.156 Time 1115237207 4 192.168.1.156:6881 (OUT)[Cibr 1s( 20bps)^-001+0][Cibr 1s(122 +bps)_0] 5 192.168.1.156:0 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 100% (390 of 390) 1 Peers, Download 0bps Upload 0bps^M ---- Time 1115237208 4 192.168.1.156:6881 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 5 192.168.1.156:0 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 100% (390 of 390) 2 Peers, Download 0bps Upload 0bps^M ---- Time 1115237209 4 192.168.1.156:6881 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 5 192.168.1.156:0 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 100% (390 of 390) 2 Peers, Download 0bps Upload 0bps^M ---- Time 1115237210 4 192.168.1.156:6881 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 5 192.168.1.156:0 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 100% (390 of 390) 2 Peers, Download 0bps Upload 0bps^M ---- Time 1115237211 4 192.168.1.156:6881 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 5 192.168.1.156:0 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 100% (390 of 390) 2 Peers, Download 0bps Upload 0bps^M ---- Time 1115237212 4 192.168.1.156:6881 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 5 192.168.1.156:0 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 100% (390 of 390) 2 Peers, Download 0bps Upload 0bps^M ---- Time 1115237213 4 192.168.1.156:6881 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 5 192.168.1.156:0 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 100% (390 of 390) 2 Peers, Download 0bps Upload 0bps^M ---- Time 1115237214 4 192.168.1.156:6881 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 5 192.168.1.156:0 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 100% (390 of 390) 2 Peers, Download 0bps Upload 0bps^M ---- Time 1115237215 4 192.168.1.156:6881 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 5 192.168.1.156:0 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 100% (390 of 390) 2 Peers, Download 0bps Upload 0bps^M ---- Time 1115237216 4 192.168.1.156:6881 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 5 192.168.1.156:0 (ALL)[Cibr 1s(122bps)^-001+0][Cibr 1s(122 +bps)_0] 100% (390 of 390) 2 Peers, Download 0bps Upload 0bps^M ----
In the real script, the stuff between the 'Time' line and the '----' line just keeps repeating (sometimes with more peers, depending on who is connected, but that isn't really important for this discussion). With 67 torrents seeding, at 5 lines each per second, that is at least 335 lines per second, and the real script uses about 25% cpu on a celeron 700 (OpenBSD 3.6 - perl, v5.8.5 built for i386-openbsd)
Edit by tye: Changed PRE to CODE around long lines
In reply to Benchmark doesn't tell me what I want to hear by afresh1
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |