efficiency & style

djw has asked for the wisdom of the Perl Monks concerning the following question:

I have a question about a working piece of code. I have used a bunch of foreach and while statements to get the result I need but it just looks ugly and I'm sure there is a better way of doing it.
Any tips on efficiency and/or style would be great.

#!/usr/bin/perl -w
use strict;

my(@tcpstatus,@tcpalive,$date,$line);
my($ftp,$www,$ssh,$irc);

$date = `date +"%l:%M:%S"`;
$ftp = 0;
$www = 0;
$ssh = 0;
$irc = 0;
@tcpalive = ();

############################
# get current tcp status   #
############################

@tcpstatus = `cat /proc/net/tcp`;


############################
# grab only the ones that  #
# are marked "Active" (01) #
# and put them into new    #
# array                    #
############################

foreach (@tcpstatus) {
    while (/\b01\s/) {
        push(@tcpalive,$_);
        last;
    }
}

############################
# ugly. I know there has   #
# got to be a better way   #
# to get this done.  Step  #
# through new array and    #
# add how many times each  #
# service is found         #
############################


foreach (@tcpalive) {
    while (/:0050\s/) {
        $www++;
        last;
    }
}

foreach (@tcpalive) {
        while (/:0016\s/) {
                $ssh++;
                last;
        }
}

foreach (@tcpalive) {
        while (/:1A0B\s/) {
                $irc++;
                last;
        }
}

foreach (@tcpalive) {
        while (/:0015\s/) {
                $ftp++;
                last;
        }
}


############################
# print this out to a file #
# used for server side     #
# includes                 #
############################

open (FILE, "+>/var/www/web/main.txt");
print FILE "There are $www web, $ftp ftp, $ssh ssh, and $irc irc conne
+ctions to this server
as of $date\n";
close FILE;
[download]

Thanks,
djw

Comment on efficiency & style Download Code

Replies are listed 'Best First'.
Re: efficiency & style by tye (Sage) on Sep 23, 2000 at 11:21 UTC
#!/usr/bin/perl -w use strict; $date = `date +"%l:%M:%S"`; # I'd use sprintf and localtime my %service= ( www => qr/:0050\s/, ssh => qr/:0016\s/, irc => qr/:1A0B\s/, ftp => qr/:0015\s/, ); my %count= qw( www 0 ssh 0 irc 0 ftp 0 ); foreach my $line ( grep /\b01\s/, `cat /proc/net/tcp` ) { foreach my $svc ( keys %service ) { $count{$svc}++ if $line =~ $service{$svc}; } } open (FILE, "+>/var/www/web/main.txt") or die "Can't write to /var/www/web/main.txt: $!\n"; print FILE "There are $count{www} web, $count{ftp} ftp, ", "$count{ssh} ssh, and $count{irc} irc connections ", "to this server as of $date\n"; close FILE; [download] - tye (but my friends call me "Tye")	[reply] [d/l]
RE (tilly) 2: efficiency & style by tilly (Archbishop) on Sep 23, 2000 at 12:04 UTC
When seeking to match a series of strings like this there is no reason not to push the loop down to the RE engine. First you: `my %service = qw( 0050 www 0016 ssh 1A0B irc 0015 ftp ); my $match_lst = join "\|", keys %service; my $match = qr/:($match_lst)\s/;` [download] and then you change the loop to: foreach my $line (grep /\b01\s/, `cat /proc/net/tcp` ) { ++$count{$service{$1}} while $line =~ /$match/g; } [download] If you want you can speed it up even more by optimizing the RE as I did in RE (tilly) 4: SAS log scanner.	[reply] [d/l] [select]
RE: RE (tilly) 2: efficiency & style by dchetlin (Friar) on Sep 23, 2000 at 14:13 UTC
Hmm. I disbelieve that using alternation is as efficient as looping over a list of patterns. I believe the following benchmark backs me up: `tilly gives: 1600 chetlin gives: 1600 Benchmark: running chetlin, tilly, each for at least 5 CPU seconds... chetlin: 9 wallclock secs ( 5.52 usr + 0.00 sys = 5.52 CPU) @ 33 +3.70/s (n=1842) tilly: 10 wallclock secs ( 5.09 usr + 0.00 sys = 5.09 CPU) @ 10 +4.52/s (n=532)` [download] Here's the code for it; do feel free to slap me around if I made a thinko: my @patterns=qw/foo bar baz blarch/; my $tilly=qr/(@{[join "\|",@patterns]})/; my @chetlin=map qr/$_/,@patterns; my $target="foo baz blarcy foo blarch"x400; sub tilly { my $count; $count++ while ($target =~ /$tilly/g); print STDERR "tilly gives: $count\n" if ((caller)[1]!~/eval/); } sub chetlin { my $count; for (@chetlin) {$count++ while ($target =~ /$_/g) } print STDERR "chetlin gives: $count\n" if ((caller)[1]!~/eval/); } tilly(); chetlin(); use Benchmark; timethese(-5, { tilly => \&tilly, chetlin => \&chetlin, }); [download] In general, my credo is to avoid alternation at all costs. I would be interested in seeing what a benchmark of your optimized alternation (ref. the pointer you gave above) would give. -dlc	[reply] [d/l] [select]
RE: RE: RE (tilly) 2: efficiency & style by tye (Sage) on Sep 23, 2000 at 18:52 UTC
RE (tilly) 4: efficiency & style by tilly (Archbishop) on Sep 23, 2000 at 20:26 UTC
RE: Re: efficiency & style by djw (Vicar) on Sep 24, 2000 at 19:04 UTC
Now see, I knew there was a better way. I have seen the => thing before and I can see how its being used here, but I don't really know what that is :) I really like how you used the hash to create the service keys at 0 value....slick. I will have to figure out that foreach loop on my own. I'll have to use another example and see if I can understand how its being used. I should have been using the die statement in the first place. I know better than that. I really appreciate the input. Thanks, djw	[reply]
Stolen thunder by dchetlin (Friar) on Sep 23, 2000 at 11:45 UTC
I'm a little frightened by the eerie similarity between the code posted by tye above and what I worked up after I saw the original post. I post mine here anyways because I don't want to waste it. I was even planning a comment about sprintf/localtime! The only difference is the way we did the loop. #!/usr/bin/perl -w use strict; my %services = (www => qr/:0050\s/, ssh => qr/:0016\s/, irc => qr/:1A0B\s/, ftp => qr/:0015\s/, ); my $date = `date +"%l:%M:%S"`; my @tcpalive = grep /\b01\s/,`cat /proc/net/tcp`; while (my($key,$REx) = each %services) { $services{$key} = grep /$REx/,@tcpalive } open (FILE, "+>/var/www/web/main.txt") or die $!; print FILE "There are $services{www} web, $services{ftp} ftp, ", "$services{ssh} ssh, and $services{irc} irc connections ", "to this server as of $date\n"; close FILE or die $!; [download] In general, `grep' is your friend. What you had was not at all bad, though, just a little un-Perlish. Seeing -w and strict warms my heart :-)	[reply] [d/l]
Re (tilly) 1: efficiency & style by tilly (Archbishop) on Sep 23, 2000 at 12:33 UTC
Looking at the contents of `/proc/net/tcp` it appears to be a formatted report. Personally I would first do a sanity check that I know what the header line is (makes porting the code later easier) then just extract substrings using substr or unpack and do hash lookups for the action. Much more efficient than looping. #! /usr/bin/perl -w use strict; @tcp_status = `cat /proc/net/tcp`; unless (" st " eq substr($tcp_status[0], 33, 4) and " rem_address " eq substr($tcp_status[0],19, 15)) { die "/proc/net/tcp has an unexpected format"; } my %service = qw( 0050 www 0016 ssh 1A0B irc 0015 ftp ); my %count = map {($_, 0)} values %service; foreach my $line (@tcp_status) { if ("01" eq substr($line, 34, 2)) { my $serv_code = substr($line, 29, 4); if (exists $service{$serv_code}) { ++$count{$service{$serv_code}}; } } } # Finish as Tye did [download] (These numbers are for a 2.0 series Linux kernel.)	[reply] [d/l]
Re: efficiency & style by gregorovius (Friar) on Sep 23, 2000 at 12:03 UTC
You can substitute this: `foreach (@tcpstatus) { while (/\b01\s/) { push(@tcpalive,$_); last; } }` [download] for this: `@tcpalive = grep { /\b01\s/ } @tcpstatus;` [download]	[reply] [d/l] [select]