Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I am trying to fix a little utility I created to display the `netstat -an` info and maybe play with that info by, I don't know, getting a hostname. The script runs fine, but the output is not what I expected. Here's the code:

#! D:\perl\bin\perl
#
# CGI script to display the status of tcp/udp/raw connections on the server.
# Will give you a warning if there are any SYN'ers.

BEGIN {
    use strict;
    print "Content-Type: text/plain\n\n";
    use CGI::Carp 'fatalsToBrowser';
    use warnings;
}

&netstat('-an');

sub netstat {
    my $args = shift;
    my $line;
    my @lines = `netstat $args`;

    print "$0: running \'netstat $args\'\n\n";

    my($prot,$laddr,$lport,$eaddr,$eport,$status,$syn);

    LOOP: foreach $line (@lines) {
        $_ = $line;
        next LOOP if /^Active Connections/;
        next LOOP if /^$/;
        next LOOP if /^\s+Proto/;

        if(/^\s+(.*)\s+(.*):(.*)\s+(.*):(.*)\s+(.*)/) {
            $prot = $1;
            $laddr = $2;
            $lport = $3;
            $eaddr = $4;
            $eport = $5;
            $status = $6;
            $syn = 0;

            if($status =~ /syn/i) {
                $syn = $status;
            }

            print "\nwarning: $status! I think we're being SYN'ed\n\n" if $syn;
            print "Local: $laddr:$lport - External: $eaddr:$eport - ", $syn || $status,"\n";
        }
    }
}

All I need to do is clean it up and get rid of the spaces. Here's a sample netstat line from the script:

Local: 0.0.0.0:7              - External: 0.0.0.0:0              LISTENING -

And here's one from the netstat command:

TCP    0.0.0.0:81    *:*    LISTENING

Any pointers and that kinda stuff would be great.

Cheers

Elfyn McBratney

Edit kudra, 2002-06-08 Changed title

Replies are listed 'Best First'.
Re: problem with regex
by stefp (Vicar) on Jun 08, 2002 at 04:50 UTC
    I have a Unix box, so I can't test the modified regexp against your netstat output. I'll explain the reasoning behind the changes in case it needs further tweaking.

    First, avoid .* because it is greedy: it leads to unneeded backtracking and may match more than intended. .*? is the non-greedy equivalent. Better still, be more specific in your regexps. Here, using \S+ makes sure you capture only non-blanks.

    I have added the x modifier to your regexp so that it lines up vertically with mine. Also, I don't know the output format of your netstat, so I defensively added \s* around the colons in case there are blanks around them in the input. They may not be needed.
    Also, I used (\S+) to make clear that I expect a non-empty capture. If some capture may be empty, use (\S*) instead.

    /^\s+(.*) \s+(.*)    :   (.*) \s+(.*)    :   (.*) \s+(.*)/x
    /^\s+(\S+)\s+(\S+)\s*:\s*(\S+)\s+(\S+)\s*:\s*(\S+)\s+(\S+)/
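
    To see the difference between the two patterns without a Windows box, here is a minimal sketch that runs both against a canned line. The sample line and the sub names parse_greedy and parse_strict are made up for illustration; the real netstat column padding may differ.

```perl
use strict;
use warnings;

# Hypothetical sample line, padded the way netstat -an pads its columns.
my $line = "  TCP    192.168.101.2:1519     192.168.101.1:22       ESTABLISHED\n";

# Original pattern: every (.*) is greedy and is free to swallow blanks.
sub parse_greedy {
    my ($text) = @_;
    return $text =~ /^\s+(.*)\s+(.*):(.*)\s+(.*):(.*)\s+(.*)/;
}

# Tightened pattern: each capture can hold non-blank characters only.
sub parse_strict {
    my ($text) = @_;
    return $text =~ /^\s+(\S+)\s+(\S+)\s*:\s*(\S+)\s+(\S+)\s*:\s*(\S+)\s+(\S+)/;
}

# Print the captures in brackets so stray padding is easy to spot.
for my $parser (\&parse_greedy, \&parse_strict) {
    my @fields = $parser->($line);
    print join('|', map { "[$_]" } @fields), "\n";
}
```

    With the greedy pattern the padding should end up inside the captures and the state should slide into the remote port field, which looks like what the original script printed. The \S+ version should yield six clean fields.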

    Then there is the temptation to factorize:

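    my $qr = q|(\S+)\s*:\s*(\S+)|;

    The regexp would become: <code>/^\s+(\S+)\s+$qr\s+$qr\s+(\S+)/</code>

    This does work: capture groups inside an interpolated pattern are numbered as part of the enclosing regexp, so the address and port pairs still land in $2 through $5. The cost is that the captures are no longer visible at a glance where the regexp is used.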
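
    For what it's worth, a quick check suggests that capture groups coming from an interpolated string are numbered together with the groups of the enclosing regexp. A minimal sketch, where $pair and the sample line are hypothetical names and data chosen for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper pattern: one "address:port" pair, two capture groups.
my $pair = q|(\S+)\s*:\s*(\S+)|;

# Hypothetical sample line in the netstat -an layout.
my $line = "  TCP    192.168.101.2:1519     192.168.101.1:22       ESTABLISHED\n";

# Groups inside $pair are numbered left to right along with the outer ones,
# so a list-context match returns all six captures.
my @fields = $line =~ /^\s+(\S+)\s+$pair\s+$pair\s+(\S+)/;

print scalar(@fields), " captures: @fields\n";
```

    Whether the shorter pattern is easier to read than the spelled-out one is a matter of taste; the group numbering is the same either way.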

    -- stefp -- check out TeXmacs wiki

Re: problem with regex
by grep (Monsignor) on Jun 08, 2002 at 07:00 UTC
    /^\s+(.*)\s+(.*):(.*)\s+(.*):(.*)\s+(.*)/

    Whenever I see a regex like this I start thinking about split or unpack (if it's more fixed length).

    An example with split:
    my @foo = split ' ';    # ' ' skips the leading blanks, so $foo[0] is the protocol
    my ($laddr,$lport,$eaddr,$eport) = (split(/:/,$foo[1]), split(/:/,$foo[2]));
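
    A slightly fuller sketch of the split approach, for anyone who wants to try it against canned lines first. The sub name parse_line and the sample lines are made up, and it assumes IPv4-style address:port fields with a single colon.

```perl
use strict;
use warnings;

# Split one netstat -an line into its fields. Returns an empty list for
# headers and blank lines. parse_line is a hypothetical helper name.
sub parse_line {
    my ($line) = @_;
    my @col = split ' ', $line;            # ' ' also skips leading blanks
    return unless @col >= 3 && $col[0] =~ /^(?:TCP|UDP)$/;

    my ($prot, $local, $remote, $status) = @col;
    my ($laddr, $lport) = split /:/, $local;
    my ($eaddr, $eport) = split /:/, $remote;
    $status = '' unless defined $status;   # UDP lines carry no state
    return ($prot, $laddr, $lport, $eaddr, $eport, $status);
}

# Hypothetical sample lines: a TCP listener, a UDP socket, and the header.
for my $line ("  TCP    0.0.0.0:81    0.0.0.0:0    LISTENING\n",
              "  UDP    0.0.0.0:135   *:*\n",
              "  Proto  Local Address  Foreign Address  State\n") {
    my @f = parse_line($line) or next;
    print "Local: $f[1]:$f[2] - External: $f[3]:$f[4] - $f[5]\n";
}
```

    Checking the first column against TCP or UDP replaces the three separate "next if" tests, since header and blank lines never start with a protocol name.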


    grep
    Just me, the boy and these two monks, no questions asked.
Re: problem with regex to extract netstat info
by Util (Priest) on Jun 08, 2002 at 22:10 UTC
    Note: ++stefp, who already covered part of this.

    Regex comments:

    1. Change * to + where you require input (all but the last parameter).
    2. Change . to \S where you mean "non-whitespace".
      This vastly reduces the paths that the regex processor has to check.
    To see these in action, you can use Perl's regex debugger.
    You will want to redirect STDERR to a file (perlprogram 2>file).
    use re qw(debug);
    $_ = " TCP 192.168.101.2:1519 192.168.101.1:22 ESTABLISHED\n";
    /^\s+(.*)\s+(.*):(.*)\s+(.*):(.*)\s+(.*)/;
    35664 lines of output.
    use re qw(debug);
    $_ = " TCP 192.168.101.2:1519 192.168.101.1:22 ESTABLISHED\n";
    /^\s+(\S+)\s+(\S+):(\S+)\s+(\S+):(\S+)\s+(\S*)/;
    120 lines of output.

    Other comments:

    1. BEGIN block is not needed. Move the code into the main body of your script.
    2. When debugging non-CGI problems in a CGI program, copy the code to a non-CGI test script.
    3. Don't backwhack ' inside "".
    4. foreach $line (@lines){$_ = $line; ...} is confusing and redundant.
      Say foreach (@lines){...} instead. $_ is assigned to by default.
    5. Try #!/usr/bin/perl -w as the first line of your scripts. It will probably work (it does for all my Win32 boxes), and you will be happier when writing cross-platform code in the future.

    FWIW, here is how I would do it. Tested on Windows 2000, ActivePerl 631.

    #!/usr/bin/perl -W
    use strict;
    use warnings 'all';

    # Shorten pattern.
    # Remote IP addresses and ports can be '*'.
    my $addr = '\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}';
    my $port = '\d{1,5}';    # Ports run up to 65535

    foreach (`netstat -an`) {
        next if /^Active Connections/;
        next if /^$/;
        next if /^\s+Proto/;

        my ($prot,$laddr,$lport,$eaddr,$eport,$status) = /
            ^           # Force start of string
            \s*         # Optional leading white space.
            (TCP|UDP)   # Prot is TCP or UDP
            \s+         # Required whitespace
            ($addr)     # Local address
            :           # Separated by colon
            ($port)     # Local port
            \s+         # Required whitespace
            (\*|$addr)  # Remote address
            :           # Separated by colon
            (\*|$port)  # Remote port
            \s+         # Win2K has whitespace, even when next parm is blank
            (\w+)?      # Optional State
            \s*         # Optional trailing whitespace
            $           # Force end of string
        /xo             # 'o' to stop pattern from recompiling
            or next;    # Change to 'warn' while testing regex.

        $status = '' unless defined $status;    # UDP lines have no State
        my $syn = $status =~ /syn/i;
        print "\nwarning: $status! I think we're being SYN'ed\n\n" if $syn;
        print "Local: $laddr:$lport - External: $eaddr:$eport - $status\n";
    }