ddeyoung has asked for the wisdom of the Perl Monks concerning the following question:

This works fine...
print $name = gethostbyaddr(192.168.1.1, AF_INET);

This does not...
while (defined($line = <INPUT>)) {
    if ($line =~ /\b\d*\.\d*\.\d*\.\d*\b/) {
        $name = gethostbyaddr($&, 'tcp');
        if (defined($name)) {
            print "hostname for address $&: $name\n";
        } else {
            print "Couldn't get hostname for: $&\n";
        }
    }
}

The value of $& appears to be the correct IP address, but for some reason gethostbyaddr() isn't returning a defined value. My guess is that the value in $& isn't formatted correctly for the function to parse it.

What am I missing (besides pretty code)?

Replies are listed 'Best First'.
Re: novice falls
by kjherron (Pilgrim) on Oct 11, 2001 at 04:02 UTC
    Let's start with this:
    gethostbyaddr(192.168.1.1, AF_INET);
    The fact that this works is sort of an accident. You're "supposed" to do something more like this:
    use Socket;
    $ip = inet_aton('192.168.1.1');
    print gethostbyaddr($ip, AF_INET);
    Your example appears to work as a side effect of a new string syntax (version strings, or "v-strings") intended to support Unicode and software version numbers like "5.6.0". It does the right thing in your example, but it doesn't appear to have been designed with IP addresses in mind.

    Okay, but why does it fail in your second block of code? When perl is compiling your code and encounters 192.168.1.1 as a bare, unquoted literal, it packs the four bytes 192, 168, 1, and 1 into a four-character string. This happens to be the same thing that inet_aton() returns, and that gethostbyaddr() is expecting to receive. The conversion is part of the perl compilation process, i.e. it happens at compile time.

    In your second example, you're asking for the same thing to happen at runtime. This isn't going to work. gethostbyaddr() is expecting to receive a four-byte string containing a packed IP address; instead it gets an 11-byte string containing a literal "192.168.1.1" (or similar).
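
    To make the difference concrete, here is a small sketch (not from the original post) that compares the text form with the packed form. It only assumes the standard Socket module; the variable names are made up for illustration, and no DNS lookup takes place.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

my $text   = '192.168.1.1';       # what a regex match hands you at runtime
my $packed = inet_aton($text);    # what gethostbyaddr() wants to receive

printf "text form:   %d bytes\n", length $text;
printf "packed form: %d bytes\n", length $packed;

# The packed form is just the four octets stored as raw bytes.
my $by_hand = pack 'C4', split /\./, $text;
print "pack 'C4' produces the same bytes\n" if $by_hand eq $packed;

# inet_ntoa() converts the packed form back to dotted-quad text.
print "round trip:  ", inet_ntoa($packed), "\n";
```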

    Further, in your second example, you're using 'tcp' as the second argument to gethostbyaddr(). This isn't correct. You should supply AF_INET as you did in your first example.

    In your second example, try replacing the gethostbyaddr() line with the following:

    my $ip = inet_aton($&);
    my $name = gethostbyaddr($ip, AF_INET);
Re: novice falls
by wog (Curate) on Oct 11, 2001 at 03:41 UTC
    You have two problems with the arguments you pass to gethostbyaddr. First, you are correct that $& is not in the correct format. You must pass the address in packed format, which you can get by using Socket's inet_aton to convert it from readable dotted form to network-ordered bytes. (Perl's "version strings" appear to generate a similarly formatted string.) Second, the second argument to gethostbyaddr needs to be a valid address type, which would probably be AF_INET. You do not want 'tcp', because that is a protocol, not a type of address. (In fact, TCP can and does run over different types of addresses, as with IPv6.)

    Note that it is a good idea to use capturing parentheses instead of $&, because capturing parentheses tend to make your program faster. (See also perlvar.)
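
    As a rough sketch of the capturing-parentheses approach: the helper name first_ip_candidate and the sample log line are hypothetical, and the pattern is deliberately loose (it does not check that each octet is 255 or less).

```perl
use strict;
use warnings;

# Hypothetical helper: return the first dotted-quad-looking token, or undef.
sub first_ip_candidate {
    my ($line) = @_;
    # The outer parentheses capture the match into $1, so $& is never used.
    return $line =~ /\b(\d{1,3}(?:\.\d{1,3}){3})\b/ ? $1 : undef;
}

# Made-up sample input, standing in for one line read from INPUT.
my $line = 'Oct 11 04:02:13 gw sshd: connection from 192.168.1.1 port 4022';
my $ip   = first_ip_candidate($line);
print defined $ip ? "found $ip\n" : "no address on this line\n";
```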

      Actually, with recent perls $& isn't too bad ($' and $` still garner a performance hit). See perldoc -q 'slow my program' for more details. (Not that you shouldn't use capturing parens though; just clearing up an old meme).

Re: novice falls
by dws (Chancellor) on Oct 11, 2001 at 04:04 UTC
    The regexp you're using to pick off the IP address is going to give you false positives on garbage input. As written, it will match on
    ... 999.999.999.999
    and a few other bogus IPs. There are regexps that will correctly match only valid IP addresses. Some are even findable via Super Search. Try searching for titles that contain "match" and "IP".

      Hopefully if you're parsing a log file you won't get this sort of garbage input, but it is better to be safe than sorry. Instead of writing the regexp to end all regexps, which many have made a noble effort of doing, maybe using this admittedly slack one with some post-processing is the best bet.
      use Socket;
      # ...
      if ($line =~ /\b(\d{1,3}(?:\.\d{1,3}){3})\b/) {
          if (my $in_addr = inet_aton($1)) {
              if ($name = gethostbyaddr($in_addr, AF_INET)) {
                  # ...
              }
          }
      }
      inet_aton will reject any bad input, such as 999.999.999.999, meaning that gethostbyaddr will not even try to resolve these.
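
      If you would rather do the range check yourself before calling into Socket at all, the post-processing can be sketched like this. The helper name is_valid_ipv4 is invented for this example, and it only accepts plain decimal dotted quads.

```perl
use strict;
use warnings;

# Hypothetical helper: true only for four dot-separated decimal octets,
# each in the range 0..255.
sub is_valid_ipv4 {
    my ($candidate) = @_;
    my @octets = $candidate =~ /\A(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\z/
        or return 0;
    my $too_big = grep { $_ > 255 } @octets;
    return $too_big ? 0 : 1;
}

for my $candidate ('192.168.1.1', '999.999.999.999', '10.0.0') {
    printf "%-16s %s\n", $candidate,
        is_valid_ipv4($candidate) ? 'ok' : 'rejected';
}
```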

      As a side note, gethostbyaddr can take a very, very long time to return if the name servers for that address block are down. If you are trying to bulk-resolve, you might want to use a tool which does this more efficiently, such as dnsfilter from the djbdns package. A Perl solution is also available from CPAN.
Re: novice falls
by dws (Chancellor) on Oct 11, 2001 at 08:47 UTC
    One more thing: If the data that you're processing contains duplicate IP addresses, you'll save redundant calls to gethostbyaddr by caching results.
    $hostcache{$&} ||= gethostbyaddr(inet_aton($&), AF_INET);
    if ( defined $hostcache{$&} ) { ...
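
    One wrinkle with ||= is that a failed lookup leaves undef in the cache, so that address gets looked up again every time it appears. A possible variation, sketched here with invented helper names (make_cached_lookup, reverse_lookup), uses exists so that failures are remembered too.

```perl
use strict;
use warnings;
use Socket qw(inet_aton AF_INET);

# Hypothetical helper: wrap any lookup function in a per-address cache.
sub make_cached_lookup {
    my ($lookup) = @_;
    my %cache;
    return sub {
        my ($ip) = @_;
        # exists() rather than ||= so that failed lookups (undef) are cached too.
        $cache{$ip} = $lookup->($ip) unless exists $cache{$ip};
        return $cache{$ip};
    };
}

# Text address -> packed address -> hostname, or undef on any failure.
sub reverse_lookup {
    my ($ip) = @_;
    my $packed = inet_aton($ip) or return;
    return scalar gethostbyaddr($packed, AF_INET);
}

# Nothing is resolved until $name_for->($ip) is actually called.
my $name_for = make_cached_lookup(\&reverse_lookup);
```

    Inside the read loop, a call such as $name_for->($1) would then stand in for the direct gethostbyaddr call.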