
Re^3: how to resolve IP's in an HTTPd that doesn't resolve them?

by afoken (Chancellor)
on Jun 13, 2018 at 20:58 UTC


in reply to Re^2: how to resolve IP's in an HTTPd that doesn't resolve them?
in thread how to resolve IP's in an HTTPd that doesn't resolve them?

I could simply:

#!/bin/sh -
cat /var/log/my.web.host-access-log | awk '{ print $1; }' | ...

or some such to feed the logs to a resolver. But I'm ideally looking for a way to process the log(s) (connections) in "real time", so that the logs have the correct access times. I can imagine filtering, or piping it.

<beancounting>You don't need cat in that pipe, just let awk read directly from the logfile.</beancounting>

Back on topic: Name resolving takes time, causes some extra load, and can fail. Hence web servers generally prefer not to resolve the remote address for performance reasons. However, you could simply log to a pipe instead of logging into a file. Apache comes with logresolve, which is intended to run offline, but you could also use it "live". It's a simple filter. It might be a little bit too simple-minded:

To minimize impact on your nameserver, logresolve has its very own internal hash-table cache. This means that each IP number will only be looked up the first time it is found in the log file.

In other words: logresolve completely ignores any TTLs, so a long-running live log will eventually contain stale host names. It's not a bug, as logresolve is intended to run offline and only for a short time.
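To illustrate the difference, here is a minimal Python sketch of a reverse-lookup cache whose entries expire. It is not how logresolve or dnsfilter work internally; the class name ExpiringResolver and the parameters lookup, max_age and clock are made up for this example. The standard resolver call does not expose a record's TTL, so the sketch assumes a fixed lifetime instead.

```python
import socket
import time


class ExpiringResolver:
    """Reverse-DNS cache whose entries expire after a fixed lifetime."""

    def __init__(self, lookup=None, max_age=300.0, clock=time.monotonic):
        # lookup, max_age and clock are illustrative parameters only;
        # max_age stands in for the real TTL, which is not visible here.
        self._lookup = lookup or self._system_lookup
        self._max_age = max_age
        self._clock = clock
        self._cache = {}  # ip -> (name, expiry time)

    @staticmethod
    def _system_lookup(ip):
        try:
            return socket.gethostbyaddr(ip)[0]
        except OSError:
            return ip  # lookup failed: fall back to the bare address

    def resolve(self, ip):
        now = self._clock()
        hit = self._cache.get(ip)
        if hit is not None and hit[1] > now:
            return hit[0]  # still fresh, no DNS traffic
        name = self._lookup(ip)
        self._cache[ip] = (name, now + self._max_age)
        return name
```

A cache without the expiry check would behave like the one described in the logresolve documentation: every address is looked up exactly once per run, which is fine offline and wrong for a process that runs for weeks.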

Have a look at the daemontools. At least multilog is usable; it takes care of reliable logging, including rotating log files. There is no IP resolving program in daemontools, but djb also published djbdns, a modular DNS resolver. It contains dnsfilter, which should do pretty much exactly what you want: resolve an IP address at the start of each line to a host name. You should perhaps also install a local DNS cache on the webserver. That way, DNS requests are cached by djb's dnscache, dnsfilter reads most responses from the local cache, and so DNS requests become a lot less expensive.

To recap: Install a local DNS cache. Then log to a pipe that writes into dnsfilter. dnsfilter then logs into multilog, which creates a nice set of log files.
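For readers who want to see what the middle stage of that pipeline does, the following Python sketch shows the general shape of such a filter: read a line, replace the leading address with a name, write the line out again. It is only an illustration of the idea, not dnsfilter's implementation; rewrite_line, filter_stream, system_resolve and main are names invented here, and the sketch assumes the client address is the first space-separated field, as in the common log format.

```python
import socket
import sys


def rewrite_line(line, resolve):
    """Replace the first space-delimited field (the client IP) with a name."""
    ip, sep, rest = line.partition(" ")
    if not sep:
        return line  # no space at all: pass the line through untouched
    return resolve(ip) + sep + rest


def filter_stream(src, dst, resolve):
    """Copy src to dst line by line, resolving the leading address."""
    for line in src:
        dst.write(rewrite_line(line, resolve))
        dst.flush()  # keep the live log current instead of buffering


def system_resolve(ip):
    """Blocking reverse lookup; returns the address itself on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ip


def main():
    # Intended use: <log source> | this-filter | <rotating logger>
    filter_stream(sys.stdin, sys.stdout, system_resolve)
```

Note that the lookup here blocks. If the reader of a pipe stalls for long enough, the pipe fills up and the writer (the web server) blocks too, which is one more reason to put a local DNS cache in front of the filter.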

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

Re^4: how to resolve IP's in an HTTPd that doesn't resolve them?
by taint (Chaplain) on Jun 13, 2018 at 21:54 UTC
    Thanks for your elaborate reply, afoken!

    Timely (host) resolution is not a problem on my servers. In fact I wrote (finally finished) a little resolver in about 160 lines (C source). It'll turn a file of 255 IP addresses into host name(s) in under about 1 second on standard CPE, and even faster if given a fatter "pipe". It does so accurately. It is slower, of course, on slower connections, or on bad / unmanaged addresses. I could add a time threshold to the resolver, but I haven't bothered, as I only use it for post-processing.

    So it would seem from your response that you'd recommend using a pipe if my intent is to process the (connecting) IP addresses in real time. I had hoped to avoid that, but I guess I'm not terribly surprised.

    Speaking of the Apache HTTPd: it's interesting that Apache doesn't have, or choose the use of a pipe, as it happily logs resolved IP addresses to its log(s), in all my experience with it.

    Maybe I'd do well to give its source a look over, for possible clues.

    Thanks again, afoken, for taking the time to reply!

    Evil is good, for without it, Good would have no value
    λɐp ʇɑəɹ⅁ ɐ əʌɐɥ puɐ ʻꜱdləɥ ꜱᴉɥʇ ədoH

      it's interesting that Apache doesn't have, or choose the use of a pipe

      I expect Apache to simply open the log file in append mode. That should also work with a named pipe (a.k.a. FIFO). mknod /var/log/httpd/access.log p should be sufficient. Apache writes to that pipe, and a resolver program reads from the pipe.
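The reading side of such a named pipe could look roughly like the following Python sketch. The function names ensure_fifo and drain_fifo are invented for this example, and the path is whatever the server is configured to log to. os.mkfifo does the same job as the mknod ... p call.

```python
import os
import stat


def ensure_fifo(path, mode=0o640):
    """Create a named pipe at path unless one already exists there."""
    try:
        os.mkfifo(path, mode)
    except FileExistsError:
        if not stat.S_ISFIFO(os.stat(path).st_mode):
            raise  # something else (e.g. a regular log file) is in the way


def drain_fifo(path, handle_line):
    """Pass each line written to the FIFO to handle_line until EOF."""
    # open() blocks until some writer has opened the pipe. The loop sees
    # EOF once the last writer closes it, e.g. when the server restarts.
    with open(path, "r") as pipe:
        for line in pipe:
            handle_line(line)
```

A long-running reader would call drain_fifo in a loop so that it reopens the pipe after every EOF. The reader should be started before the server, or the server may block while opening its log.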

      But Apache can do even better, see piped logs:

      CustomLog "|/usr/local/bin/name-resolver foo bar baz" common

      The shell can also be invoked, that should allow creating a second pipe for a rotating logger:

      CustomLog "|$/usr/local/bin/name-resolver foo bar | /usr/local/bin/multilog t s1000000 /var/multilog/apache" common

      Alexander

        Thanks afoken!

        That's pretty much the direction I had imagined I'd need to go, though I had hoped for something a little more fun, or elegant.
        Then again, your suggestion is elegant in its simplicity, which is worth quite a bit in my book. :-)

        Thanks again, Alexander!

        edit:
        Forgot to mention: yes, you are correct. As near as I can figure, Apache does write in append mode.

        edit II:
        Oh, and this ain't Apache I'm working with. But logging to a UNIX pipe is still valid. :-)
        And sorry for the additional edits. I had a lot on my plate this AM and was a bit pressed for time, but by the same token I didn't want to let all the work go unacknowledged. :-)

        --Chris
