in reply to Bot vs human User Agent strings

It's pretty easy if you don't use GA but instead use your own logging. You can configure the webserver only to log requests from user agents you are interested in (or, conversely, to omit those you are not). You can do this with a regexp; see e.g. the BrowserMatch directive (mod_setenvif) for Apache.

You can use any number of these directives to mark user agents as you wish and then log only the unmarked ones. As a bonus, you will still see requests from users (such as me) who block GA.
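
As a rough sketch of the marking approach, something along these lines should work in an Apache config. The patterns, the variable name is_bot and the log path are illustrative assumptions, not a vetted bot list, and the "combined" format nickname is assumed to be defined already (it is in most stock configs).

```apache
# Mark requests whose User-Agent matches a bot-like pattern.
# Patterns and the variable name "is_bot" are assumptions for illustration.
BrowserMatchNoCase "bot|crawl|spider|slurp" is_bot
BrowserMatchNoCase "curl|wget|python-requests" is_bot

# Log only the requests that were NOT marked above.
# Assumes a LogFormat nicknamed "combined" exists elsewhere in the config.
CustomLog "logs/human_access.log" combined env=!is_bot
```

Flipping the condition to env=is_bot on a second CustomLog line would send the marked requests to their own file instead of discarding them.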


🦛

Re^2: Bot vs human User Agent strings
by Bod (Parson) on Feb 09, 2024 at 22:15 UTC
    You can configure the webserver only to log requests from user agents you are interested in

    Oh yes - I'd overlooked doing this in Apache. Thanks for the suggestion.

    My first thought is that I'd still rather record the information in a DB table, as we want to be able to query it. A DB query will be much easier than parsing the Apache logs.

      Sure, but once you have the data in whatever form, it shouldn't be too hard to import it into the DB of your choice. Remember that you can also set Apache to log to a pipe, so you could load it into a DB without going through the local filesystem in the first place. Lots of options. :-)
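
      A minimal sketch of the piped-log idea, assuming a loader program of your own: /usr/local/bin/log-to-db is a hypothetical name, and it would need to read log lines from standard input and insert them into the DB.

      ```apache
      # A leading "|" makes Apache start the program and write each
      # log line to its standard input instead of to a file.
      # The program path is hypothetical; "combined" is assumed to be defined.
      CustomLog "|/usr/local/bin/log-to-db" combined
      ```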


      🦛