You can use a hash where the keys are the IPs and the values are their counts:

use strict;
use warnings;
use Regexp::Common qw/net/;

my %hash;

# Count each line under the first IPv4 address found on it.
while (<DATA>) {
    $hash{$1}++ if /($RE{net}{IPv4})/;
}

# Hash key order is not guaranteed, so the listing order may vary.
print "$_ => $hash{$_}\n" for keys %hash;

# keys in scalar context returns the number of keys.
my $uniqueIPs = keys %hash;
print "Number of unique IPs: $uniqueIPs\n";

__DATA__
127.0.0.1 - - [10/Apr/2007:10:39:11 +0300] "GET / HTTP/1.1" 500 606 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:39:11 +0300] "GET /favicon.ico HTTP/1.1" 200 766 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
139.12.0.2 - - [10/Apr/2007:10:40:54 +0300] "GET / HTTP/1.1" 500 612 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
139.12.0.2 - - [10/Apr/2007:10:40:54 +0300] "GET /favicon.ico HTTP/1.1" 200 766 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:53:10 +0300] "GET / HTTP/1.1" 500 612 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:54:08 +0300] "GET / HTTP/1.0" 200 3700 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:54:08 +0300] "GET /style.css HTTP/1.1" 200 614 "http://pti.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:54:08 +0300] "GET /img/pti-round.jpg HTTP/1.1" 200 17524 "http://pti.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:54:21 +0300] "GET /unix_sysadmin.html HTTP/1.1" 200 3880 "http://pti.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:54:51 +0300] "GET / HTTP/1.1" 200 34 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:54:51 +0300] "GET /favicon.ico HTTP/1.1" 200 11514 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:54:53 +0300] "GET /cgi/pti.pl HTTP/1.1" 500 617 "http:/contact.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
127.0.0.1 - - [10/Apr/2007:10:54:08 +0300] "GET / HTTP/0.9" 200 3700 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:58:27 +0300] "GET / HTTP/1.1" 200 3700 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:58:34 +0300] "GET /unix_sysadmin.html HTTP/1.1" 200 3880 "http://pti.local/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"
217.0.22.3 - - [10/Apr/2007:10:58:45 +0300] "GET /talks/Fundamentals/read-excel-file.html HTTP/1.1" 404 311 "http://pti.local/unix_sysadmin.html" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20061201 Firefox/2.0.0.3 (Ubuntu-feisty)"

Output:

127.0.0.1 => 8
139.12.0.2 => 2
217.0.22.3 => 6
Number of unique IPs: 3

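If you're not interested in the count for each IP, you can just use $hash{$1} = 1 instead of incrementing. Either way, the keys will contain the log file's unique IPs. Regexp::Common supplies the $RE{net}{IPv4} pattern used to capture the IPs. Because the match is unanchored, it captures the first IPv4-shaped string on each line, which in this log format is the client address at the start of the line.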
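The uniqueness-only variant mentioned above might look like the following sketch. It assumes the client address is always the first field of a log line, so the pattern is anchored with ^. The helper name unique_ips, the abbreviated sample lines, and the simplified dotted-quad pattern (a stand-in for $RE{net}{IPv4} so the snippet needs no non-core module) are illustrative assumptions, not part of the code above.

```perl
use strict;
use warnings;

# Simplified dotted-quad pattern (assumption): unlike $RE{net}{IPv4},
# it does not check that each octet is in the range 0-255.
my $ipv4 = qr/\d{1,3}(?:\.\d{1,3}){3}/;

# Hypothetical helper: returns the sorted list of distinct leading IPs.
sub unique_ips {
    my @lines = @_;
    my %seen;
    for my $line (@lines) {
        # Anchor at the start of the line so version strings such as
        # "Firefox/2.0.0.3" later in the line are never taken for an address.
        $seen{$1} = 1 if $line =~ /^($ipv4)\b/;
    }
    return sort keys %seen;
}

# Abbreviated sample lines (referer and user agent dropped for brevity).
my @log = (
    '127.0.0.1 - - [10/Apr/2007:10:39:11 +0300] "GET / HTTP/1.1" 500 606',
    '139.12.0.2 - - [10/Apr/2007:10:40:54 +0300] "GET / HTTP/1.1" 500 612',
    '127.0.0.1 - - [10/Apr/2007:10:53:10 +0300] "GET / HTTP/1.1" 500 612',
);

my @ips = unique_ips(@log);
print "$_\n" for @ips;
print "Number of unique IPs: ", scalar(@ips), "\n";
```

To run this against a real log, the @log array could be replaced with the lines read from a filehandle. Sorting the keys also makes the listing order repeatable, which plain keys %hash does not promise.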


In reply to Re: unique visitors from html logfile by Kenosis
in thread unique visitors from html logfile by Anonymous Monk
