namishtiwari has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a log file containing entries from many threads. I want to count the number of unique threads, where a thread is identified by its tid value. My log file looks like this:
Wed May 20 05:22:53.993 2009 Morocco Standard Time INFO: pid 2172 tid 688: 17: 10106931: ArAuthFrameworkImpl::doPreAuth::1:10106931:: Authentication mechanism returned [0] for AuthIdentity []
Wed May 20 05:22:53.993 2009 Morocco Standard Time INFO: pid 2172 tid 688: 170: 10106931: Arcot Native Server: recvd AA_BIN_MSG_VER_CHG
Wed May 20 05:22:57.634 2009 Morocco Standard Time INFO: pid 2172 tid 3352: 170: 10106932: Arcot Native Server: recvd AA_BIN_MSG_GET_WLT
Wed May 20 05:22:57.634 2009 Morocco Standard Time INFO: pid 2172 tid 3352: 170: 10106932: Session tracker Id associated with generate challenge[1:10106932]
There are thousands of lines like these in a log file. I want to capture the unique threads and count them. Any help or suggestions would be very useful. Thanks, NT

Replies are listed 'Best First'.
Re: thread count
by almut (Canon) on Jun 10, 2009 at 11:11 UTC

    Create a unique key from the tid/pid by extracting those values from each line of the log file (e.g. with a regex match), then use a hash to count each key's occurrences as you iterate over the lines: $count{$key}++; The number of keys in the hash is then the number of unique threads.
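    The suggestion above can be sketched roughly as follows. This is only an illustration, assuming every relevant line contains "pid <digits> tid <digits>" as in the sample log; the helper name count_threads and the "pid:tid" key format are made up for this example and are not from the original posts.

```perl
use strict;
use warnings;

# Count how often each pid/tid pair occurs in a list of log lines.
# count_threads() is a hypothetical helper name used for illustration.
sub count_threads {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        # Capture the digits after "pid" and "tid"; skip lines lacking them.
        next unless $line =~ /\bpid\s+(\d+)\s+tid\s+(\d+)/;
        my $key = "$1:$2";    # assumed key format, e.g. "2172:688"
        $count{$key}++;
    }
    return %count;
}

# Three of the sample lines from the question.
my @sample = (
    'Wed May 20 05:22:53.993 2009 Morocco Standard Time INFO: pid 2172 tid 688: 170: 10106931: Arcot Native Server: recvd AA_BIN_MSG_VER_CHG',
    'Wed May 20 05:22:57.634 2009 Morocco Standard Time INFO: pid 2172 tid 3352: 170: 10106932: Arcot Native Server: recvd AA_BIN_MSG_GET_WLT',
    'Wed May 20 05:22:57.634 2009 Morocco Standard Time INFO: pid 2172 tid 3352: 170: 10106932: Session tracker Id associated with generate challenge[1:10106932]',
);

my %count = count_threads(@sample);

# The number of distinct keys is the number of unique threads.
print "Unique threads: ", scalar(keys %count), "\n";    # Unique threads: 2
```

    Including the pid in the key keeps threads from different processes apart if their tid values happen to collide; if the log only ever holds one process, the tid alone would do.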

      What I have tried is this:
      #!/usr/bin/perl -w
      print "Hello, World...\n";
      my $logFile = $ARGV[0];
      die "usage: $0 <logFile>" unless $logFile;
      die "Logfile $logFile doesn't exist" unless -f "$logFile";
      open(my $log, "<", $logFile) or die "Can't open $logFile for reading.";
      print "Processing file $logFile...\n";
      #my $authenticates = {};
      my $n = 0;
      my $TIDCount = 0;
      while(my $line = <$log>) {
          # Outer loop. Look for an interesting part of the log file.
          $n++;
          $line =~ tr/\r\n//d;
          if($line =~ /tid (\d+)/){
              $TIDCount++;
              next;
          }
      }
      print "Thread count for the logfile is $TIDCount\n";
      But it gives me the total count, including the repeated ones; I just want the count of unique threads. Thanks, NT
        Did you read almut's post? There is a big clue in it. If you capture the digits after 'tid ', which you do, you can assign them to a variable by putting the scalar on the left-hand side into list context with parentheses:
        my $substring;
        my $string="For those about to rock";
        ($substring)=$string=~/^.+?to (\w+)/;
        print "So, we're going to $substring then?\n";
        Using this type of structure you can identify specific threads. Then, if you increment $threads{$substring} for each matching line, keys %threads gives you a list containing the id of each thread, and the size of that list is the unique thread count. You even get a count of how busy each thread is in the logfiles ;)
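        Applied to the log file, that might look something like the sketch below. It is not the original poster's final script: the function name tally_tids is hypothetical, and an in-memory filehandle holding the sample lines from the question stands in for the real log file.

```perl
use strict;
use warnings;

# Read lines from a filehandle and return a hash ref of tid => line count.
# tally_tids() is a hypothetical name used for illustration.
sub tally_tids {
    my ($fh) = @_;
    my %threads;
    while (my $line = <$fh>) {
        # The parentheses around $tid force list context, so $tid receives
        # the captured digits (or undef when the line does not match).
        my ($tid) = $line =~ /\btid\s+(\d+)/;
        next unless defined $tid;
        $threads{$tid}++;
    }
    return \%threads;
}

# In-memory stand-in for the log file, using the sample lines from the question.
my $log_text = <<'END';
Wed May 20 05:22:53.993 2009 Morocco Standard Time INFO: pid 2172 tid 688: 17: 10106931: ArAuthFrameworkImpl::doPreAuth::1:10106931:: Authentication mechanism returned [0] for AuthIdentity []
Wed May 20 05:22:53.993 2009 Morocco Standard Time INFO: pid 2172 tid 688: 170: 10106931: Arcot Native Server: recvd AA_BIN_MSG_VER_CHG
Wed May 20 05:22:57.634 2009 Morocco Standard Time INFO: pid 2172 tid 3352: 170: 10106932: Arcot Native Server: recvd AA_BIN_MSG_GET_WLT
Wed May 20 05:22:57.634 2009 Morocco Standard Time INFO: pid 2172 tid 3352: 170: 10106932: Session tracker Id associated with generate challenge[1:10106932]
END

open(my $fh, '<', \$log_text) or die "Can't open in-memory log: $!";
my $threads = tally_tids($fh);
close($fh);

# keys in scalar context yields the number of distinct tids.
printf "Thread count for the logfile is %d\n", scalar keys %$threads;

# Busiest threads first; ties are broken by numeric tid.
for my $tid (sort { $threads->{$b} <=> $threads->{$a} || $a <=> $b } keys %$threads) {
    print "tid $tid: $threads->{$tid} lines\n";
}
```

        To run it against a real file, replace the in-memory open with the three-argument open on $logFile from the script above.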