in reply to How do I read a log file that contents recurring log messages those are separated by newline characters?

Store your messages as keys in a hash, and increment the count.
use strict;
use warnings;

my %msgs;
while (<DATA>) {
    s/^\s+//;
    chomp;
    $msgs{$_}++;
}

# Sort by number of occurrences and only show top 8:
my $i = 0;
for my $m (sort {$msgs{$b} <=> $msgs{$a}} keys %msgs) {
    print "$msgs{$m} $m\n";
    $i++;
    last if $i == 8;
}

__DATA__
Mar 9 08:15:05 gen-vcs11 kernel: kjslah: Unknown symbol unlock_page
Mar 9 08:15:05 gen-vcs11 kernel: kjslah: Unknown symbol generic_file_read
Mar 9 08:15:05 gen-vcs11 kernel: kjslah: Unknown symbol generic_file_write
Mar 9 08:15:05 gen-vcs11 kernel: kjslah: Unknown symbol generic_file_mmap
Mar 9 08:15:05 gen-vcs11 kernel: kjslah: Unknown symbol generic_file_sendfile
Mar 9 08:15:05 gen-vcs11 kernel: kjslah: disagrees about version of symbol zone_table
Mar 9 08:15:05 gen-vcs11 kernel: kjslah: Unknown symbol zone_table
Mar 9 08:15:05 gen-vcs11 kernel: kjslahdisagrees about version of symbol unlock_page
Mar 9 08:15:05 gen-vcs11 kernel: kjslah: Unknown symbol unlock_page
Mar 9 08:15:05 gen-vcs11 kernel: kjslah: Unknown symbol filemap_fdatawrite
Mar 9 08:15:05 gen-vcs11 kernel: kjslah: Unknown symbol find_or_create_page

See also:

perlintro

perldoc -q sort


Re^2: How do I read a log file that contents recurring log messages those are separated by newline characters?
by TomDLux (Vicar) on Oct 15, 2010 at 14:09 UTC

    You are including the time stamps in the key, so identical events a second apart increment separate counts. Or maybe you were showing the overall concept, and leaving the trimming as an exercise for the student?

    It looks like gen-vcs11 kernel is a standard component of every line, so I would ignore it. I would use split to extract the second and third components and use those as keys, splitting on a colon followed by whitespace so that the colons inside the time stamp are left alone:

    my ( $code, $msg ) = ( split /:\s+/, $_, 3 )[1,2];
    $msgs{$code}{$msg}++;

    It becomes even simpler if you only want to preserve the msg component.

    As Occam said: Entia non sunt multiplicanda praeter necessitatem.
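    The message-only variant mentioned above could look roughly like the following. This is only a sketch: the sub name count_messages, the skip rule for lines that do not split into three fields, and the sample lines (including the 08:15:06 time stamp) are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: count log lines by message text only, so the
# time stamp and the "gen-vcs11 kernel" prefix do not affect the key.
sub count_messages {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        chomp $line;
        # Split on a colon followed by whitespace, into at most three
        # fields: prefix, code, message. Time stamp colons are untouched.
        my @fields = split /:\s+/, $line, 3;
        next unless @fields == 3;    # assumption: skip lines with a different layout
        $count{ $fields[2] }++;
    }
    return \%count;
}

# Made-up sample: the same event logged one second apart.
my @sample = (
    "Mar 9 08:15:05 gen-vcs11 kernel: kjslah: Unknown symbol unlock_page\n",
    "Mar 9 08:15:06 gen-vcs11 kernel: kjslah: Unknown symbol unlock_page\n",
    "Mar 9 08:15:06 gen-vcs11 kernel: kjslah: Unknown symbol zone_table\n",
);

my $count = count_messages(@sample);

# Most frequent first; ties are broken alphabetically for stable output.
for my $msg ( sort { $count->{$b} <=> $count->{$a} || $a cmp $b } keys %$count ) {
    print "$count->{$msg} $msg\n";
}
```

    With the sample above, the two unlock_page lines collapse into a single key with a count of 2, even though their time stamps differ. Note that a line such as the "kjslahdisagrees" one in the posted data has only two fields and would be skipped by this sketch.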

      or maybe you were showing the overall concept, and leaving the trimming as an exercise for the student?
      Yes.