concept has asked for the wisdom of the Perl Monks concerning the following question:

Hello all!

I am new to Perl (started about a week ago) and have already found a new home. Now I have a problem that I cannot figure out how to solve. It feels like it is on the tip of my tongue, but I just can't get it.

I am trying to parse a logfile and create a report from it, but I am not sure how to do this. Should I use a regexp to match the lines?

Here is a line from the log:
2012-06-01 16:04:46,395 Thread-12796 - Dispatcher thread #1 INFO JMasterProtocol - JMasterProtocol.getMasterPushExitInstance: Asked exit instance for target 'SPF0055R' from dispatcher

First I need to search for SPF0055R to get the thread number. Then I want to put the thread number into a variable, search the file again for that thread number, and fill in other variables, but only from lines that have the same thread number.

So far all I have is:
open (FILEPTR,$file) || die "Cant open file";
@trace <FILEPTR>;
After that I am not sure what to do? Can someone please point me in the right direction?

Replies are listed 'Best First'.
Re: Parsing a logfile.
by GrandFather (Saint) on Jun 02, 2012 at 22:14 UTC

    The first thing you need to do is go out and get yourself a copy of the Llama ("Learning Perl") and read it, or read through some of the Perl documentation such as perlintro. You will also find a pile of good stuff in the Tutorials section.

    Next, always start your scripts with strictures:

    use strict;
    use warnings;

    That will help pick up silly errors and typos. Often the errors won't make sense initially so you could throw use diagnostics; into the mix as well, or ask here. Your given code would be better written:

    use strict;
    use warnings;
    use diagnostics;

    my $logName = 'logfile.txt';
    open my $logIn, '<', $logName or die "Can't open $logName: $!\n";

    while (defined (my $line = <$logIn>)) {
        ...;
    }

    which uses strictures, the safer three parameter version of open, safer and better behaved lexical file handles, better failure diagnostics and a while loop instead of slurping the file.

    True laziness is hard work
Re: Parsing a logfile.
by aaron_baugher (Curate) on Jun 02, 2012 at 20:56 UTC

    First of all, don't load an entire file into an array unless you have a good reason. (There may be a good reason in this case, but I can't tell for sure.) Instead, read and process it line-by-line. The usual method is:

    open my $fd, '<', $file or die "Can't open file: $!";
    while(my $line = <$fd>){
        # do stuff with line in $line
    }

    Now, when you're parsing stuff out of a line, there are two common methods. If you have a bunch of ordered fields separated by a known delimiter, you can split the line on that delimiter. (An aside: sometimes in these cases a CSV module is very useful.) On the other hand, if you are picking out particular strings and data that happens to be near/between them, a regex may be your best bet. (Looking at your data, I suspect a regex wins here.) Examples of the two:

    my @fields = split /\t/, $line;                # split the line on tabs
    # or
    my( $threadnum ) = $line =~ /Thread-(\d+)/;    # grab the digits following Thread-

    That should get you started on parsing out what you want. For help with the rest of what you talked about, it'd help to see before-and-after examples of what you're trying to do.
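As a rough illustration of the regex route on the sample line from the question, here is a minimal sketch. The helper name `parse_line` and the choice of fields (timestamp, thread number, target) are assumptions made for this example, not something the original poster specified.

```perl
use strict;
use warnings;

# Sample line copied from the original question.
my $line = q{2012-06-01 16:04:46,395 Thread-12796 - Dispatcher thread #1 INFO JMasterProtocol - JMasterProtocol.getMasterPushExitInstance: Asked exit instance for target 'SPF0055R' from dispatcher};

# Hypothetical helper: in list context it returns (timestamp, thread number,
# target), or an empty list when the line does not have the expected shape.
sub parse_line {
    my ($text) = @_;
    return $text =~ /^(\S+ \S+)\s+Thread-(\d+)\b.*?target\s+'([^']+)'/;
}

my ($stamp, $thread, $target) = parse_line($line);
print "$stamp thread=$thread target=$target\n" if defined $thread;
```

Lines that lack a quoted target will simply fail to match, so a real report would probably need a second, looser pattern for those.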

    Aaron B.
    Available for small or large Perl jobs; see my home node.

Re: Parsing a logfile.
by thomas895 (Deacon) on Jun 02, 2012 at 19:29 UTC

    First of all, line 2 of your code snippet must be:

    @trace = <FILEPTR>;

    In order for us to be able to give you any useful info, you will have to give us some more info. What is the desired result format?
    If you want to take an easier way, I suggest looking at the CPAN main page. From there, try one of the links that you think will help you. After that, it's modules and documentation galore, and I'm sure there's something there that will help you.

    Good luck! :-)

    ~Thomas~
    bless( $you ) if $you->{sneezed};
Re: Parsing a logfile.
by NetWallah (Canon) on Jun 03, 2012 at 00:05 UTC
    Your first regex and hash usage may be a challenge, so here is a complete program, based on a log that looks like this:
    2012-06-01 16:04:46,395 Thread-12796 - Dispatcher thread #1 INFO JMasterProtocol - JMasterProtocol.getMasterPushExitInstance: Asked exit instance for target 'SPF293R' from dispatcher
    use strict;
    use warnings;
    use diagnostics;

    my $searchfor = shift @ARGV
        or die "Please specify the search target as an argument";
    my (%TargetList, $linecount, @foundThreads);
    my $logName = 'test-logfile.txt';
    open my $logIn, '<', $logName or die "Can't open $logName: $!\n";

    while (defined (my $line = <$logIn>)) {
        $linecount++;
        my ($thread, $target) = $line =~ m/Thread-(\d+).+target\s*'([^']+)'/
            or next;
        push @{ $TargetList{$thread} }, $target;
        if ($target eq $searchfor) {
            push @foundThreads, $thread;
        }
    }
    close $logIn;

    if (scalar(@foundThreads) > 0) {
        print "Target '$searchfor' is in threads "
            . join(",", @foundThreads), "\n";
        print "Targets in these threads are:\n";
        for my $t (sort @foundThreads) {
            print " $t: $_\n" for @{ $TargetList{$t} };
        }
    } else {
        print "*** Target '$searchfor' was not found in log $logName\n";
    }
    print "Totals: $linecount lines, ", scalar(keys %TargetList), " Threads.\n";
    Update: Re-read the specifications; the displayed context is now slightly different.
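For comparison, the two-step lookup described in the question (find the thread for a target, then collect every line from that thread) might be sketched as two passes over the same filehandle. This is only a sketch under assumptions: the helper name `lines_for_target` is hypothetical, and the second and third log lines are made up for illustration; only the first line comes from the question. An in-memory filehandle stands in for the real log file so the example runs on its own.

```perl
use strict;
use warnings;

# Illustrative log excerpt: the first line is from the question,
# the other two are invented to show the filtering.
my $log = join "\n",
    q{2012-06-01 16:04:46,395 Thread-12796 - Dispatcher thread #1 INFO JMasterProtocol - JMasterProtocol.getMasterPushExitInstance: Asked exit instance for target 'SPF0055R' from dispatcher},
    q{2012-06-01 16:04:46,401 Thread-12797 - Dispatcher thread #2 INFO JMasterProtocol - unrelated work},
    q{2012-06-01 16:04:46,410 Thread-12796 - Dispatcher thread #1 INFO JMasterProtocol - follow-up step},
    '';

# Hypothetical helper: returns every line logged by the thread(s)
# that mention $target somewhere in the file.
sub lines_for_target {
    my ($fh, $target) = @_;

    # Pass 1: collect thread numbers from lines that name the target.
    my %wanted;
    while (defined(my $line = <$fh>)) {
        my ($thread) = $line =~ /Thread-(\d+)\b.*target\s+'\Q$target\E'/
            or next;
        $wanted{$thread} = 1;
    }

    # Pass 2: rewind and keep every line belonging to those threads.
    seek $fh, 0, 0 or die "Can't rewind: $!";
    my @hits;
    while (defined(my $line = <$fh>)) {
        my ($thread) = $line =~ /Thread-(\d+)\b/ or next;
        push @hits, $line if $wanted{$thread};
    }
    return @hits;
}

open my $fh, '<', \$log or die "Can't open in-memory log: $!";
print lines_for_target($fh, 'SPF0055R');
close $fh;
```

With a real file, the `open` call would take the log's path instead of `\$log`. Reading the file twice keeps memory use low; the single-pass hash approach in the program above trades memory for one fewer read.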

                 I hope life isn't a big joke, because I don't get it.
                       -SNL