dlcasey has asked for the wisdom of the Perl Monks concerning the following question:

Hi, Perl newb here.... I have an application log from which I'm trying to retrieve a specific sequence of 4 lines
(event1, event2, event3, event4 - discarding any duplicates in between).
I've already filtered the log to get all occurrences of the four lines I want,
but I don't know how to pull out the four lines when they happen sequentially.
EX:

09:12:50:861 EVENT1 #Don't want this line
09:13:09:467 EVENT1 #Don't want this line
09:13:09:837 EVENT1
09:13:38:059 EVENT2
09:14:03:115 EVENT3
09:14:04:076 EVENT4
09:14:11:376 EVENT1
09:14:34:049 EVENT2
09:14:34:990 EVENT3
09:14:34:990 EVENT3 #Don't want this line
09:14:34:990 EVENT4

I then need to do a time calculation between each of the four events (I can do this part.... I just can't
figure out how to pull out the four lines I want every time they appear in the sequential order I am looking for).
Can anyone help?
Thanks!

Original content restored by GrandFather. Replaced content was duplicate of Log Parsing Help

Replies are listed 'Best First'.
Re: Parsing Pattern Question
by ikegami (Patriarch) on Sep 03, 2009 at 14:34 UTC
    my $last;
    my @history;
    my $expect = 1;
    while (<>) {
        chomp;
        my ($num) = /EVENT(\d+)/
            or next;  # Ignore bad input.

        next if defined($last) && $last eq $_;
        $last = $_;

        if ($num == 1 || $num != $expect) {
            @history = ();
            $expect = 1;
        }

        if ($num == $expect) {
            push @history, "$_\n";
            if ($expect++ == 4) {
                print(@history);
                @history = ();
                $expect = 1;
            }
        }
    }

    Update: Handle duplicates as requested.
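
    For the follow-on time calculation mentioned in the question, here is a minimal sketch. It assumes the timestamps are HH:MM:SS:mmm and that a sequence never crosses midnight; to_ms and deltas are hypothetical helper names, not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: convert "HH:MM:SS:mmm" to milliseconds since midnight.
sub to_ms {
    my ($stamp) = @_;
    my ($h, $m, $s, $ms) = $stamp =~ /^(\d+):(\d+):(\d+):(\d+)$/
        or die "Unrecognised timestamp: $stamp\n";
    return (($h * 60 + $m) * 60 + $s) * 1000 + $ms;
}

# Hypothetical helper: gaps in milliseconds between consecutive lines.
# Assumes the timestamp is the first whitespace-separated field.
sub deltas {
    my @ms = map { to_ms((split ' ', $_)[0]) } @_;
    return map { $ms[$_] - $ms[$_ - 1] } 1 .. $#ms;
}

my @sequence = (
    '09:13:09:837 EVENT1',
    '09:13:38:059 EVENT2',
    '09:14:03:115 EVENT3',
    '09:14:04:076 EVENT4',
);
print join(' ', deltas(@sequence)), "\n";   # prints: 28222 25056 961
```

    Passing each completed group of four lines to deltas would give the three gaps per sequence.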

Re: Parsing Pattern Question
by BrowserUk (Patriarch) on Sep 03, 2009 at 14:02 UTC

    Your spec is inconsistent.

    • In the first instance you discard 'EVENT1's (bar the last), until you see an 'EVENT2'.
    • In the second, you keep the first instance of 'EVENT3', and discard dups until you get an 'EVENT4'.

    I can see one rule that might explain that process, but better you explain when to discard or retain an earlier (near)duplicate than have me (us) guess.
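
    To make the difference between the two candidate rules concrete, here is a rough sketch that collapses each run of repeated events while keeping either the first or the last line of the run. collapse_runs is a hypothetical helper and the three-line log is a cut-down version of the sample data; which rule is actually wanted is still an open question.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse each run of consecutive identical events
# to a single line. $keep is 'first' or 'last'.
sub collapse_runs {
    my ($keep, @lines) = @_;
    my (@out, $prev);
    for my $line (@lines) {
        my ($event) = $line =~ /(EVENT\d+)/ or next;   # skip unrelated lines
        if (defined $prev && $event eq $prev) {
            $out[-1] = $line if $keep eq 'last';       # later duplicate replaces the earlier one
        }
        else {
            push @out, $line;                          # a new run starts here
        }
        $prev = $event;
    }
    return @out;
}

my @log = (
    '09:12:50:861 EVENT1',
    '09:13:09:837 EVENT1',
    '09:13:38:059 EVENT2',
);
print "$_\n" for collapse_runs('first', @log);   # keeps 09:12:50:861 EVENT1
print "--\n";
print "$_\n" for collapse_runs('last', @log);    # keeps 09:13:09:837 EVENT1
```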


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Parsing Pattern Question
by arun_kom (Monk) on Sep 03, 2009 at 15:54 UTC

    Assuming that the entries in the log file are ordered by time, which is the case in your test data, and that you only want to keep the last entry of a particular event, the following would work.

    #!/usr/bin/perl -w
    use strict;

    my %log;
    foreach (<DATA>) {
        chomp;
        if (!/^$/) {
            /.+(EVENT\d)/;
            $log{$1} = $_;
        }
    }
    print map { "$log{$_}\n" } sort keys %log;

    __DATA__
    09:12:50:861 EVENT1 #Don't want this line
    09:13:09:467 EVENT1 #Don't want this line
    09:13:09:837 EVENT1
    09:13:38:059 EVENT2
    09:14:03:115 EVENT3
    09:14:04:076 EVENT4
    09:14:11:376 EVENT1
    09:14:34:049 EVENT2
    09:14:34:990 EVENT3 #Don't want this line
    09:14:34:990 EVENT3
    09:14:34:990 EVENT4
Re: Parsing Pattern Question
by ig (Vicar) on Sep 03, 2009 at 19:22 UTC

    Here's another way.

    use strict;
    use warnings;

    my ($lastline, $lastevent);
    while (<DATA>) {
        chomp;
        next unless (/EVENT(\d)/);
        print "$lastline\n" if (defined($lastevent) and $1 ne $lastevent);
        $lastevent = $1;
        $lastline  = $_;
    }
    print "$lastline\n";

    __DATA__
    09:12:50:861 EVENT1 #Don't want this line
    09:13:09:467 EVENT1 #Don't want this line
    09:13:09:837 EVENT1
    09:13:38:059 EVENT2
    09:14:03:115 EVENT3
    09:14:04:076 EVENT4
    09:14:11:376 EVENT1
    09:14:34:049 EVENT2
    09:14:34:990 EVENT3 #Don't want this line
    09:14:34:990 EVENT3
    09:14:34:990 EVENT4
Re: Parsing Pattern Question
by bichonfrise74 (Vicar) on Sep 03, 2009 at 20:15 UTC
    Another way...
    #!/usr/bin/perl
    use strict;

    my %record;
    my $previous_event;
    my $max_event = 4;
    my $key = 1;
    while (my $line = <DATA>) {
        chomp( $line );
        my ($current_event) = $line =~ /EVENT(\d)/;
        $record{$key}->{$current_event} = $line;
        $key++ if ( $current_event == 4 && $previous_event != $current_event );
        $previous_event = $current_event;
    }
    for my $i (sort keys %record) {
        print map { $record{$i}->{$_} . "\n" } sort keys %{ $record{$i} };
    }

    __DATA__
    09:12:50:861 EVENT1
    09:13:09:467 EVENT1
    09:13:09:837 EVENT1
    09:13:38:059 EVENT2
    09:14:03:115 EVENT3
    09:14:04:076 EVENT4
    09:14:11:376 EVENT1
    09:14:34:049 EVENT2
    09:14:34:990 EVENT3
    09:14:34:990 EVENT3
    09:14:34:990 EVENT4
Re: Parsing Pattern Question
by ambrus (Abbot) on Sep 04, 2009 at 10:17 UTC

    uniq -f1 -w7 will discard duplicate events but keep the first one of each chunk.

    See Re^2: Joining two files on common field for a list of other nodes where unix textutils is suggested to merge files.