Re^7: Log Parsing

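To get multiple records for a child runID, use a hash of arrays (HoA). In the script below each array element is itself a hash holding the Topic, Started and Done values.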

#!/usr/bin/perl
use strict;
#use Data::Dump 'pp';

my %data = ();
my @id   = ();
my $reqid = '8274';

my $infile  = 'worker.log';
my $outfile = 'data.CSV';
open IN,'<',$infile or die "$!";

# input
while (<IN>) {
  chomp;
  next unless /Task:id=(\d+)/;

  my $taskid = $1;
  my (undef,$timestamp,undef)  = split /\s+/,$_,3;
  
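  # $1 = Started or Done, $2 = whole topic string,
  # $3 = middle number of the topic (used as the child runID key),
  # $4 = digit after _IN_ (used as the array index)
  if (/(Started|Done).+topic (\d+_(\d+)_\d+_IN_(\d).*)/){
    #print "$1 $2 $3 $4\n";
    $data{$3}[$4]{$1}      = $timestamp;
    $data{$3}[$4]{'Topic'} = $2;
  }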

  while ( /\{(\d+)-SUCCESSFUL\}/g ){
    push @id,$1 if ($taskid eq $reqid);
  }
}
close IN;
#pp \@id;
#pp \%data;

# output
open OUT,'>',$outfile or die "$!";
my @cols = qw(Topic Started Done);
printf OUT "%s,%s,%s\n",@cols;
for my $id (sort @id){
  if (exists $data{$id}){
    for my $rec ( @{$data{$id}} ){
      printf OUT "%s,%s,%s\n", map { $rec->{$_} } @cols;
    }
  }else {
    print OUT "$id- no data\n";
  }
}
close OUT;
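
To make the shape of %data easier to picture, here is a minimal sketch that fills it from three log lines and prints the records for one runID. The log lines are hypothetical: only the "topic n_runID_n_IN_digit" pattern comes from the regex in the script, the values and the rest of each line are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input: values are invented, only the topic pattern
# follows the regex used in the script above.
my @lines = (
  '0317 09:53:10.000+0000 {1} INFO [x-Task:id=8288] Started reading topic 1_8288_7_IN_0',
  '0317 09:53:11.000+0000 {1} INFO [x-Task:id=8288] Done reading topic 1_8288_7_IN_0',
  '0317 09:53:12.000+0000 {1} INFO [x-Task:id=8288] Started reading topic 1_8288_7_IN_1',
);

my %data;
for (@lines) {
  # second whitespace-separated field is the time of day
  my (undef, $timestamp) = split /\s+/, $_, 3;
  if (/(Started|Done).+topic (\d+_(\d+)_\d+_IN_(\d).*)/) {
    # runID => [ { Topic => ..., Started => ..., Done => ... }, ... ]
    $data{$3}[$4]{$1}    = $timestamp;
    $data{$3}[$4]{Topic} = $2;
  }
}

# one CSV row per array element; a missing field becomes an empty string
my @rows;
for my $rec (@{ $data{8288} }) {
  push @rows, join ',', map { $rec->{$_} // '' } qw(Topic Started Done);
}
print "$_\n" for @rows;
```

The second row has an empty Done column because no Done line was supplied for that topic.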

poj

Re^8: Log Parsing by piyushmnnit06 (Novice) on Apr 03, 2017 at 11:02 UTC
Thanks a lot for such quick responses. It is working better than I expected.
Re^9: Log Parsing by piyushmnnit06 (Novice) on May 10, 2017 at 11:54 UTC
I have become a bit greedier; it would be great if you could help. I didn't create a separate thread because I thought that would mean repeating the work. There is an addition to the log structure, and one more separate script is needed for it:

0317 09:53:14.865+0000 {12772} INFO [pm-worker-exec slot-Task:id=8274,env=12772,type=11][c.s.w.t.f.s.PostExecutionStage ] Loaded {child runId vs completion type}: {10418-SUCCESSFUL}{10419-SUCCESSFUL}{8288-SUCCESSFUL}{8289-SUCCESSFUL}{8290-SUCCESSFUL}{8291-SUCCESSFUL}{8292-SUCCESSFUL}{8293-SUCCESSFUL}{8294-SUCCESSFUL}{8295-SUCCESSFUL}{8296-SUCCESSFUL}
0427 10:04:55.735+0000 {12772} INFO [m-worker-exec slot-Task:id=10418,env=12772,type=55][c.s.c.d.i.OutputDataExtractor ] Started consuming output for identifier: PRIMARY_DATASET
0427 10:04:56.040+0000 {12772} INFO [m-worker-exec slot-Task:id=10418,env=12772,type=55][c.s.c.d.i.OutputDataExtractor ] Done consuming output for identifier: PRIMARY_DATASET
0427 10:04:50.656+0000 {12772} INFO [m-worker-exec slot-Task:id=10418,env=12772,type=55][c.s.c.d.jobs.JobServerJobManager] Submitted job with Job id 85c6b503-6181-4744-815c-03a1febbbc7b
0427 10:04:55.734+0000 {12772} INFO [m-worker-exec slot-Task:id=10418,env=12772,type=55][c.s.c.d.jobs.JobServerJobManager] Done killing job with id 85c6b503-6181-4744-815c-03a1febbbc7b

The required output is now:

Task id,start time,endtime
10418 consuming,10:04:55.735+0000,10:04:56.040+0000
10418 Submitted,0427 10:04:50.656+0000,0427 10:04:55.734+0000
Re^10: Log Parsing by poj (Abbot) on May 10, 2017 at 12:32 UTC
Try

#!/usr/bin/perl
use strict;

my $infile = 'worker.log';
open IN,'<',$infile or die "$!";

# input
my %data = ();
while (<IN>) {
  chomp;
  next unless /Task:id=(\d+)/;

  my $taskid = $1;
  my (undef,$timestamp,undef) = split /\s+/,$_,3;

  if (/(Started|Submitted|Done).*(consuming|job)/){
    my $i = ($1 eq 'Done') ? 1 : 0;
    my $proc = ($2 eq 'consuming') ? $2 : 'Submitted';
    $data{$taskid.' '.$proc}[$i] = $timestamp;
  }
}
close IN;

# output
print join ',',('Task id','Start time',"End time\n");
for my $id (reverse sort keys %data){
  printf "%s,%s,%s\n",$id,@{$data{$id}};
}

poj
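
As a quick check, here is a sketch that runs the same matching logic over the four sample lines from the request, held in an array instead of read from worker.log. The "+" wrap markers are assumed to be display artifacts and have been removed from the lines; the names @lines and @out are only for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample lines from the request, "+" wrap markers removed (assumption).
my @lines = (
  '0427 10:04:55.735+0000 {12772} INFO [m-worker-exec slot-Task:id=10418,env=12772,type=55][c.s.c.d.i.OutputDataExtractor ] Started consuming output for identifier: PRIMARY_DATASET',
  '0427 10:04:56.040+0000 {12772} INFO [m-worker-exec slot-Task:id=10418,env=12772,type=55][c.s.c.d.i.OutputDataExtractor ] Done consuming output for identifier: PRIMARY_DATASET',
  '0427 10:04:50.656+0000 {12772} INFO [m-worker-exec slot-Task:id=10418,env=12772,type=55][c.s.c.d.jobs.JobServerJobManager] Submitted job with Job id 85c6b503-6181-4744-815c-03a1febbbc7b',
  '0427 10:04:55.734+0000 {12772} INFO [m-worker-exec slot-Task:id=10418,env=12772,type=55][c.s.c.d.jobs.JobServerJobManager] Done killing job with id 85c6b503-6181-4744-815c-03a1febbbc7b',
);

my %data;
for (@lines) {
  next unless /Task:id=(\d+)/;
  my $taskid = $1;
  # second whitespace-separated field is the time of day
  my (undef, $timestamp) = split /\s+/, $_, 3;

  if (/(Started|Submitted|Done).*(consuming|job)/) {
    my $i    = ($1 eq 'Done') ? 1 : 0;                  # 0 = start, 1 = end
    my $proc = ($2 eq 'consuming') ? $2 : 'Submitted';  # "job" lines go under Submitted
    $data{"$taskid $proc"}[$i] = $timestamp;
  }
}

# reverse string sort puts "consuming" before "Submitted"
my @out = map { join ',', $_, @{ $data{$_} } } reverse sort keys %data;
print "$_\n" for @out;
```

Because the timestamp is taken from the second field only, both rows carry the time of day without the 0427 date prefix that appears in the requested Submitted row.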
Re^11: Log Parsing by piyushmnnit06 (Novice) on May 11, 2017 at 07:54 UTC
Re^12: Log Parsing by poj (Abbot) on May 11, 2017 at 09:01 UTC
Some notes below your chosen depth have not been shown here