
Thanks. I tried if (/(Started|Done).+topic (\d+_(\d+)_\d+_IN_[0-1])/), but it only works for either 0 or 1, and I want it to work for both.

Re^5: Log Parsing
by poj (Abbot) on Apr 03, 2017 at 06:58 UTC

    Show an example of the line that it is not working for

      0330 07:36:31.923+0000 {12772} INFO [m-worker-exec slot-Task:id=10030,env=12772,type=11][c.s.w.t.f.s.PostExecutionStage ] Loaded {child runId vs completion type}: {10034-SUCCESSFUL}{10035-SUCCESSFUL}{10036-SUCCESSFUL}
      0330 07:10:01.366+0000 {12772} INFO [m-worker-exec slot-Task:id=10034,env=12772,type=55][edProcessInputBatchKafkaProducer] Started producing records on topic 12772_10034_20170330_IN_0
      0330 07:17:42.473+0000 {12772} INFO [m-worker-exec slot-Task:id=10034,env=12772,type=55][edProcessInputBatchKafkaProducer] Done with producing records on topic 12772_10034_20170330_IN_0 - produced 10000000 records
      0330 07:17:42.480+0000 {12772} INFO [m-worker-exec slot-Task:id=10034,env=12772,type=55][edProcessInputBatchKafkaProducer] Started producing records on topic 12772_10034_20170330_IN_1
      0330 07:17:45.033+0000 {12772} INFO [m-worker-exec slot-Task:id=10034,env=12772,type=55][edProcessInputBatchKafkaProducer] Done with producing records on topic 12772_10034_20170330_IN_1 - produced 100000 records
      It does not give the times for both 12772_10034_20170330_IN_0 and 12772_10034_20170330_IN_1.

        To get multiple records for a child runID, use a hash of arrays (HoA): one array per child runID, indexed by the digit after _IN_.

        #!/usr/bin/perl
        use strict;
        #use Data::Dump 'pp';

        my %data = ();       # child runID => [ { Topic, Started, Done }, ... ]
        my @id = ();         # child runIDs reported by the requested task
        my $reqid = '8274';

        my $infile = 'worker.log';
        my $outfile = 'data.CSV';
        open IN,'<',$infile or die "$!";

        # input
        while (<IN>) {
          chomp;
          next unless /Task:id=(\d+)/;
          my $taskid = $1;

          # second whitespace-separated field is the time
          my (undef,$timestamp,undef) = split /\s+/,$_,3;

          # $1 = Started|Done, $2 = topic and rest of line,
          # $3 = child runID, $4 = digit after _IN_
          if (/(Started|Done).+topic (\d+_(\d+)_\d+_IN_(\d).*)/){
            #print "$1 $2 $3 $4\n";
            $data{$3}[$4]{$1} = $timestamp;
            $data{$3}[$4]{'Topic'} = $2;
          }

          # collect child runIDs listed on lines of the requested task
          while ( /\{(\d+)-SUCCESSFUL\}/g ){
            push @id,$1 if ($taskid eq $reqid);
          }
        }
        close IN;
        #pp \@id;
        #pp \%data;

        # output
        open OUT,'>',$outfile or die "$!";
        my @cols = qw(Topic Started Done);
        printf OUT "%s,%s,%s\n",@cols;
        for my $id (sort @id){
          if (exists $data{$id}){
            # one CSV row per topic of this child runID
            for my $rec ( @{$data{$id}} ){
              printf OUT "%s,%s,%s\n", map { $rec->{$_} } @cols;
            }
          }else {
            print OUT "$id- no data\n";
          }
        }
        close OUT;
        poj
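
A minimal sketch of why the extra array level matters, assuming the earlier attempt stored the timestamps keyed by child runID only (that script is not shown in this thread, so %flat below is a guess at its shape). The sample lines are shortened copies of the log excerpt above, and the names %flat, %hoa, $event, $child and $n are made up for illustration. The pattern already matches both _IN_0 and _IN_1; what differs is whether the second topic overwrites the first.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortened copies of the four "topic" lines from the log excerpt
# (the bracketed task/class fields are replaced by [x]).
my @lines = (
    '0330 07:10:01.366+0000 {12772} INFO [x] Started producing records on topic 12772_10034_20170330_IN_0',
    '0330 07:17:42.473+0000 {12772} INFO [x] Done with producing records on topic 12772_10034_20170330_IN_0 - produced 10000000 records',
    '0330 07:17:42.480+0000 {12772} INFO [x] Started producing records on topic 12772_10034_20170330_IN_1',
    '0330 07:17:45.033+0000 {12772} INFO [x] Done with producing records on topic 12772_10034_20170330_IN_1 - produced 100000 records',
);

my (%flat, %hoa);
for my $line (@lines) {
    # The second whitespace-separated field is the time.
    my (undef, $timestamp) = split /\s+/, $line, 3;

    next unless $line =~ /(Started|Done).+topic (\d+_(\d+)_\d+_IN_([01]))/;
    my ($event, $topic, $child, $n) = ($1, $2, $3, $4);

    # Assumed earlier shape: keyed by child runID only,
    # so the _IN_1 lines overwrite the _IN_0 lines.
    $flat{$child}{$event} = $timestamp;
    $flat{$child}{Topic}  = $topic;

    # Hash of arrays: keyed by child runID, then by the digit after _IN_,
    # so each topic keeps its own record.
    $hoa{$child}[$n]{$event} = $timestamp;
    $hoa{$child}[$n]{Topic}  = $topic;
}

print "flat keeps only: $flat{10034}{Topic}\n";
for my $rec (@{ $hoa{10034} }) {
    print join(',', @{$rec}{qw(Topic Started Done)}), "\n";
}
```

With the flat hash only the _IN_1 record is left for child runID 10034, while the hash of arrays yields one Topic,Started,Done row per topic.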