Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I want to parse some data from a file that contains multiple occurrences of a paragraph like the one below.

Sample paragraph from the file:

Launching series_name (started Sat Jan 11 7:31:47 PST 2020)
SERIES RESULT (series_name) : Pass: 2/0/0
series_name (finished Sat Jan 11 7:31:58 PST 2020)
series_name Run Duration: 0d 00:00:11

Expected output:

Series   Duration
series_name   0d 00:00:11

After I get all the different series names with their corresponding durations, I need to sort them in ascending order and write them to a new file.

Below is what I have been able to achieve:

#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

my $filename = 'dfttest.log';
open(FH, '<', $filename) or die $!;
my $keyword;
while(<FH>){
    $keyword = $_;
    if ($_ =~ /^Launching\s*(.*?)\s*\(.*$/) {
        print "\n".$1;
    }
}
close(FH);

Replies are listed 'Best First'.
Re: i am trying to parse set of data from a file, sorting that data and writing to a new file
by choroba (Cardinal) on Feb 04, 2020 at 12:28 UTC
    Store the durations in a hash keyed by the ids.

    #!/usr/bin/perl
    use strict;
    use warnings;

    use List::Util qw{ first };

    my %duration;
    my $id;
    while (<DATA>) {
        chomp;
        $id = $1 if /Launching (\S+)/;
        if (my ($days, $hours, $minutes, $seconds)
            = /Run Duration: (.*)d +(.*):(.*):(.*)/
        ) {
            $duration{$id} = [$days, $hours, $minutes, $seconds];
        }
    }

    for my $id (sort {
                    my $d = first {
                        $duration{$a}[$_] <=> $duration{$b}[$_]
                    } 0 .. 3;
                    $duration{$a}[$d] <=> $duration{$b}[$d]
                } keys %duration
    ) {
        print "$id\t";
        print $duration{$id}[0], 'd';
        print join ':', @{ $duration{$id} }[1, 2, 3];
        print "\n";
    }

    __DATA__
    Launching series_eleven (started Sat Jan 11 7:31:47 PST 2020)
    SERIES RESULT (series_eleven) : Pass: 2/0/0
    series_eleven (finished Sat Jan 11 7:31:58 PST 2020)
    series_eleven Run Duration: 0d 00:00:11
    Launching series_twenty (started Sat Jan 11 7:31:47 PST 2020)
    SERIES RESULT (series_twenty) : Pass: 2/0/0
    series_twenty (finished Sat Jan 11 7:31:58 PST 2020)
    series_twenty Run Duration: 0d 00:00:20
    Launching series_minute (started Sat Jan 11 7:31:47 PST 2020)
    SERIES RESULT (series_minute) : Pass: 2/0/0
    series_minute (finished Sat Jan 11 7:31:58 PST 2020)
    series_minute Run Duration: 0d 00:01:00
    Launching series_hour (started Sat Jan 11 7:31:47 PST 2020)
    SERIES RESULT (series_hour) : Pass: 2/0/0
    series_hour (finished Sat Jan 11 7:31:58 PST 2020)
    series_hour Run Duration: 0d 01:00:00
    Launching series_day (started Sat Jan 11 7:31:47 PST 2020)
    SERIES RESULT (series_day) : Pass: 2/0/0
    series_day (finished Sat Jan 11 7:31:58 PST 2020)
    series_day Run Duration: 1d 00:00:00
    Launching series_week (started Sat Jan 11 7:31:47 PST 2020)
    SERIES RESULT (series_week) : Pass: 2/0/0
    series_week (finished Sat Jan 11 7:31:58 PST 2020)
    series_week Run Duration: 7d 00:00:00
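A possible variation on the field-by-field comparison, offered only as a sketch: collapse each duration into a single number of seconds and sort on that. The helper name duration_to_seconds and the three sample entries are illustrative assumptions, not part of the reply above. One side effect is that two series with identical durations compare as equal and fall through to a name comparison, rather than relying on first finding a differing field.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): turn "1d 02:03:04" into seconds.
# Returns nothing if the text does not look like a duration.
sub duration_to_seconds {
    my ($text) = @_;
    my ($d, $h, $m, $s) = $text =~ /^(\d+)d\s+(\d+):(\d+):(\d+)$/
        or return;
    return (($d * 24 + $h) * 60 + $m) * 60 + $s;
}

# Sample durations in the format shown in the question (assumed data).
my %duration = (
    series_day    => '1d 00:00:00',
    series_eleven => '0d 00:00:11',
    series_hour   => '0d 01:00:00',
);

# Sort ids by numeric duration, breaking ties by series name.
my @sorted = sort {
    duration_to_seconds($duration{$a}) <=> duration_to_seconds($duration{$b})
        or $a cmp $b
} keys %duration;

print "$_\t$duration{$_}\n" for @sorted;
```

Keeping the original duration string in the hash means the output needs no reassembly; the numeric value is used only as the sort key.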
    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]
Re: i am trying to parse set of data from a file, sorting that data and writing to a new file
by tybalt89 (Monsignor) on Feb 04, 2020 at 20:36 UTC
    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11112358
    use warnings;

    my @answers;
    /(\S+).*?Duration: (.*)/ and push @answers, "$1 $2\n" while <DATA>;
    print "Series Duration\n", sort @answers;

    __DATA__
    Launching series_eleven (started Sat Jan 11 7:31:47 PST 2020)
    SERIES RESULT (series_eleven) : Pass: 2/0/0
    series_eleven (finished Sat Jan 11 7:31:58 PST 2020)
    series_eleven Run Duration: 0d 00:00:11
    Launching series_twenty (started Sat Jan 11 7:31:47 PST 2020)
    SERIES RESULT (series_twenty) : Pass: 2/0/0
    series_twenty (finished Sat Jan 11 7:31:58 PST 2020)
    series_twenty Run Duration: 0d 00:00:20
    Launching series_minute (started Sat Jan 11 7:31:47 PST 2020)
    SERIES RESULT (series_minute) : Pass: 2/0/0
    series_minute (finished Sat Jan 11 7:31:58 PST 2020)
    series_minute Run Duration: 0d 00:01:00
    Launching series_hour (started Sat Jan 11 7:31:47 PST 2020)
    SERIES RESULT (series_hour) : Pass: 2/0/0
    series_hour (finished Sat Jan 11 7:31:58 PST 2020)
    series_hour Run Duration: 0d 01:00:00
    Launching series_day (started Sat Jan 11 7:31:47 PST 2020)
    SERIES RESULT (series_day) : Pass: 2/0/0
    series_day (finished Sat Jan 11 7:31:58 PST 2020)
    series_day Run Duration: 1d 00:00:00
    Launching series_week (started Sat Jan 11 7:31:47 PST 2020)
    SERIES RESULT (series_week) : Pass: 2/0/0
    series_week (finished Sat Jan 11 7:31:58 PST 2020)
    series_week Run Duration: 7d 00:00:00

    Outputs:

    Series Duration
    series_day 1d 00:00:00
    series_eleven 0d 00:00:11
    series_hour 0d 01:00:00
    series_minute 0d 00:01:00
    series_twenty 0d 00:00:20
    series_week 7d 00:00:00

    Guessing on alignment since you didn't put expected output in code tags :(
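The plain sort above is a string sort, so the rows come out ordered by series name. If the goal is ascending order by duration, written to a new file as the question asks, something along these lines might work. It is a sketch under assumptions: the sub name write_sorted_durations, the 20-character column width, and the in-memory filehandles used for the demo are all illustrative choices, not anything from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read log lines from $in, write a table sorted by
# duration (shortest first) to $out. Returns the number of series found.
sub write_sorted_durations {
    my ($in, $out) = @_;
    my @rows;
    while (my $line = <$in>) {
        # Matches lines like "series_name Run Duration: 0d 00:00:11"
        next unless $line =~ /^(\S+) Run Duration: ((\d+)d\s+(\d+):(\d+):(\d+))/;
        my $seconds = (($3 * 24 + $4) * 60 + $5) * 60 + $6;
        push @rows, [$1, $2, $seconds];    # name, original text, sort key
    }
    printf {$out} "%-20s %s\n", 'Series', 'Duration';
    printf {$out} "%-20s %s\n", @{$_}[0, 1]
        for sort { $a->[2] <=> $b->[2] or $a->[0] cmp $b->[0] } @rows;
    return scalar @rows;
}

# Demo with in-memory filehandles so the sketch is self-contained.
my $log = <<'END';
Launching series_hour (started Sat Jan 11 7:31:47 PST 2020)
series_hour Run Duration: 0d 01:00:00
Launching series_eleven (started Sat Jan 11 7:31:47 PST 2020)
series_eleven Run Duration: 0d 00:00:11
END

open my $in,  '<', \$log       or die "Cannot read sample log: $!";
open my $out, '>', \my $report or die "Cannot open report buffer: $!";
write_sorted_durations($in, $out);
close $out;
print $report;
```

With real files, the two open calls would take the log file name (dfttest.log in the question) with '<' and the name of the new output file with '>' instead of the scalar references.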