yasser8@gmail.com has asked for the wisdom of the Perl Monks concerning the following question:

Analytics on Hash Arrays:-

Description of data:

Each server gathers the metrics FC_IO_BY_R and FC_IO_BY_W once every minute. The metric values are cumulative, and the number of servers is unknown (there can be up to 16 servers).

Requirement:-

I need to report, for each server, the difference between the newest and the oldest metric value within each Hour or Day group, so that the result is the actual value for that period rather than the cumulative value. The report should then show, for every Hour and for every Day, the sum of these values across all servers. Please note that the total number of servers is unknown; it can be derived as the number of distinct server names in the first column of the DATA.

How can I get the report below? I know there should be some way to do it using Hash Arrays.

Frequency Hour:
---------------
CollectionTime   FC_IO_BY_R(server01 + server02)   FC_IO_BY_W(server01 + server02)
2015-06-23T21    1050 + 360 = 1410                 115 + 98 = 213
2015-06-23T22    4342 + 2470 = 6812                775 + 826 = 1601

Frequency Day:
---------------
CollectionTime   FC_IO_BY_R(server01 + server02)   FC_IO_BY_W(server01 + server02)
2015-06-23       5623 + 3452 = 9075                984 + 1042 = 2026
2015-06-24

The data below is not complete, but it should be enough to show the pattern; I don't want to clutter this forum. Please also let me know if the requirement is not clear enough.

__DATA__
server01: 2015-06-23T21:58:05-05:00 FC_IO_BY_R 13,550,785 MB
server01: 2015-06-23T21:58:05-05:00 FC_IO_BY_W 6,892,224 MB
server01: 2015-06-23T21:59:05-05:00 FC_IO_BY_R 13,551,835 MB
server01: 2015-06-23T21:59:05-05:00 FC_IO_BY_W 6,892,339 MB
server01: 2015-06-23T22:00:05-05:00 FC_IO_BY_R 13,552,066 MB
server01: 2015-06-23T22:00:05-05:00 FC_IO_BY_W 6,892,433 MB
server01: 2015-06-23T22:01:05-05:00 FC_IO_BY_R 13,553,303 MB
server01: 2015-06-23T22:01:05-05:00 FC_IO_BY_W 6,892,590 MB
server01: 2015-06-23T22:02:05-05:00 FC_IO_BY_R 13,555,006 MB
server01: 2015-06-23T22:02:05-05:00 FC_IO_BY_W 6,892,836 MB
server01: 2015-06-23T22:03:05-05:00 FC_IO_BY_R 13,556,007 MB
server01: 2015-06-23T22:03:05-05:00 FC_IO_BY_W 6,892,961 MB
server01: 2015-06-23T22:04:05-05:00 FC_IO_BY_R 13,556,201 MB
server01: 2015-06-23T22:04:05-05:00 FC_IO_BY_W 6,893,086 MB
server01: 2015-06-23T22:05:05-05:00 FC_IO_BY_R 13,556,408 MB
server01: 2015-06-23T22:05:05-05:00 FC_IO_BY_W 6,893,208 MB
server02: 2015-06-23T21:58:54-05:00 FC_IO_BY_R 13,470,021 MB
server02: 2015-06-23T21:58:54-05:00 FC_IO_BY_W 7,431,544 MB
server02: 2015-06-23T21:59:54-05:00 FC_IO_BY_R 13,470,381 MB
server02: 2015-06-23T21:59:54-05:00 FC_IO_BY_W 7,431,642 MB
server02: 2015-06-23T22:00:54-05:00 FC_IO_BY_R 13,471,003 MB
server02: 2015-06-23T22:00:54-05:00 FC_IO_BY_W 7,431,760 MB
server02: 2015-06-23T22:01:54-05:00 FC_IO_BY_R 13,471,334 MB
server02: 2015-06-23T22:01:54-05:00 FC_IO_BY_W 7,431,980 MB
server02: 2015-06-23T22:02:54-05:00 FC_IO_BY_R 13,471,629 MB
server02: 2015-06-23T22:02:54-05:00 FC_IO_BY_W 7,432,196 MB
server02: 2015-06-23T22:03:54-05:00 FC_IO_BY_R 13,471,947 MB
server02: 2015-06-23T22:03:54-05:00 FC_IO_BY_W 7,432,307 MB
server02: 2015-06-23T22:04:54-05:00 FC_IO_BY_R 13,472,575 MB
server02: 2015-06-23T22:04:54-05:00 FC_IO_BY_W 7,432,418 MB
server02: 2015-06-23T22:05:54-05:00 FC_IO_BY_R 13,473,473 MB
server02: 2015-06-23T22:05:54-05:00 FC_IO_BY_W 7,432,586 MB

Replies are listed 'Best First'.
Re: Analytics on Hash Arrays
by akuk (Beadle) on Jun 24, 2015 at 14:07 UTC

    Hi. The requirements are not clear enough, but I can help you build a prototype of a nested hash from the above data.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my $file = "File_Path";
    open my $fh, '<', $file or die "Can't open $file: $!\n";

    my %hash;
    while (my $line = <$fh>) {
        # Fields: server, timestamp, metric, value, unit
        my @array = split ' ', $line;
        next unless @array >= 4;    # skip blank or short lines
        $hash{ $array[0] }{ $array[1] }{ $array[2] } = $array[3];
    }
    close $fh;

    print Dumper(\%hash);

    # Hash created in the format
    # 'server01:' => { '2015-06-23T21:58:05-05:00' => { FC_IO_BY_R => '13,550,785' } }

    Now that you have the hash, if you want to subtract the previous value you might have to use an offset. You may have to save the previous value in a file or somewhere in a database. I hope it helps!
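As a rough sketch of the "subtract the previous value" idea: the previous reading can also be held in an ordinary variable while walking the timestamps in sorted order, so nothing needs to be stored outside the program. The `%reading` sample and the `deltas` helper are made up for this illustration; the numbers are server01's FC_IO_BY_R values from the question with the commas removed.

```perl
use strict;
use warnings;

# Hypothetical sample: cumulative FC_IO_BY_R readings for one server,
# keyed by timestamp (values from the posted data, commas removed).
my %reading = (
    '2015-06-23T21:58:05-05:00' => 13550785,
    '2015-06-23T21:59:05-05:00' => 13551835,
    '2015-06-23T22:00:05-05:00' => 13552066,
);

# deltas() is a name invented for this sketch. It returns one
# [timestamp, delta] pair for every reading after the first.
sub deltas {
    my ($by_time) = @_;
    my ($prev, @out);
    # ISO 8601 timestamps with the same UTC offset sort correctly as strings.
    for my $ts (sort keys %$by_time) {
        push @out, [ $ts, $by_time->{$ts} - $prev ] if defined $prev;
        $prev = $by_time->{$ts};
    }
    return @out;
}

printf "%s %d\n", @$_ for deltas(\%reading);
```

With real data this would be called once per server and metric, since each server's counter is cumulative on its own.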

      Thanks for the help. Let me try to clarify what I am trying to achieve..

      my (%h, %m);
      while (<DATA>) {
          next unless /\w/;
          my ($server, $datetime, $metric, $value) = (split)[0,1,2,3];
          my $ddhh = substr $datetime, 0, 13;   # hour group, e.g. 2015-06-23T21
          my $dd   = substr $datetime, 0, 10;   # day group, e.g. 2015-06-23
          $h{$dd}{$metric}{max}   = $value;  # How can I assign the maximum value within the group $dd and $metric?
          $h{$dd}{$metric}{min}   = $value;  # How can I assign the minimum value within the group $dd and $metric?
          $m{$ddhh}{$metric}{max} = $value;  # How can I assign the maximum value within the group $ddhh and $metric?
          $m{$ddhh}{$metric}{min} = $value;  # How can I assign the minimum value within the group $ddhh and $metric?
      }

      I can't use a file or a database for storage; I mean I can't rely on anything outside the Perl program. My idea may be a poor one, as I am very new to Perl. Please help.

        See the ternary conditional operator for one solution.

        $max = ($x < $max) ? $max : $x; # similar for min

        or the postfix if:

        $max = $x if ($x > $max);

        Update: you may also want to pay attention to the initial value if it is not defined. An undefined value is treated as 0 in a numeric comparison, so your min may always stay at 0 unless you have values that are less than 0.
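A minimal sketch of that advice, assuming the values have their thousands separators stripped before they are compared: seed min and max from the first value seen instead of relying on undef. The `track` helper and the three sample values are only for illustration. With the real data the group would also need the server name as a key, because each server has its own cumulative counter.

```perl
use strict;
use warnings;

# track() is a hypothetical helper: it updates the {min} and {max}
# slots of one group, seeding both from the first value it sees.
sub track {
    my ($slot, $value) = @_;
    $slot->{min} = $value if !defined $slot->{min} || $value < $slot->{min};
    $slot->{max} = $value if !defined $slot->{max} || $value > $slot->{max};
    return $slot;
}

my %h;
for my $raw ('13,550,785', '13,551,835', '13,552,066') {
    (my $value = $raw) =~ tr/,//d;    # "13,550,785" -> 13550785
    track($h{'2015-06-23'}{FC_IO_BY_R} //= {}, $value);
}

my $g = $h{'2015-06-23'}{FC_IO_BY_R};
printf "min=%d max=%d diff=%d\n", $g->{min}, $g->{max}, $g->{max} - $g->{min};
```

The `//=` (defined-or assignment) operator needs Perl 5.10 or later.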

        --MidLifeXis

Re: Analytics on Hash Arrays
by Anonymous Monk on Jun 24, 2015 at 13:55 UTC

    Please show the code you've tried - How do I post a question effectively? Normally, PerlMonks is not a code writing service, and the more effort you put into asking a question, the more help you get.

    You said before "I am working DBA for more than 12 Years, I can meet this requirement in no time if this was in SQL." At the very least, could you describe how you would solve this requirement in SQL?

      In SQL, an analytic function would meet the requirement, something like the one shown below:

      value-lag(value) over (order by CollectionTime,metric)

      My problem is with assigning the values to a hash keyed by time, and then subtracting the oldest value from the latest value.

      I have no idea how to do it in Perl; some hints or suggestions would be very helpful to me.
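One possible way to do "latest minus oldest" per group in plain Perl, sketched under two assumptions: the lines arrive in time order for each server, and the thousands separators can simply be stripped. The `report` helper and its `$len` parameter are invented for this example. For every bucket, metric and server it keeps the first and last value seen, then sums the differences across servers.

```perl
use strict;
use warnings;

# report() is a hypothetical helper. $len is the number of leading
# timestamp characters that form the group key: 13 = hour, 10 = day.
sub report {
    my ($len, @lines) = @_;
    my (%first, %last, %total);
    for my $line (@lines) {
        my ($server, $ts, $metric, $value) = split ' ', $line;
        next unless defined $value;
        $value =~ tr/,//d;                      # "13,550,785" -> 13550785
        my $bucket = substr $ts, 0, $len;
        # Assumes time order per server: the first value seen in a
        # bucket is the oldest one, the last value seen is the newest.
        $first{$bucket}{$metric}{$server} //= $value;
        $last{$bucket}{$metric}{$server}    = $value;
    }
    for my $bucket (keys %first) {
        for my $metric (keys %{ $first{$bucket} }) {
            for my $server (keys %{ $first{$bucket}{$metric} }) {
                $total{$bucket}{$metric} += $last{$bucket}{$metric}{$server}
                                          - $first{$bucket}{$metric}{$server};
            }
        }
    }
    return \%total;
}

my @sample = (
    'server01: 2015-06-23T21:58:05-05:00 FC_IO_BY_R 13,550,785 MB',
    'server01: 2015-06-23T21:59:05-05:00 FC_IO_BY_R 13,551,835 MB',
    'server02: 2015-06-23T21:58:54-05:00 FC_IO_BY_R 13,470,021 MB',
    'server02: 2015-06-23T21:59:54-05:00 FC_IO_BY_R 13,470,381 MB',
);

my $hourly = report(13, @sample);
for my $bucket (sort keys %$hourly) {
    printf "%s %s %d\n", $bucket, $_, $hourly->{$bucket}{$_}
        for sort keys %{ $hourly->{$bucket} };
}
# prints "2015-06-23T21 FC_IO_BY_R 1410" (1050 + 360, as in the requested report)
```

Calling `report(10, @lines)` groups by day instead. The number of servers never has to be known in advance, because the server names are just hash keys. A counter that resets within a bucket would give a negative difference; that case is not handled here.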

Re: Analytics on Hash Arrays
by Anonymous Monk on Jun 24, 2015 at 13:50 UTC
    Can you tell me the budget you have for me to do this job?