So I have an array of hashes like this one:
{
  'number_of_subs' => 5,
  'report_map' => 'test5',
  'number_of_mains' => 2,
  'ystem_start_time' => '1564574677317',
  'system_id' => '453521412412',
  'timestamp' => 1564574676,
  'user' => 'asdasvb',
  'mains' => [
    {
      'main_path' => 'play_ground/MAIN',
      'subs' => [
        {
          'info' => [
            { 'version' => [ 'tcsh' ], 'group' => 'pkgs' },
            { 'version' => [ '6.13.00' ], 'group' => 'tcsh' }
          ],
          'sub_path' => 'GROUP1/Test1',
          'sub_name' => 'Test1'
        },
        {
          'data' => [
            { 'version' => [ 'tcsh' ], 'group' => 'pkgs' },
            { 'version' => [ '6.13.00' ], 'group' => 'tcsh' }
          ],
          'sub_path' => 'GROUP2',
          'sub_name' => 'GROUP2'
        },
        {
          'info' => [
            { 'version' => [ '3.14' ], 'group' => 'A' },
            { 'version' => [ '2.56' ], 'group' => 'B' },
            { 'version' => [ '6.13.00', '6.14.00' ], 'group' => 'C' }
          ],
          'sub_path' => 'Test1',
          'sub_name' => 'Test1'
        }
      ],
      'main_name' => 'MAIN'
    },
    {
      'main_path' => 'play_ground/MAIN1',
      'subs' => [
        {
          'info' => [
            { 'version' => [ 'tcsh' ], 'group' => 'pkgs' },
            { 'version' => [ '6.13.00' ], 'group' => 'tcsh' }
          ],
          'sub_path' => 'TEST2/SUB1',
          'sub_group' => 'SUB1'
        },
        {
          'info' => [
            { 'version' => [ 'tcsh' ], 'group' => 'pkgs' },
            { 'version' => [ '6.13.00' ], 'group' => 'tcsh' }
          ],
          'sub_path' => 'TEST2/SUB2',
          'sub_name' => 'SUB2'
        }
      ],
      'main_name' => 'MAIN1'
    }
  ],
}
As you can see, there is a 'mains' level, which contains an array of objects. Each of them has a subs array plus main_name and main_path fields.
'subs' is an array of objects, each of which contains sub_name, sub_path and an info array.
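To make the nesting concrete, here is a minimal sketch that walks one report and lists every main / sub / group combination. The helper name `list_blocks` and the trimmed-down sample report are only for illustration, and the `//` operator assumes Perl 5.10 or newer.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one "main | sub_path | group | versions" line
# for every info entry found in a single report.
sub list_blocks {
    my ($report) = @_;
    my @lines;
    for my $main (@{ $report->{mains} // [] }) {
        for my $sub (@{ $main->{subs} // [] }) {
            # A sub without an 'info' key is treated as having no entries.
            for my $info (@{ $sub->{info} // [] }) {
                push @lines, join ' | ',
                    $main->{main_name},
                    $sub->{sub_path},
                    $info->{group},
                    join(',', @{ $info->{version} });
            }
        }
    }
    return @lines;
}

# Trimmed-down report using values from the sample above.
my $report = {
    timestamp => 1564574676,
    mains     => [ {
        main_name => 'MAIN',
        main_path => 'play_ground/MAIN',
        subs      => [ {
            sub_name => 'Test1',
            sub_path => 'GROUP1/Test1',
            info     => [
                { group => 'pkgs', version => ['tcsh'] },
                { group => 'tcsh', version => ['6.13.00'] },
            ],
        } ],
    } ],
};

print "$_\n" for list_blocks($report);
```

Each printed line corresponds to one block that a merge would have to compare against the other reports.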
I'm trying to build a hash that contains the latest version of every block.
To explain, I will use the following example reports (blocks are labeled main<index> and subs<index>):

First report:
main1:
    subs1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "ABC", version = "4.2.1" }
    main_name: ROOT
    main_path: /PATH/TO/ROOT
Second report:
main1:
    subs1:
        sub_name: sub2
        sub_path: path/to/sub2
        info: { group = "ABC", version = "1.5.6","4.2.1" }
    main_name: ROOT
    main_path: /PATH/TO/ROOT
Third report:
main1:
    subs1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "ABC", version = "1.5.6","4.2.1" }
    main_name: ROOT
    main_path: /PATH/TO/ROOT
Fourth report:
main1:
    subs1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "XYZ", version = "1.5.6","4.2.1" }
    main_name: ROOT
    main_path: /PATH/TO/ROOT
Fifth report:
main1:
    subs1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "XYZ", version = "1.5.6","4.2.1" }
    main_name: ROOT_OTHER
    main_path: /PATH/TO/ROOT
The merges should then be as follows:
Merge of first and second: (Explanation: they have the same main_name and main_path but different sub_name and sub_path)
main1:
    subs1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "ABC", version = "4.2.1" }
    subs2:
        sub_name: sub2
        sub_path: path/to/sub2
        info: { group = "ABC", version = "1.5.6","4.2.1" }
    main_name: ROOT
    main_path: /PATH/TO/ROOT
Merge of first and third: (Explanation: the result is the same as the first report because we take the latest. In this case they have the same main, the same subs and the same info level)
main1:
    subs1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "ABC", version = "4.2.1" }
    main_name: ROOT
    main_path: /PATH/TO/ROOT
Merge of first and fourth: (Explanation: in this case they have the same main and the same subs, but not the same info level)
main1:
    subs1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "XYZ", version = "1.5.6","4.2.1" }, { group = "ABC", version = "4.2.1" }
    main_name: ROOT
    main_path: /PATH/TO/ROOT
Merge of first and fifth: (Explanation: they have a different main_name)
main1:
    subs1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "ABC", version = "4.2.1" }
    main_name: ROOT
    main_path: /PATH/TO/ROOT
main2:
    subs1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "XYZ", version = "1.5.6","4.2.1" }
    main_name: ROOT_OTHER
    main_path: /PATH/TO/ROOT
Each report contains a 'timestamp', so I thought of iterating through all the blocks and taking the latest one, but that does not feel efficient.
I would also have to keep a 'timestamp' field on each block and then remove it at the end (because I need to compare the timestamps on each iteration).
I would love to hear suggestions for an algorithm to approach this problem.
I also tried to solve this from the DB side (link: https://www.perlmonks.org/?node_id=11103616), but I concluded that it's better to fetch the report, convert it to a data structure and parse it.
The idea I had: for each main block, check whether it already exists in the output hash (by comparing main_path and main_name). If it does not, insert it as it is and add the timestamp.
If it does exist, add all the subs blocks that are not already included, and for the blocks that are the same, take the latest one according to the saved timestamp.
This feels like a slow and clumsy algorithm. Any ideas?
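One possible way to avoid the repeated nested scans is to index the blocks in nested hashes keyed by (main_name, main_path), then (sub_name, sub_path), then group. The sketch below is only an illustration under some assumptions: the names `key_of`, `merge_reports` and `flatten_index` are made up, the timestamps in the sample data are invented, "\0" is assumed never to appear in names or paths, and an info entry from a report with a larger timestamp is assumed to replace an older entry of the same group. The timestamp is kept only inside the index, so nothing has to be stripped from the output afterwards.

```perl
use strict;
use warnings;

# Composite hash key; assumes "\0" never occurs in names or paths.
sub key_of { return join "\0", map { $_ // '' } @_ }

# Index layout: main key -> sub key -> group -> { ts, version }.
sub merge_reports {
    my (@reports) = @_;
    my %index;
    for my $report (@reports) {
        my $ts = $report->{timestamp};
        for my $main (@{ $report->{mains} // [] }) {
            my $m = $index{ key_of(@{$main}{qw(main_name main_path)}) } //= {
                main_name => $main->{main_name},
                main_path => $main->{main_path},
                subs      => {},
            };
            for my $sub (@{ $main->{subs} // [] }) {
                my $s = $m->{subs}{ key_of(@{$sub}{qw(sub_name sub_path)}) } //= {
                    sub_name => $sub->{sub_name},
                    sub_path => $sub->{sub_path},
                    groups   => {},
                };
                for my $info (@{ $sub->{info} // [] }) {
                    my $seen = $s->{groups}{ $info->{group} };
                    next if $seen && $seen->{ts} >= $ts;    # keep the newer entry
                    $s->{groups}{ $info->{group} } =
                        { ts => $ts, version => $info->{version} };
                }
            }
        }
    }
    return \%index;
}

# Convert the index back into the 'mains' / 'subs' / 'info' layout.
sub flatten_index {
    my ($index) = @_;
    my @mains;
    for my $mk (sort keys %$index) {
        my $m = $index->{$mk};
        my @subs;
        for my $sk (sort keys %{ $m->{subs} }) {
            my $s    = $m->{subs}{$sk};
            my @info = map { +{ group => $_, version => $s->{groups}{$_}{version} } }
                       sort keys %{ $s->{groups} };
            push @subs,
                { sub_name => $s->{sub_name}, sub_path => $s->{sub_path}, info => \@info };
        }
        push @mains,
            { main_name => $m->{main_name}, main_path => $m->{main_path}, subs => \@subs };
    }
    return { mains => \@mains };
}

# Tiny report constructor for the example; timestamps below are invented.
sub make_report {
    my ($ts, $group, @version) = @_;
    return {
        timestamp => $ts,
        mains     => [ {
            main_name => 'ROOT',
            main_path => '/PATH/TO/ROOT',
            subs      => [ {
                sub_name => 'sub1',
                sub_path => 'path/to/sub1',
                info     => [ { group => $group, version => [@version] } ],
            } ],
        } ],
    };
}

my @reports = (
    make_report(100, 'ABC', '4.2.1'),
    make_report(200, 'ABC', '1.5.6', '4.2.1'),
    make_report(150, 'XYZ', '1.5.6', '4.2.1'),
);
my $merged = flatten_index(merge_reports(@reports));
```

Every block is looked up by key instead of being searched for in a list, so each report is visited once regardless of how many mains and subs have already been collected.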
Thank you all.
EDIT: What I have done so far:
my %output_reports;
foreach my $main (@{ $data->{"mains"} }) {
    my $uniq_main = 1;
    foreach my $new_main (@{ $output_reports{"mains"} }) {
        if (   $main->{"main_path"} eq $new_main->{"main_path"}
            && $main->{"main_name"} eq $new_main->{"main_name"}) {
            $uniq_main = 0;
            my $uniq_sub = 1;
            foreach my $sub (@{ $main->{"subs"} }) {
                foreach my $new_sub (@{ $new_main->{"subs"} }) {
                    if (   $sub->{"sub_path"} eq $new_sub->{"sub_path"}
                        && $sub->{"sub_name"} eq $new_sub->{"sub_name"}) {
                        # Stuck here - I need the timestamp
                    }
                }
            }
        }
    }
    if ($uniq_main) {
        push(@{ $output_reports{"mains"} }, $main);
    }
}
I'm stuck because inside the "subs" loop I need the timestamp, which I don't have.
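One way around the missing timestamp might be to not need it inside the loops at all: if the reports are processed oldest first, a later report can simply overwrite whatever an earlier one stored. The sketch below shows this for a single sub; the helper name `latest_groups` and the timestamps in the sample data are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: latest version list per group for one main/sub pair.
# Reports are visited in ascending timestamp order, so newer data overwrites
# older data and no timestamp is stored or compared inside the loops.
sub latest_groups {
    my ($main_name, $sub_name, @reports) = @_;
    my %latest;
    for my $report (sort { $a->{timestamp} <=> $b->{timestamp} } @reports) {
        for my $main (grep { $_->{main_name} eq $main_name } @{ $report->{mains} // [] }) {
            for my $sub (grep { ($_->{sub_name} // '') eq $sub_name } @{ $main->{subs} // [] }) {
                $latest{ $_->{group} } = $_->{version} for @{ $sub->{info} // [] };
            }
        }
    }
    return \%latest;
}

# Two sample reports, deliberately passed newest first; timestamps are invented.
my @sample = (
    { timestamp => 200, mains => [ { main_name => 'ROOT', subs => [
        { sub_name => 'sub1', info => [ { group => 'ABC', version => [ '1.5.6', '4.2.1' ] } ] } ] } ] },
    { timestamp => 100, mains => [ { main_name => 'ROOT', subs => [
        { sub_name => 'sub1', info => [ { group => 'ABC', version => ['4.2.1'] } ] } ] } ] },
);

my $latest = latest_groups('ROOT', 'sub1', @sample);
print join(',', @{ $latest->{ABC} }), "\n";
```

The same ordering trick should carry over to the loop above: sort the list of reports by 'timestamp' before the outer loop, and inside the innermost `if` replace the matching info entries without any comparison.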

In reply to Parsing output by ovedpo15
