I'm working with Perl and MongoDB but I have a direct question about querying so I hope someone here can help. I have created the following report with Perl:
{
  "name": "test1",
  "all_data": [
    {
      "sub_data": [
        {
          "sub_name": "Test1",
          "sub_path": "GROUP1/Test1",
          "info": [
            { "group": "pkgs", "values": [ "tcsh" ] },
            { "group": "tcsh", "values": [ "6.13.00" ] }
          ]
        },
        {
          "sub_name": "GROUP2",
          "sub_path": "GROUP2",
          "info": [
            { "group": "pkgs", "values": [ "tcsh" ] },
            { "group": "tcsh", "values": [ "6.13.00" ] }
          ]
        }
      ],
      "all_data_name": "ROOT",
      "all_data_path": "/PATH/TO/ROOT"
    }
  ],
  "username": "erwerwcsd",
  "timestamp": "1564475903"
}
As you can see, the all_data level is an array of objects, each of which contains a sub_data array plus the all_data_name and all_data_path fields.
The sub_data level is an array of objects, each of which contains sub_name, sub_path and an info array.
My goal is to create a query which gets all reports with the name "test1" and the username "erwerwcsd" (I guess we need to use $match). Then I want to combine those reports in the following way:
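A minimal sketch of that first selection step, assuming the field names from the sample report above. The helper name build_report_pipeline is made up for illustration; it only builds the pipeline as a plain Perl data structure, so it runs without a database connection.

```perl
use strict;
use warnings;

# Build an aggregation pipeline that selects the reports for one name and
# username and returns them oldest first. build_report_pipeline is a
# hypothetical helper, not part of any driver.
sub build_report_pipeline {
    my ($name, $username) = @_;
    return [
        { '$match' => { name => $name, username => $username } },
        { '$sort'  => { timestamp => 1 } },
    ];
}

my $pipeline = build_report_pipeline('test1', 'erwerwcsd');

# With the MongoDB Perl driver the pipeline would then be handed over as,
# for example (assuming $collection is a MongoDB::Collection handle):
#   my @reports = $collection->aggregate($pipeline)->all;
```

Note that timestamp is stored as a string in the sample report, so $sort compares it as a string; that gives the right order only while all timestamps have the same number of digits.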
Merge all reports (implementation could be different) into one main report and remove duplicates by the timestamp so that only the latest blocks remain.
In order to explain it, I will use the following example (I marked the levels as all_data<index> and sub_data<index>):

First report:
all_data1:
  sub_data1:
    sub_name: sub1
    sub_path: path/to/sub1
    info: { group = "ABC", version = "4.2.1" }
  all_data_name: ROOT
  all_data_path: /PATH/TO/ROOT
Second report:
all_data1:
  sub_data1:
    sub_name: sub2
    sub_path: path/to/sub2
    info: { group = "ABC", version = "1.5.6","4.2.1" }
  all_data_name: ROOT
  all_data_path: /PATH/TO/ROOT
Third report:
all_data1:
  sub_data1:
    sub_name: sub1
    sub_path: path/to/sub1
    info: { group = "ABC", version = "1.5.6","4.2.1" }
  all_data_name: ROOT
  all_data_path: /PATH/TO/ROOT
Fourth report:
all_data1:
  sub_data1:
    sub_name: sub1
    sub_path: path/to/sub1
    info: { group = "XYZ", version = "1.5.6","4.2.1" }
  all_data_name: ROOT
  all_data_path: /PATH/TO/ROOT
Fifth report:
all_data1:
  sub_data1:
    sub_name: sub1
    sub_path: path/to/sub1
    info: { group = "XYZ", version = "1.5.6","4.2.1" }
  all_data_name: ROOT_OTHER
  all_data_path: /PATH/TO/ROOT
Then the merge will be as follows:
Merge of first and second: (Explanation: they have the same all_data_name and all_data_path but not the same sub_name and sub_path)
all_data1:
  sub_data1:
    sub_name: sub1
    sub_path: path/to/sub1
    info: { group = "ABC", version = "4.2.1" }
  sub_data2:
    sub_name: sub2
    sub_path: path/to/sub2
    info: { group = "ABC", version = "1.5.6","4.2.1" }
  all_data_name: ROOT
  all_data_path: /PATH/TO/ROOT
Merge of first and third: (Explanation: the result is the same as the first report because we take the latest. In this case they have the same all_data, the same sub_data and the same info level)
all_data1:
  sub_data1:
    sub_name: sub1
    sub_path: path/to/sub1
    info: { group = "ABC", version = "4.2.1" }
  all_data_name: ROOT
  all_data_path: /PATH/TO/ROOT
Merge of first and fourth: (Explanation: in this case they have the same all_data and the same sub_data, but not the same info level)
all_data1:
  sub_data1:
    sub_name: sub1
    sub_path: path/to/sub1
    info: { group = "XYZ", version = "1.5.6","4.2.1" },{ group = "ABC", version = "4.2.1" }
  all_data_name: ROOT
  all_data_path: /PATH/TO/ROOT
Merge of first and fifth: (Explanation: they have different all_data_name values)
all_data1:
  sub_data1:
    sub_name: sub1
    sub_path: path/to/sub1
    info: { group = "ABC", version = "4.2.1" }
  all_data_name: ROOT
  all_data_path: /PATH/TO/ROOT
all_data2:
  sub_data1:
    sub_name: sub1
    sub_path: path/to/sub1
    info: { group = "XYZ", version = "1.5.6","4.2.1" }
  all_data_name: ROOT_OTHER
  all_data_path: /PATH/TO/ROOT
Because of the multiple levels of nesting, it does not feel efficient to just iterate over each block, and I'm also not sure which query operators I should use for that.
I'm looking for a way of combining those reports into one main report (at least just to understand the logic). I hope my question is readable and not too hard to understand (I tried to show all possible cases).
Thank you.

EDIT: I now understand that I should not do these operations on the Mongo side, and that it is better to just fetch the needed reports and build the wanted report in Perl.
So I will get all the reports and put them into a hash. Then I would have to iterate through the first-level and second-level arrays (all_data and sub_data). Those arrays can be very big, so I feel this is not very efficient.
I would love to hear some suggestions on how to look at this problem, ideally an interesting and efficient way.
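One possible way to do the merge in Perl without scanning the arrays repeatedly is to key every level by its identifying fields, so each block is found with a hash lookup. The following is only a sketch under assumptions: merge_reports is a made-up name, all_data blocks are identified by all_data_name plus all_data_path, sub_data blocks by sub_name plus sub_path, info entries by group, and the report with the larger timestamp wins when the same group appears twice.

```perl
use strict;
use warnings;

# Hypothetical helper: merge already-fetched reports (hashrefs shaped like the
# sample document) into one all_data array. Assumption: later timestamp wins.
sub merge_reports {
    my @reports = @_;
    my (%merged, @all_order);

    # Process oldest first so that newer reports overwrite older entries.
    for my $report (sort { $a->{timestamp} <=> $b->{timestamp} } @reports) {
        for my $all (@{ $report->{all_data} || [] }) {
            my $all_key = join "\0", $all->{all_data_name}, $all->{all_data_path};
            unless ($merged{$all_key}) {
                push @all_order, $all_key;
                $merged{$all_key} = {
                    all_data_name => $all->{all_data_name},
                    all_data_path => $all->{all_data_path},
                    subs          => {},
                    sub_order     => [],
                };
            }
            my $slot = $merged{$all_key};

            for my $sub (@{ $all->{sub_data} || [] }) {
                my $sub_key = join "\0", $sub->{sub_name}, $sub->{sub_path};
                unless ($slot->{subs}{$sub_key}) {
                    push @{ $slot->{sub_order} }, $sub_key;
                    $slot->{subs}{$sub_key} = {
                        sub_name => $sub->{sub_name},
                        sub_path => $sub->{sub_path},
                        info     => {},
                    };
                }
                # One entry per group; a newer report replaces the values.
                my $info = $slot->{subs}{$sub_key}{info};
                $info->{ $_->{group} } = $_->{values} for @{ $sub->{info} || [] };
            }
        }
    }

    # Convert the lookup hashes back into the array-of-objects layout.
    my @all_data;
    for my $all_key (@all_order) {
        my $slot = $merged{$all_key};
        my @sub_data;
        for my $sub_key (@{ $slot->{sub_order} }) {
            my $sub = $slot->{subs}{$sub_key};
            push @sub_data, {
                sub_name => $sub->{sub_name},
                sub_path => $sub->{sub_path},
                info     => [ map { +{ group => $_, values => $sub->{info}{$_} } }
                              sort keys %{ $sub->{info} } ],
            };
        }
        push @all_data, {
            all_data_name => $slot->{all_data_name},
            all_data_path => $slot->{all_data_path},
            sub_data      => \@sub_data,
        };
    }
    return \@all_data;
}
```

Each block of each report is visited once, so the work grows with the total number of info entries rather than with the product of the array sizes. If the older entry should win instead, sorting the reports newest first is the only change needed.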
Thank you all!

In reply to (OT) Perl and creating a query for MongoDB by ovedpo15
