I'm working with Perl and MongoDB, and I have a question about querying that I hope someone here can help with. I created the following report with Perl:

{                                                      
    "name": "test1",                                                  
    "all_data": [                                         
        {                                              
            "sub_data": [                                  
                {                                      
                    "sub_name": "Test1",               
                    "sub_path": "GROUP1/Test1",        
                    "info": [          
                        {                              
                            "group": "pkgs",            
                            "values": [              
                                "tcsh"                 
                            ]                          
                        },                             
                        {                              
                            "group": "tcsh",            
                            "values": [              
                                "6.13.00"              
                            ]                          
                        }                              
                    ]                                  
                },                                     
                {                                      
                    "sub_name": "GROUP2",              
                    "sub_path": "GROUP2",              
                    "info": [          
                        {                              
                            "group": "pkgs",            
                            "values": [              
                                "tcsh"                 
                            ]                          
                        },                             
                        {                              
                            "group": "tcsh",            
                            "values": [              
                                "6.13.00"              
                            ]                          
                        }                              
                    ]                                  
                }
            ],                                         
            "all_data_name": "ROOT",                       
            "all_data_path": "/PATH/TO/ROOT"
        }
    ],
    "username": "erwerwcsd",
    "timestamp": "1564475903"
}

As you can see, the all_data level is an array of objects, each of which contains a sub_data array plus the all_data_name and all_data_path fields.
Each element of sub_data is an object containing sub_name, sub_path and an info array.
My goal is to create a query that gets all reports with the name "test1" and the username "erwerwcsd" (I guess we need to use $match). Then I want to combine those reports in the following way:
Merge all reports (the implementation could be different) into one main report and remove duplicates by timestamp, so that only the latest blocks remain.
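
For the selection step, something like the following might work with the Perl MongoDB driver. It is only a sketch: build_pipeline and fetch_reports are names I made up, and the connection string and namespace in the comments are placeholders. It also assumes the timestamps are all strings of the same length, so sorting them as strings gives chronological order.

```perl
use strict;
use warnings;

# Build an aggregation pipeline that selects reports by name and username,
# oldest first. The keys are single-quoted so Perl does not try to
# interpolate '$match' and '$sort' as variables.
sub build_pipeline {
    my ($name, $username) = @_;
    return [
        { '$match' => { name => $name, username => $username } },
        { '$sort'  => { timestamp => 1 } },
    ];
}

# Run the pipeline on a MongoDB::Collection object and return the documents.
sub fetch_reports {
    my ($coll, $name, $username) = @_;
    return $coll->aggregate( build_pipeline($name, $username) )->all;
}

# Hypothetical usage (connection string and namespace are placeholders):
#   use MongoDB;
#   my $coll    = MongoDB->connect('mongodb://localhost')->ns('mydb.reports');
#   my @reports = fetch_reports($coll, 'test1', 'erwerwcsd');
```
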
To explain the merge, I will use the following example (I marked the blocks as all_data<index> and sub_data<index>):

First report:

all_data1:
    sub_data1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "ABC", version ="4.2.1" }
    all_data_name: ROOT
    all_data_path: /PATH/TO/ROOT

Second report:

all_data1:
    sub_data1:
        sub_name: sub2
        sub_path: path/to/sub2
        info: { group = "ABC", version = "1.5.6","4.2.1" }
    all_data_name: ROOT
    all_data_path: /PATH/TO/ROOT

Third report:

all_data1:
    sub_data1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "ABC", version = "1.5.6","4.2.1" }
    all_data_name: ROOT
    all_data_path: /PATH/TO/ROOT

Fourth report:

all_data1:
    sub_data1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "XYZ", version = "1.5.6","4.2.1" }
    all_data_name: ROOT
    all_data_path: /PATH/TO/ROOT

Fifth report:

all_data1:
    sub_data1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "XYZ", version = "1.5.6","4.2.1" }
    all_data_name: ROOT_OTHER
    all_data_path: /PATH/TO/ROOT

Then the merge will be as follows:
Merge of first and second: (Explanation: they have same all_data_name and all_data_path but not sub_name and sub_path)

all_data1:
    sub_data1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "ABC", version = "4.2.1" }
    sub_data2:
        sub_name: sub2
        sub_path: path/to/sub2
        info: { group = "ABC", version = "1.5.6","4.2.1" }
    all_data_name: ROOT
    all_data_path: /PATH/TO/ROOT

Merge of first and third: (Explanation: the result is the same as the first report because we take the latest. In this case they have the same all_data, the same sub_data and the same info level)

all_data1:
    sub_data1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "ABC", version ="4.2.1" }
    all_data_name: ROOT
    all_data_path: /PATH/TO/ROOT

Merge of first and fourth: (Explanation: in this case they have the same all_data and the same sub_data, but not the same info level)

all_data1:
    sub_data1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "XYZ", version = "1.5.6","4.2.1" },{ group = "ABC", version = "4.2.1" }
    all_data_name: ROOT
    all_data_path: /PATH/TO/ROOT

Merge of first and fifth: (Explanation: they have different all_data_name)

all_data1:
    sub_data1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "ABC", version ="4.2.1" }
    all_data_name: ROOT
    all_data_path: /PATH/TO/ROOT
all_data2:
    sub_data1:
        sub_name: sub1
        sub_path: path/to/sub1
        info: { group = "XYZ", version = "1.5.6","4.2.1" }
    all_data_name: ROOT_OTHER
    all_data_path: /PATH/TO/ROOT

Because of the multiple levels of nesting, it feels inefficient to just iterate over each block, and I'm also not sure which query operators I should use for this.
I'm looking for a way of combining those reports into one main report (or at least to understand the logic). I hope my question is readable and not too hard to understand (I tried to show all possible cases).
Thank you.

EDIT: I understand now that I should not do these operations on the Mongo side, and that it is better to just fetch the needed reports and build the wanted report with Perl.
So I will fetch all the reports and put them into a hash. Then I would have to iterate through the first and second arrays. Those arrays can be very big, so this does not feel efficient.
I would love to hear suggestions on how to approach this problem, ideally in some efficient way.
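
One possible shape for the Perl side is sketched below. merge_reports is a hypothetical name, and the sketch rests on an assumption: for the same all_data, sub_data and group, the report with the larger timestamp wins. Every block is visited once and found again through a hash lookup, so the work grows linearly with the total number of info entries instead of comparing every report against every other one.

```perl
use strict;
use warnings;

# Merge decoded report documents (hashrefs shaped like the JSON above).
# Assumption: when two reports share all_data, sub_data and group,
# the values from the report with the larger timestamp are kept.
sub merge_reports {
    my @reports = sort { $a->{timestamp} <=> $b->{timestamp} } @_;
    my (%index, @order);    # nested lookup, plus first-seen order of all_data keys

    for my $report (@reports) {
        for my $all (@{ $report->{all_data} || [] }) {
            my $akey = join "\0", $all->{all_data_name}, $all->{all_data_path};
            my $slot = $index{$akey};
            if (!$slot) {
                $slot = $index{$akey} = {
                    all_data_name => $all->{all_data_name},
                    all_data_path => $all->{all_data_path},
                    subs          => {},
                    sub_order     => [],
                };
                push @order, $akey;
            }
            for my $sub (@{ $all->{sub_data} || [] }) {
                my $skey = join "\0", $sub->{sub_name}, $sub->{sub_path};
                my $s = $slot->{subs}{$skey};
                if (!$s) {
                    $s = $slot->{subs}{$skey} = {
                        sub_name    => $sub->{sub_name},
                        sub_path    => $sub->{sub_path},
                        groups      => {},
                        group_order => [],
                    };
                    push @{ $slot->{sub_order} }, $skey;
                }
                for my $info (@{ $sub->{info} || [] }) {
                    my $g = $info->{group};
                    push @{ $s->{group_order} }, $g unless exists $s->{groups}{$g};
                    $s->{groups}{$g} = $info->{values};    # later report overwrites
                }
            }
        }
    }

    # Turn the lookup hashes back into the nested array layout.
    my @all_data;
    for my $akey (@order) {
        my $slot = $index{$akey};
        my @sub_data;
        for my $skey (@{ $slot->{sub_order} }) {
            my $s = $slot->{subs}{$skey};
            push @sub_data, {
                sub_name => $s->{sub_name},
                sub_path => $s->{sub_path},
                info     => [ map { +{ group => $_, values => $s->{groups}{$_} } }
                              @{ $s->{group_order} } ],
            };
        }
        push @all_data, {
            sub_data      => \@sub_data,
            all_data_name => $slot->{all_data_name},
            all_data_path => $slot->{all_data_path},
        };
    }
    return \@all_data;
}
```

Calling merge_reports(@reports) on the decoded documents returns an array reference that can be used as the all_data field of the combined report. The order of the merged info entries follows first appearance, which is my choice and not something the examples above require.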
Thank you all!

In reply to (OT) Perl and creating a query for MongoDB by ovedpo15
