in reply to Trying to make the code more clear and clean

Hi, try changing your output structure to a much more manageable hash of hashes (of hashes of arrays). If you want to be able to easily pull out data later by group or by test, use an array only for the innermost values. You don't need all those layers of key names. If the object holds test results, you don't need a key named "test"; just index the tests directly. If a test hash can only have values by group, key the results by group name. It's much more natural and way easier to build. Something like this:

    {
        root => {
            1 => {
                XYZ => [ "1234" ],
                ABC => [ "6.13.00" ],
            },
            2 => {
                BAB  => [ "ASDAS", "12312321" ],
                SADA => [ "6.13.00", "1231231" ],
            },
        },
    }

Then you can change

    push @{ $href->{test} }, {
        group  => $group,
        values => [ sort uniq $value, @{ $href->{values} // [] } ],
    };

to

    my $test_num = 0;
    for my $file ( @test_files ) {
        $test_num++;
        ...
        for my $line (<FILE>) {
            push @{ $href->{ $test_num }{ $group } }, $value;
        }
    }

Later when you reuse the data to build a report or whatever, you can give it column headers without having to store them in the data object itself.
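
As a minimal sketch of that last point, assuming data in the shape suggested above: the sub name report_lines, the column titles, and the tab-separated layout are made up for illustration and are not part of the original program.

```perl
use strict;
use warnings;

# Hypothetical data in the suggested shape: test number => group => [ values ]
my $results = {
    1 => { XYZ => ['1234'], ABC => ['6.13.00'] },
    2 => { BAB => [ 'ASDAS', '12312321' ], SADA => [ '6.13.00', '1231231' ] },
};

# Build report rows. The column headers live here, not in the data object.
sub report_lines {
    my ($data) = @_;
    my @lines = ( join "\t", 'Test', 'Group', 'Values' );
    for my $test_num ( sort { $a <=> $b } keys %{$data} ) {
        for my $group ( sort keys %{ $data->{$test_num} } ) {
            push @lines, join "\t", $test_num, $group,
                join( ', ', @{ $data->{$test_num}{$group} } );
        }
    }
    return @lines;
}

print "$_\n" for report_lines($results);
```

The headers are supplied only at reporting time, so the stored structure stays small and the report wording can change without touching the data.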

Hope this helps!


The way forward always starts with a minimal test.

Re^2: Trying to make the code more clear and clean
by ovedpo15 (Pilgrim) on Jul 29, 2019 at 23:29 UTC
    Thanks for the detailed answer! We can't use groups/values as hash keys because we send this JSON to MongoDB, and MongoDB can't handle dots or dollar signs in keys, so it would fail (we want to allow those characters). Also, we want users to use these reports, so they need to be readable, and indexing the keys feels less readable (at least to me).

      I see. OK, that answers one question, but it is not clear from what you have posted so far what constitutes a "test". Is a "test" the same as a "file"? You still haven't posted any sample data with the expected output from that data. That would be helpful in trying to understand your situation.

      Also, are you aware of how fragile it is to try to handle comma-separated values manually by just splitting? Use Text::CSV! (In fact you could encapsulate this whole program into a Text::CSV after_parse callback, but that's a different story...)
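
      To illustrate the fragility, here is a small sketch assuming a hypothetical input line with a quoted field that contains a comma (no such line appears in the posted data). It uses only Text::CSV's parse, fields and error_diag methods:

```perl
use strict;
use warnings;
use Text::CSV;

# binary => 1 allows arbitrary bytes in fields; auto_diag => 1 reports parse errors.
my $csv = Text::CSV->new( { binary => 1, auto_diag => 1 } );

# Hypothetical line: the third field contains a comma inside quotes.
my $line = 'bla,XYZ,"1,234",bla,bla,bla';

# A naive split cuts the quoted field in two.
my @naive = split /,/, $line;

# Text::CSV honours the quoting and returns the field whole.
$csv->parse($line) or die "Cannot parse line: " . $csv->error_diag;
my @fields = $csv->fields;

printf "naive split: %d fields\n", scalar @naive;
printf "Text::CSV:   %d fields, value = %s\n", scalar @fields, $fields[2];
```

      With the naive split the column positions shift by one after the quoted field, so $group and $value would silently pick up the wrong data on such a line.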

      Assuming that one file is one test, testing one href, and that you do *not* want multiple hashes each keyed with 'test', but just one, with one listing for each group, maybe you could use something like:

          use strict;
          use warnings;
          use feature 'say';
          use JSON;

          my $results = {};

          for my $href ('root') {
              # Collect the values for each group
              my $by_group = {};

              while (my $line = <DATA>) {
                  chomp $line;
                  my ( $key, $group, $value, $version, $file, $count ) = split(/,/, $line);
                  push @{ $by_group->{ $group } }, $value;
              }

              # Turn the per-group hash into a list of { group, values } records
              my $test = [];

              for my $group ( keys %{ $by_group } ) {
                  push @{ $test }, { group => $group, values => $by_group->{ $group } };
              }

              $results->{ $href } = $test;
          }

          say JSON->new->pretty->canonical->encode($results);

          __DATA__
          bla,ABC,6.13.00,bla,bla,bla
          bla,XYZ,1234,bla,bla,bla
          bla,XYZ,tcsh,bla,bla,bla
          bla,WEA,6.13.00,bla,bla,bla
          bla,BAB,ASDAS,bla,bla,bla
          bla,BAB,12312321,bla,bla,bla
          bla,SADA,6.13.00,bla,bla,bla
          bla,SADA,12312321,bla,bla,bla

      (^^ that's an SSCCE ...)

          $ perl oved.pl
          {
             "root" : [
                {
                   "group" : "BAB",
                   "values" : [
                      "ASDAS",
                      "12312321"
                   ]
                },
                {
                   "group" : "ABC",
                   "values" : [
                      "6.13.00"
                   ]
                },
                {
                   "group" : "XYZ",
                   "values" : [
                      "1234",
                      "tcsh"
                   ]
                },
                {
                   "group" : "WEA",
                   "values" : [
                      "6.13.00"
                   ]
                },
                {
                   "group" : "SADA",
                   "values" : [
                      "6.13.00",
                      "12312321"
                   ]
                }
             ]
          }

      Hope this helps!

