ovedpo15 has asked for the wisdom of the Perl Monks concerning the following question:

I posted this question on SO (but didn't get any comments) - (link:https://stackoverflow.com/questions/54179876/reorganize-hash-in-specific-format)
Consider the following hash of files:
    %files_data = (
        './GetOpt.pm'    => { 'pid' => { '56061' => 1, '56065' => 1 } },
        'file1'          => { 'pid' => { '56061' => 2 } },
        'file2'          => { 'pid' => { '56065' => 2 } },
        './src/bin/perl' => { 'pid' => { '56061' => 1, '56065' => 1 } }
    );
Also consider the following hash of process_data:
    %process_data = (
        '56061' => { 'parent' => 'NA',    'name' => 'file1' },
        '56069' => { 'parent' => '56065', 'name' => 'echo Hello_file1' },
        '56062' => { 'parent' => '56061', 'name' => 'echo Hello_file2' },
        '56065' => { 'parent' => '56061', 'name' => 'file2' }
    );
I would like to iterate through the `%files_data` hash and, for each file, get the chain of files that leads to it.
So I'll get the following hash:
    %hash = (
        'file1' => {
            './src/bin/perl' => 1,
            'file2' => {
                './src/bin/perl' => 1,
                './GetOpt.pm'    => 1
            },
            './GetOpt.pm' => 1,
        }
    );
I need to follow the pid chain up to the main parent ('NA') for each file.
What would be the most efficient way to solve it? I need some guidance on how to implement it.
I'll try to explain how my logic works and show some code. Let's take the `'./GetOpt.pm'` file as an example. It has a pid `56061`, so we go to `%process_data` and see `'file1'` (which is a file). We also see that the parent of `56061` is `NA`, so we stop and get:
    file1 => ./GetOpt.pm
But `./GetOpt.pm` has another pid, `56065`, so we go to `56065` and see `file2` (which is a file). Then we go to its parent `56061`, which has `file1` (which is a file), so we get:
    file1 => file2 => ./GetOpt.pm
Combine it:
    file1 => { ./GetOpt.pm, file2 => ./GetOpt.pm }
I would like to build a process file chain (only with files). `%files_data` contains the valid files, and `%process_data` contains the process hierarchy we need to follow.
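The walk described above can be sketched as follows. This is only one possible reading of the requirement, not the accepted answer: the helper names `file_chain` and `build_file_tree` are made up for illustration, a process name counts as a "file" only if it is a key of `%files_data`, and a file that is itself the process at the end of its chain (such as `file1` for pid `56061`) is not added again as a leaf.

```perl
use strict;
use warnings;
use Data::Dumper;

my %files_data = (
    './GetOpt.pm'    => { pid => { 56061 => 1, 56065 => 1 } },
    'file1'          => { pid => { 56061 => 2 } },
    'file2'          => { pid => { 56065 => 2 } },
    './src/bin/perl' => { pid => { 56061 => 1, 56065 => 1 } },
);
my %process_data = (
    56061 => { parent => 'NA',    name => 'file1' },
    56069 => { parent => '56065', name => 'echo Hello_file1' },
    56062 => { parent => '56061', name => 'echo Hello_file2' },
    56065 => { parent => '56061', name => 'file2' },
);

# Hypothetical helper: follow the parent links from $pid up to the root and
# return the process names that are also files, ordered from the root down.
sub file_chain {
    my ($pid, $procs, $files) = @_;
    my (@chain, %seen);
    # %seen stops the walk if the parent links ever form a loop
    while (defined $pid && exists $procs->{$pid} && !$seen{$pid}++) {
        my $name = $procs->{$pid}{name};
        unshift @chain, $name if exists $files->{$name};
        $pid = $procs->{$pid}{parent};
    }
    return @chain;
}

# Hypothetical helper: build the nested hash by descending along each chain.
sub build_file_tree {
    my ($procs, $files) = @_;
    my %tree;
    for my $file (sort keys %$files) {
        for my $pid (sort keys %{ $files->{$file}{pid} }) {
            my @chain = file_chain($pid, $procs, $files);
            next if @chain && $chain[-1] eq $file;    # the file is the process itself
            my $node = \%tree;
            for my $name (@chain) {
                # turn a plain leaf into a level if something hangs below it
                $node->{$name} = {} unless ref $node->{$name};
                $node = $node->{$name};
            }
            $node->{$file} //= 1;                     # leaf entry
        }
    }
    return \%tree;
}

$Data::Dumper::Sortkeys = 1;
print Dumper( build_file_tree(\%process_data, \%files_data) );
```

Each file is visited once per pid and each chain is only as long as the process depth, so the cost is roughly the number of (file, pid) pairs times the depth of the process tree.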
Some of what I tried to do:
    create_proc_tree(\%process_data,\%files_data);

    sub create_proc_tree {
        my ($proc_href,$files_href) = @_;
        my %hash;
        while (my ($file, $procs) = each %{$files_href}) {
            foreach my $pid (keys(%{$procs->{'pid'}})) {
                my $prev_file = $file;
                do {
                    my $parent_id = $proc_href->{$pid}{parent};
                    my $parent_name = $proc_href->{$pid}{name};
                    if ($prev_file eq $parent_name) {
                        $hash{$parent_name} = 1;
                    } else {
                        $hash{$parent_name} = { (%{ $hash{$prev_file} // {} }, $prev_file) };
                        delete($hash{$prev_file});
                    }
                    $prev_file = $parent_name;
                    my $parent_name = $proc_href->{$parent_id}{name};
                } while(defined($parent_name));
            }
        }
        print Dumper(\%hash); # Printing for debug
    }
It does not quite do what I want: the resulting hierarchy is not valid. I'm not sure what is wrong with the algorithm. It looks right to me, but the output is not as expected.
I think it's because I counted the data too many times without deleting.
It made me think that this algorithm is inefficient and messy. I have found out that we can use 'eval' to convert a string to a hash:
    use Data::Dumper;
    $abc = "Mouse=>Jerry, Cat=>Tom, Dog=>Spike";
    my %hash = eval( "( $abc )" );
    print Dumper(\%hash);
Maybe a better approach would be to build a string like that and then convert it to a hash (although I read a comment saying that using eval this way is bad practice).
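For the flat `key=>value` string format shown above, the string can be split directly instead of being passed to `eval`, which would execute whatever code the string happens to contain. The sketch below is a minimal illustration only: `parse_pairs` is a made-up helper name, and it assumes that keys and values contain neither commas nor `=>`.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "key=>value, key=>value" into a flat hash
# without evaluating the string as Perl code.
sub parse_pairs {
    my ($string) = @_;
    my %hash;
    for my $pair (split /\s*,\s*/, $string) {
        my ($key, $value) = split /\s*=>\s*/, $pair, 2;
        next unless defined $key && defined $value;    # skip malformed pieces
        $hash{$key} = $value;
    }
    return %hash;
}

my %hash = parse_pairs("Mouse=>Jerry, Cat=>Tom, Dog=>Spike");
print "$_ => $hash{$_}\n" for sort keys %hash;
```

This only covers a flat hash. For the nested structure asked about here, building the hash through references avoids the string round trip altogether.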
I also thought of first finding the main parent file and then going down until I reach the current file. How should I solve this issue? What would be the best approach?

Replies are listed 'Best First'.
Re: Reorganizing hash
by choroba (Cardinal) on Jan 14, 2019 at 14:30 UTC
    See my reply on StackOverflow. Also, please fix the link to SO to [https://stackoverflow.com/questions/54179876/reorganize-hash-in-specific-format].
Re: Reorganizing hash
by LanX (Saint) on Jan 14, 2019 at 14:18 UTC
    > So I'll get the following hash:

    What happened to these entries?

        '56069' => { 'parent' => '56065', 'name' => 'echo Hello_file1' },
        '56062' => { 'parent' => '56061', 'name' => 'echo Hello_file2' },

    Why are they skipped?

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery FootballPerl is like chess, only without the dice

    edit

    Same question in disguise?

    Converting hash structure into a special tree

Re: Reorganizing hash
by LanX (Saint) on Jan 15, 2019 at 00:28 UTC
    The following code creates a clean merger of the two hashes as a tree (or rather a directed graph) structure, i.e. it does not assume that the input is free of bugs.

    The resulting merger can easily be post processed to create the desired output.

    FWIW:

    Instead of a recursive algorithm, I used the power of cross-references and auto-vivification.

    This makes it possible to handle loops in the input without risking a non-halting recursive call.

        use strict;
        use warnings;
        use Data::Dump qw/pp dd/;

        my %pid_tree;

        my %process_data = (
            '56061' => { 'parent' => 'NA',    'name' => 'file1' },
            '56069' => { 'parent' => '56065', 'name' => 'echo Hello_file1' },
            '56062' => { 'parent' => '56061', 'name' => 'echo Hello_file2' },
            '56065' => { 'parent' => '56061', 'name' => 'file2' }
        );

        while ( my ($pid,$h_pid) = each %process_data ) {
            my ($parent,$name) = @{$h_pid}{qw/parent name/};
            $pid_tree{$pid}{name} = $name;
            $pid_tree{$pid}{pid}  = $pid;
            push @{ $pid_tree{$parent}{children} }, $pid_tree{$pid};
        }

        $pid_tree{NA}{name} = undef;
        $pid_tree{NA}{pid}  = "NA";

        my %files_data = (
            './GetOpt.pm'    => { 'pid' => { '56061' => 1, '56065' => 1 } },
            'file1'          => { 'pid' => { '56061' => 2 } },
            'file2'          => { 'pid' => { '56065' => 2 } },
            './src/bin/perl' => { 'pid' => { '56061' => 1, '56065' => 1 } }
        );

        while ( my ($file,$h_file) = each %files_data ) {
            while ( my ($attr, $h_attr) = each %$h_file ) {
                while ( my ($pid, $pid_count) = each %$h_attr ) {
                    #print "$file $attr $pid $pid_count\n";
                    next if $file eq $pid_tree{$pid}{name};
                    push @{ $pid_tree{$pid}{files} }, $file;
                }
            }
        }

        pp $pid_tree{NA};

    OUTPUT

        {
          children => [
            {
              children => [
                { name => "echo Hello_file2", pid => 56062 },
                {
                  children => [{ name => "echo Hello_file1", pid => 56069 }],
                  files => ["./src/bin/perl", "./GetOpt.pm"],
                  name => "file2",
                  pid => 56065,
                },
              ],
              files => ["./src/bin/perl", "./GetOpt.pm"],
              name => "file1",
              pid => 56061,
            },
          ],
          name => undef,
          pid => "NA",
        }

    NB: I had to repair corrupted input from the OP. Again.

    Cheers Rolf
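The post-processing step mentioned in the last reply is not shown in the thread. A minimal sketch of one way to do it follows, under several assumptions: the node layout (`name`, `pid`, `children`, `files`) is copied from the OUTPUT above, `files_below` is a made-up helper name, the set of file names is taken from the keys of `%files_data`, and the structure is assumed to be free of loops, since the sketch recurses.

```perl
use strict;
use warnings;

# A structure shaped like the OUTPUT above (hard-coded here for illustration).
my $root = {
    name     => undef,
    pid      => 'NA',
    children => [
        {
            name     => 'file1',
            pid      => 56061,
            files    => [ './src/bin/perl', './GetOpt.pm' ],
            children => [
                { name => 'echo Hello_file2', pid => 56062 },
                {
                    name     => 'file2',
                    pid      => 56065,
                    files    => [ './src/bin/perl', './GetOpt.pm' ],
                    children => [ { name => 'echo Hello_file1', pid => 56069 } ],
                },
            ],
        },
    ],
};

# Assumption: a process name counts as a file if it is a key of %files_data.
my %is_file = map { $_ => 1 } ( './GetOpt.pm', 'file1', 'file2', './src/bin/perl' );

# Hypothetical helper: fold the children of $node into a nested hash.
# Children that are files become a level of their own; anything found under
# a non-file child is hoisted into the current level (shallow merge).
sub files_below {
    my ( $node, $is_file ) = @_;
    my %out;
    for my $child ( @{ $node->{children} || [] } ) {
        my $sub = files_below( $child, $is_file );
        $sub->{$_} //= 1 for @{ $child->{files} || [] };    # files become leaves
        my $name = $child->{name};
        if ( defined $name && $is_file->{$name} ) {
            $out{$name} = $sub;
        }
        else {
            %out = ( %out, %$sub );
        }
    }
    return \%out;
}

my $result = files_below( $root, \%is_file );
print join( ', ', sort keys %{ $result->{file1} } ), "\n";
```

If the input can contain loops, the recursion would need a guard such as a hash of already visited pids.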