ovedpo15 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks!

I have a data structure like so:

{ "/": { "type": "dir", "files": [
  { "p": { "type": "dir-link", "source": "/nfs/data/project", "files": [
    { "xa": { "type": "dir", "files": [
      { "tools": { "type": "dir", "files": [
        { "sd": { "type": "dir", "files": [
          { "su2av": { "type": "dir", "files": [
            { "duf": { "type": "dir", "files": [
              { "0.4.0": { "type": "dir-link", "source": "/nfs/vsa/project/xa/vuvua/su2av", "files": [
                { "bin": { "type": "dir", "files": [
                  { "duf": { "type": "link-file", "source": ".run" } },
                  { ".run": { "type": "file" } }
                ] } }
              ] } }
            ] } }
          ] } }
        ] } }
      ] } }
    ] } }
  ] } },
  { "nfs": { "type": "dir", "files": [
    { "data": { "type": "dir", "files": [
      { "project": { "type": "dir", "files": [] } }
    ] } },
    { "vsa": { "type": "dir", "files": [
      { "project": { "type": "dir", "files": [
        { "xa": { "type": "dir", "files": [
          { "vuvua": { "type": "dir", "files": [
            { "su2av": { "type": "dir", "files": [] } }
          ] } }
        ] } }
      ] } }
    ] } }
  ] } }
] } }

It contains dirs, dir-links, files, and file-links. I want to create a subroutine that returns an array of directories, an array of files, and an array of links. For example, for directories I should get:

/p/xa/tools/sd/su2av/duf/0.4.0/bin
/nfs/data/project
/nfs/vsa/project/xa/vuvua/su2av

Links will be a bit different because each needs a source and a target, so I will do that part myself. How can I get all the paths (dirs/files)? I thought of using recursion, as I don't know how deep the nested hash is. How can I do it without grep? My code to test it:

use JSON::PP;    # assumed; any module that exports decode_json will do

my $j = '{ "/": { "type": "dir", "files": [
  { "p": { "type": "dir-link", "source": "/nfs/data/project", "files": [
    { "xa": { "type": "dir", "files": [
      { "tools": { "type": "dir", "files": [
        { "sd": { "type": "dir", "files": [
          { "su2av": { "type": "dir", "files": [
            { "duf": { "type": "dir", "files": [
              { "0.4.0": { "type": "dir-link", "source": "/nfs/vsa/project/xa/vuvua/su2av", "files": [
                { "bin": { "type": "dir", "files": [
                  { "duf": { "type": "link-file", "source": ".run" } },
                  { ".run": { "type": "file" } }
                ] } }
              ] } }
            ] } }
          ] } }
        ] } }
      ] } }
    ] } }
  ] } },
  { "nfs": { "type": "dir", "files": [
    { "data": { "type": "dir", "files": [
      { "project": { "type": "dir", "files": [] } }
    ] } },
    { "vsa": { "type": "dir", "files": [
      { "project": { "type": "dir", "files": [
        { "xa": { "type": "dir", "files": [
          { "vuvua": { "type": "dir", "files": [
            { "su2av": { "type": "dir", "files": [] } }
          ] } }
        ] } }
      ] } }
    ] } }
  ] } }
] } }';
my $obj = decode_json($j);

Replies are listed 'Best First'.
Re: How to iterate over nested hash and get all paths?
by tybalt89 (Monsignor) on Mar 27, 2021 at 18:34 UTC

    Here's my guess as to what you want, at least it gets the directories you showed.

    #!/usr/bin/perl

    use strict; # https://perlmonks.org/?node_id=11130470
    use warnings;
    use JSON::PP;

    my $j = '{ "/": { "type": "dir", "files": [
      { "p": { "type": "dir-link", "source": "/nfs/data/project", "files": [
        { "xa": { "type": "dir", "files": [
          { "tools": { "type": "dir", "files": [
            { "sd": { "type": "dir", "files": [
              { "su2av": { "type": "dir", "files": [
                { "duf": { "type": "dir", "files": [
                  { "0.4.0": { "type": "dir-link", "source": "/nfs/vsa/project/xa/vuvua/su2av", "files": [
                    { "bin": { "type": "dir", "files": [
                      { "duf": { "type": "link-file", "source": ".run" } },
                      { ".run": { "type": "file" } }
                    ] } }
                  ] } }
                ] } }
              ] } }
            ] } }
          ] } }
        ] } }
      ] } },
      { "nfs": { "type": "dir", "files": [
        { "data": { "type": "dir", "files": [
          { "project": { "type": "dir", "files": [] } }
        ] } },
        { "vsa": { "type": "dir", "files": [
          { "project": { "type": "dir", "files": [
            { "xa": { "type": "dir", "files": [
              { "vuvua": { "type": "dir", "files": [
                { "su2av": { "type": "dir", "files": [] } }
              ] } }
            ] } }
          ] } }
        ] } }
      ] } }
    ] } }';
    my $obj = decode_json($j);

    my $alltypesref = findtypes( $obj );
    use Data::Dump 'dd'; dd $alltypesref;

    sub removeprefixes {
        local $_ = join "\n", grep $_ ne '/', (sort @_), '';
        s/^(\N*)\n(?=.*^\1\/)//gm;
        split /\n/;
    }

    sub findtypes {
        my ($obj) = @_;
        my $types = {};
        findpath($obj, '', $types);
        $types->{dir} = [ removeprefixes @{ $types->{dir} } ];
        return $types;
    }

    sub findpath {
        my ($obj, $path, $types) = @_;
        for my $name ( keys %$obj ) {
            my $info = $obj->{$name};
            push @{ $types->{$info->{type}} },
                my $new = "$path/$name" =~ tr[/][]sr;
            findpath( $_, $new, $types ) for @{ $info->{files} };
        }
    }

    Outputs:

    {
      "dir"       => [
                       "/nfs/data/project",
                       "/nfs/vsa/project/xa/vuvua/su2av",
                       "/p/xa/tools/sd/su2av/duf/0.4.0/bin",
                     ],
      "dir-link"  => ["/p", "/p/xa/tools/sd/su2av/duf/0.4.0"],
      "file"      => ["/p/xa/tools/sd/su2av/duf/0.4.0/bin/.run"],
      "link-file" => ["/p/xa/tools/sd/su2av/duf/0.4.0/bin/duf"],
    }

Re: How to iterate over nested hash and get all paths?
by Arunbear (Prior) on Mar 27, 2021 at 18:38 UTC
    Recursion does seem the natural way to tackle this, given the hierarchical structure of the data. Find out about recursion with accumulators (if it is new to you).

    The following may help (as a last resort)
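
    To illustrate recursion with an accumulator on this kind of data, here is a minimal sketch. It is written for this thread's data layout and is not necessarily the code this reply pointed to. The sub name collect and the trimmed-down sample tree are made up for the example; $acc is the accumulator, a hash of type => [paths] that every recursive call appends to.

```perl
use strict;
use warnings;

# Hypothetical, trimmed-down tree in the same shape as the question's data.
my $tree = {
    '/' => { type => 'dir', files => [
        { p => { type => 'dir-link', source => '/nfs/data', files => [
            { bin => { type => 'dir', files => [
                { '.run' => { type => 'file' } },
                { duf    => { type => 'link-file', source => '.run' } },
            ] } },
        ] } },
        { nfs => { type => 'dir', files => [
            { data => { type => 'dir', files => [] } },
        ] } },
    ] },
};

# collect() is a made-up name. It records the path of every entry under
# its type and passes the same accumulator down to the children.
sub collect {
    my ($node, $path, $acc) = @_;
    for my $name (sort keys %$node) {
        my $info = $node->{$name};
        # The root entry is itself named "/", so avoid producing "//".
        my $full = $name eq '/' ? '/'
                 : $path eq '/' ? "/$name"
                 :                "$path/$name";
        push @{ $acc->{ $info->{type} } }, $full;
        collect($_, $full, $acc) for @{ $info->{files} || [] };
    }
    return $acc;
}

my $paths = collect($tree, '', {});
for my $type (sort keys %$paths) {
    printf "%-9s %s\n", $type, join ' ', @{ $paths->{$type} };
}
```

    Because the accumulator is passed in rather than kept in a global, the same sub can be reused on several trees without resetting any state.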

Re: How to iterate over nested hash and get all paths?
by AnomalousMonk (Archbishop) on Mar 27, 2021 at 18:09 UTC

    Also, please see the Perl Data Structures Cookbook (perldsc), although I have the strong feeling this resource has been recommended to this monk before.

    > How can I do it without grep?

    Why do you want to exclude grep from your toolbox? Is this a homework assignment?


    Give a man a fish:  <%-{-{-{-<

Re: How to iterate over nested hash and get all paths?
by LanX (Saint) on Mar 27, 2021 at 16:53 UTC
    In an attempt at guided programming:

    I'd recommend writing a subroutine for each "type".

    A sub dispatch will take a hash ref and call the corresponding sub for "type".

    The sub type_dir will need to loop over the nested array files and call the dispatcher on them.
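
    A minimal sketch of how that dispatcher could look, for the two types dir and file only. The names dispatch and type_dir come from the description above; %handler, type_file, the result arrays, and the sample tree are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical sample in the same shape as the question's data.
my $tree = { '/' => { type => 'dir', files => [
    { etc => { type => 'dir', files => [
        { hosts => { type => 'file' } },
    ] } },
    { tmp => { type => 'dir', files => [] } },
] } };

my (@dirs, @files);    # results, filled in by the handlers

# One handler per "type"; only dir and file are sketched here.
my %handler = (
    dir  => \&type_dir,
    file => \&type_file,
);

# Takes a hash ref { name => info } and calls the handler for its type.
sub dispatch {
    my ($entry, $parent) = @_;
    for my $name (sort keys %$entry) {
        my $info = $entry->{$name};
        my $code = $handler{ $info->{type} }
            or die "no handler for type '$info->{type}'\n";
        (my $path = "$parent/$name") =~ s{/+}{/}g;    # "//etc" becomes "/etc"
        $code->($info, $path);
    }
}

# Directories are the nesting type: record the path, then dispatch children.
sub type_dir {
    my ($info, $path) = @_;
    push @dirs, $path;
    dispatch($_, $path) for @{ $info->{files} || [] };
}

sub type_file {
    my ($info, $path) = @_;
    push @files, $path;
}

dispatch($tree, '');
print "dirs:  @dirs\n";
print "files: @files\n";
```

    Supporting dir-link or link-file would then mean adding one entry to %handler and one sub, without touching dispatch.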

    Show some effort and post your implementation for the two types dir and file in a smaller SSCCE.

    update

    With less abstraction:

    You could make only sub type_dir a recursive function that calls itself, since dir is the only nesting type in this case.

    The handling of type could then be done in an if-elsif-else block.

    But IMHO that's messier and less maintainable.
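
    For comparison, a sketch of that less abstract variant: one recursive sub with an if-elsif-else on the type. The sub name walk, the category names in $out, and the sample tree are made up. The sketch follows the premise that only dir nests; in the question's data the dir-link entries also carry files, so a complete version would have to recurse into those as well.

```perl
use strict;
use warnings;

# Hypothetical sample in the same shape as the question's data.
my $tree = { '/' => { type => 'dir', files => [
    { bin => { type => 'dir', files => [
        { '.run' => { type => 'file' } },
        { duf    => { type => 'link-file', source => '.run' } },
    ] } },
] } };

# walk() is a made-up name; $out collects paths per category.
sub walk {
    my ($entry, $parent, $out) = @_;
    for my $name (sort keys %$entry) {
        my $info = $entry->{$name};
        my $type = $info->{type};
        (my $path = "$parent/$name") =~ s{/+}{/}g;    # squeeze "//" to "/"

        if ($type eq 'dir') {
            push @{ $out->{dirs} }, $path;
            walk($_, $path, $out) for @{ $info->{files} || [] };
        }
        elsif ($type eq 'file') {
            push @{ $out->{files} }, $path;
        }
        else {
            # Link types (and anything unknown) end up here in this sketch.
            push @{ $out->{other} }, $path;
        }
    }
    return $out;
}

my $out = walk($tree, '', {});
for my $category (sort keys %$out) {
    printf "%-5s %s\n", $category, join ' ', @{ $out->{$category} };
}
```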

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

Re: How to iterate over nested hash and get all paths?
by perlfan (Parson) on Mar 27, 2021 at 15:41 UTC
    How are you creating that hash? Are you using File::Find? Otherwise, getting all paths in this tree is just a depth-first search/traversal. There are Perl examples for this all over the place. Recursion is commonly used for this type of thing.
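
    Recursion is not the only way to do the traversal. As a hedged illustration, the same depth-first walk can be written with an explicit stack instead. The sample tree and the variable names below are assumptions for the example, not taken from the question.

```perl
use strict;
use warnings;

# Hypothetical sample in the same shape as the question's data.
my $tree = { '/' => { type => 'dir', files => [
    { etc => { type => 'dir', files => [
        { hosts => { type => 'file' } },
    ] } },
    { tmp => { type => 'dir', files => [] } },
] } };

my @stack = ( [ $tree, '' ] );    # pairs of [ entry, parent path ]
my @seen;                         # "type path" strings in visiting order

while (my $item = pop @stack) {
    my ($entry, $parent) = @$item;
    for my $name (sort keys %$entry) {
        my $info = $entry->{$name};
        (my $path = "$parent/$name") =~ s{/+}{/}g;    # squeeze "//" to "/"
        push @seen, "$info->{type} $path";
        # Push children in reverse so they are popped in their original order.
        push @stack, map { [ $_, $path ] } reverse @{ $info->{files} || [] };
    }
}

print "$_\n" for @seen;
```

    The explicit stack avoids deep call chains on very deep trees, at the cost of being a little less direct to read than the recursive form.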
Re: How to iterate over nested hash and get all paths? (JSON::Path Data::Diver)
by Anonymous Monk on Mar 27, 2021 at 21:36 UTC
Re: How to iterate over nested hash and get all paths?
by LanX (Saint) on Mar 28, 2021 at 00:49 UTC
    > For example for directories I get:

    /p/xa/tools/sd/su2av/duf/0.4.0/bin
    /nfs/data/project
    /nfs/vsa/project/xa/vuvua/su2av

    What are the criteria for this?

    Only directories without any contained subdirectories?

    What if a sub dir is of another type?
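
    One possible reading, sketched under stated assumptions: a directory is listed only if no entry of type dir exists anywhere below it, even when the nesting passes through a dir-link. That reading is a guess that happens to fit the three example paths; the sub name leaf_dirs and the sample tree are made up.

```perl
use strict;
use warnings;

# Hypothetical, trimmed-down tree in the same shape as the question's data.
my $tree = { '/' => { type => 'dir', files => [
    { p => { type => 'dir-link', source => '/nfs/data', files => [
        { duf => { type => 'dir', files => [
            { '0.4.0' => { type => 'dir-link', source => '/nfs/data', files => [
                { bin => { type => 'dir', files => [
                    { '.run' => { type => 'file' } },
                ] } },
            ] } },
        ] } },
    ] } },
    { nfs => { type => 'dir', files => [
        { data => { type => 'dir', files => [] } },
    ] } },
] } };

# Returns how many "dir" entries the given subtree contains and pushes
# every dir that has no dir anywhere below it onto @$leaves.
sub leaf_dirs {
    my ($entry, $parent, $leaves) = @_;
    my $dirs_below = 0;
    for my $name (sort keys %$entry) {
        my $info = $entry->{$name};
        (my $path = "$parent/$name") =~ s{/+}{/}g;    # squeeze "//" to "/"
        my $nested = 0;
        $nested += leaf_dirs($_, $path, $leaves) for @{ $info->{files} || [] };
        if ($info->{type} eq 'dir') {
            push @$leaves, $path if !$nested;
            $nested++;    # count this dir for the benefit of its ancestors
        }
        $dirs_below += $nested;
    }
    return $dirs_below;
}

my @leaves;
leaf_dirs($tree, '', \@leaves);
print "$_\n" for @leaves;
```

    Here /p/duf is not listed, because bin sits below it via the dir-link 0.4.0. Under a stricter "no direct dir child" rule it would be listed, which is exactly the ambiguity raised above.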

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery