ovedpo15 has asked for the wisdom of the Perl Monks concerning the following question:

Hello
Consider the following content of a file:

15,10,name3
10,#,name1
12,10,name2
5,12,name4
8,5,name4

Each line is in the following format:

id,parent-id,name
Notice that the root (the one without a parent) has # in the parent-id field.
I'm trying to create a new list in which each line is the path from an id up through its parents to the root.
The expected output of the example I gave:

name1
name2,name1
name3,name1
name4,name2,name1
name5,name4,name2,name1

I was trying to solve this issue with a loop and an array.
The problem is, I don't know how many fields the array should contain.
How can I solve this issue in the cleanest and simplest way possible, without using any external modules?
I feel like a loop over the file should solve it, but can't figure the right order.

Replies are listed 'Best First'.
Re: Reorganizing the content of a file
by Discipulus (Canon) on Dec 18, 2018 at 10:33 UTC
    Hello ovedpo15,

    I suspect an array will not be enough.

    Moreover, depending on the order in which your elements are declared, you may need two passes over the same data. Hashes are useful. Hashes of Hashes (HoH, see perldsc) are powerful.

    In plain English it would be something like:

    Build a hash that maps each id to its name and parent (HoH: $assoc{ $current_id } = { name => $current_name, parent => $current_parent } ). Add an exception: if the parent is the pound sign, store an empty string or undef.

    Now you have to recurse through the above structure to build the list of ancestors: for each id, print the name and, if it has a parent, get the name of its parent, and if the latter has a parent, get the name of that parent too, and if the latter has a parent...

    Recursion! A sub that calls itself. See https://perlmaven.com/recursive-subroutines

    UPDATE 21:30 GMT+1: as many monks have provided their solutions, here is mine, translating into Perl what I stated above in English:

        use strict;
        use warnings;

        my %assoc;

        while(<DATA>){
            chomp;
            my @fields = split ',',$_;
            $assoc{ $fields[0] } = {
                name   => $fields[2],
                parent => $fields[1] eq '#' ? '' : $fields[1],
            }
        }

        sub get_parents{
            my $href = shift;
            # exit condition first in recursive subs
            return '' unless defined $assoc{ $href->{parent} }{name};
            print "$assoc{ $href->{parent} }{name} ";
            # recursive call
            get_parents( $assoc{ $href->{parent} } );
        }

        foreach my $id (keys %assoc){
            print "$assoc{$id}->{name} ";
            get_parents($assoc{$id});
            print "\n";
        }

        __DATA__
        15,10,name3
        10,#,name1
        12,10,name2
        5,12,name4
        8,5,name5

        #output
        name4 name2 name1
        name1
        name5 name4 name2 name1
        name2 name1
        name3 name1

    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: Reorganizing the content of a file
by kcott (Archbishop) on Dec 18, 2018 at 14:24 UTC

    G'day ovedpo15,

    I used these steps:

    1. Read the file and collect all data in a hash.
    2. Generate arrays of ancestry elements for each ID.
    3. Sort array elements; join; print.

    Here's the code:

        #!/usr/bin/env perl

        use strict;
        use warnings;

        my $root = '#';
        my %tree;

        while (<DATA>) {
            chomp;
            my ($id, $pid, $name) = split /,/;
            $tree{$id} = {name => $name, pid => $pid};
        }

        my @paths;
        push @paths, concat($_, \%tree) for keys %tree;

        print join(',', @$_), "\n" for
            sort { @$a <=> @$b || $a->[0] cmp $b->[0] } @paths;

        sub concat {
            my ($id, $tree, $path) = @_;

            $path = [] unless defined $path;
            push @$path, $tree->{$id}{name};

            if ($tree->{$id}{pid} ne $root) {
                concat($tree->{$id}{pid}, $tree, $path);
            }

            return $path;
        }

        __DATA__
        15,10,name3
        10,#,name1
        12,10,name2
        5,12,name4
        8,5,name5

    Output:

        name1
        name2,name1
        name3,name1
        name4,name2,name1
        name5,name4,name2,name1

    Note that I changed the last line of your data: s/8,5,name4/8,5,name5/.

    Also, it's completely unclear from your expected output, what ordering you wanted in the general case. I've written a sort to give the expected output for a single set of input data. Ask yourself if, given different input, "name2,name1" was "name2,name6,name1", what would the order be? Adjust my "sort { ... }" as required.

    — Ken
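
    As one possible answer to the ordering question above, here is a sketch that compares paths root-first, so that each node sorts directly after its ancestors. The helper name by_root_first and the sample @paths are made up for illustration and are not part of kcott's code; whether this is the ordering the OP wants is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two paths (leaf-first array refs, as built
# by concat above) element by element, starting from the root end.
sub by_root_first {
    my ($x, $y) = @_;
    my @x = reverse @$x;
    my @y = reverse @$y;
    while (@x && @y) {
        my $cmp = shift(@x) cmp shift(@y);
        return $cmp if $cmp;
    }
    return @x <=> @y;    # the shorter path (the ancestor) comes first
}

# Illustrative data only, including the "name2,name6,name1" case.
my @paths = (
    [qw(name3 name1)],
    [qw(name2 name6 name1)],
    [qw(name1)],
    [qw(name6 name1)],
);

print join(',', @$_), "\n" for sort { by_root_first($a, $b) } @paths;
```

    With this comparator the sample prints name1, then name3,name1, then name6,name1, then name2,name6,name1.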

Re: Reorganizing the content of a file
by haj (Vicar) on Dec 18, 2018 at 14:03 UTC

    For the unimaginative ones: The last input line ought to end in name5.

    A rather simple solution would use more than one hash (or array). Actually, both structures are fine. If your biggest id is much larger than the number of names, then hashes might be more efficient. The following isn't optimized because it unravels the ancestry several times. If performance is an issue, then this could be avoided (with a bit of recursion).

        use strict;
        use warnings;

        my %parent_id = ();
        my %id_to_name = ();

        # Building the tree
        while (my $line = <DATA>) {
            chomp $line;
            my ($id,$parent_id,$name) = split /,/, $line;
            $parent_id{$id} = $parent_id;
            $id_to_name{$id} = $name;
        }

        # Dumping the tree
        my %name_to_id = reverse %id_to_name;
        for my $name (sort keys %name_to_id) {
            my $id = $name_to_id{$name};
            my $parent_id = $parent_id{$id};
            my @ancestors = ($name);
            while (my $name = $id_to_name{$parent_id}) {
                push @ancestors,$name;
                $parent_id = $parent_id{$parent_id};
            }
            print join(',',@ancestors),"\n";
        }

        __DATA__
        15,10,name3
        10,#,name1
        12,10,name2
        5,12,name4
        8,5,name5
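
    The repeated unravelling mentioned above could be avoided by caching each path the first time it is built. This is a minimal sketch, assuming the data contains no cycles; path_for and %path_cache are hypothetical names, and the sample data is inlined so the block stands alone.

```perl
use strict;
use warnings;

# Same sample data as above, inlined so the sketch is self-contained.
my %parent_id  = (15 => 10, 10 => '#', 12 => 10, 5 => 12, 8 => 5);
my %id_to_name = (15 => 'name3', 10 => 'name1', 12 => 'name2',
                  5  => 'name4', 8  => 'name5');

my %path_cache;    # id => array ref of names, from the id up to the root

# Hypothetical helper: return the cached path if there is one, otherwise
# build it from this node's name plus the (cached) path of its parent.
# Assumes the parent links never form a cycle.
sub path_for {
    my ($id) = @_;
    return [] unless exists $id_to_name{$id};    # '#' or an unknown id
    $path_cache{$id} //= [ $id_to_name{$id}, @{ path_for($parent_id{$id}) } ];
    return $path_cache{$id};
}

print join(',', @{ path_for($_) }), "\n"
    for sort { $id_to_name{$a} cmp $id_to_name{$b} } keys %id_to_name;
```

    Each ancestor chain is walked only once; later lookups for the same id reuse the stored array.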
Re: Reorganizing the content of a file
by Laurent_R (Canon) on Dec 18, 2018 at 17:35 UTC
    Another proposal:
        use strict;
        use warnings;
        use feature 'say';

        my %tree;
        while (<DATA>) {
            chomp;
            my ($id, $pid, $name) = split /,/;
            $tree{$id} = {pid => $pid, name => $name};
        }
        for my $id (sort keys %tree) {
            my @parent_list;
            my $temp_id = $id;
            while (exists $tree{$temp_id}) {
                push @parent_list, $tree{$temp_id}{name};
                $temp_id = $tree{$temp_id}{pid};
            }
            say join ",", @parent_list;
        }

        __DATA__
        15,10,name3
        10,#,name1
        12,10,name2
        5,12,name4
        8,5,name5
    Output:
        name1
        name2,name1
        name3,name1
        name4,name2,name1
        name5,name4,name2,name1
      Amazing: Your output is sorted by id - alphabetically - and still gives the same ordering as sorting by names :)
        Yeah, right, lucky that the data happened to be like that. I really did not try very hard to sort, as the OP did not specify anything about that, I just included a sort on the keys to tidy up the output, and it turned out to be exactly the output sample provided in the OP.
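
    To make the coincidence discussed above concrete, this small sketch contrasts Perl's default string sort with a numeric sort on the sample ids. The variable names are made up for illustration.

```perl
use strict;
use warnings;

my @ids = (15, 10, 12, 5, 8);    # the ids from the sample data

my @as_strings = sort @ids;                  # default: string comparison
my @as_numbers = sort { $a <=> $b } @ids;    # explicit numeric comparison

print "string order:  @as_strings\n";    # 10 12 15 5 8
print "numeric order: @as_numbers\n";    # 5 8 10 12 15
```

    In string order the ids map to name1 through name5 in sequence, which is why the output above happens to match a sort by name. With a numeric sort, name4 and name5 would come first.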
Re: Reorganizing the content of a file
by 1nickt (Canon) on Dec 18, 2018 at 12:21 UTC

    Hi, I can't be sure from your sample data how the mappings are supposed to work. There's no 'name5' in the input. Please show some code you've tried in an SSCCE.

    Also consider why you need to store the full paths if each node can only have one parent: don't you only need to store the direct parent for each?

    Also, consider looking on the CPAN for modules that handle this kind of data path mapping.

    Hope this helps!


    The way forward always starts with a minimal test.
Re: Reorganizing the content of a file
by NetWallah (Canon) on Dec 18, 2018 at 19:37 UTC
    Offering an object-oriented version (Taking some liberties offered by perl over "strict OO"):
        #!/usr/bin/env perl
        use strict;
        use warnings;

        {package Item;
            my %By_id;    # Collection
            sub new{
                my ($class,%attr) = @_;
                die "ID is required" unless $attr{id};
                return $By_id{$attr{id}}=bless \%attr, $class;
            }
            sub GetParentObj{
                return undef if $_[0]->{parent} eq "#";    # Root has no parent
                return $By_id{ $_[0]->{parent} };
            }
            sub GetRoot{    # CLASS method
                # The root is the item whose parent field is "#";
                # %By_id is keyed by id, so it cannot be looked up directly.
                my ($root) = grep {$_->{parent} eq "#"} values %By_id;
                return $root;
            }
            sub GetChildren{
                my ($self) = @_;
                # String comparison, because the parent field may be "#"
                return grep {$_->{parent} eq $self->{id}} values %By_id;
            }
            sub GetAncestors{
                my ($self) = @_;
                my @ancestors;
                my $parent = $self;
                while ($parent = $parent->GetParentObj()){
                    push @ancestors, $parent;
                }
                return @ancestors;
            }
            sub GetAll{    # CLASS method
                my ($sort_att) = @_;
                $sort_att ||= "name";    # Defaults to name-based
                return sort {$a->{$sort_att} cmp $b->{$sort_att} } values %By_id;
            }
        }

        # -- Main Code -----
        while (<DATA>) {
            chomp;
            my ($id, $pid, $name) = split /,/;
            Item::->new(id=>$id, parent=>$pid, name=>$name);
        }

        for my $item (Item::GetAll("name")){
            print $item->{name},
                  map({"," . $_->{name}} $item->GetAncestors()),
                  "\n";
        }

        __DATA__
        15,10,name3
        10,#,name1
        12,10,name2
        5,12,name4
        8,5,name5

                    As a computer, I find your faith in technology amusing.

Re: Reorganizing the content of a file
by tybalt89 (Monsignor) on Dec 18, 2018 at 17:56 UTC

    Sigh. No one has done it with regexes yet, and that makes me sad. :)

        #!/usr/bin/perl

        # https://perlmonks.org/?node_id=1227383

        use strict;
        use warnings;

        local $_ = join '', sort { $a =~ s/.*,//r cmp $b =~ s///r } <DATA>;

        for my $id ( /^(\d+)/gm )
          {
          print /^$id,.*,(\S+)/m;
          $id = $1, print ',', /^$id,.*,(\S+)/m while /^$id,(\d+)/m;
          print "\n";
          }

        __DATA__
        15,10,name3
        10,#,name1
        12,10,name2
        5,12,name4
        8,5,name5
Re: Reorganizing the content of a file
by LanX (Saint) on Dec 18, 2018 at 12:56 UTC
    so name4 is the parent of name4 ?

        5,12,name4
        8,5,name4

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery FootballPerl is like chess, only without the dice