rjc has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a newbie Perl question about dynamically generated hashes -

Each line of my small CSV file consists of a number of Folder Titles, followed by the Field titles for that line/record:

INPUT: FirstFolderTitle,SecondFolderTitle,ThirdFolderTitle,Field1,Field2,Field3,Field4\n

I want to split the folder titles into hashes of hashes and put the rest of the fields into an array in the 'deepest' hash:

$BIGLIST{$firsttitle[$i][0]}{$secondtitlename[$i][1]}{$thirdtitlename[$i][2]}{$couldbe1or2moretitles[$i][3]} = [ @RESTOFFIELDS ];

Easy enough when I know how many folder fields to split the line into, but how can I do this dynamically when I don't know what the total number of title fields (i.e. the number of nested anonymous hashes) will be?

I'm not familiar enough with anonymous hash structures yet to get the data in and out on the fly like this; can anyone help?

Thanks,
rjc

Re: Creating Hash of hash of hash dynamically
by davido (Cardinal) on Oct 11, 2004 at 15:20 UTC

    You've really got to figure out a way of knowing which columns in this CSV file are 'fields' and which are 'titles'. The following snippet assumes you've already figured that part of the problem out. It assumes that you've dumped the CSV row into @columns, and furthermore, that you know that the number of 'fields' is four.

    With that part of the equation resolved, here is one way to build up a hash when the number of 'titles' (the number of levels of nested hashes) is unknown. My strategy builds the hash in reverse, starting at the deepest level and working upward. This is the same technique often used to build linked lists in Perl.

    use strict;
    use warnings;
    use Data::Dumper;

    my @columns = qw/ title1 title2 title3 title4 field1 field2 field3 field4 /;

    my $struct_ref = [ @columns[4..7] ];  # place the 'fields' into an array ref.
    $#columns = 3;  # truncate the original CSV row array, @columns.

    foreach my $col ( reverse @columns ) {
        $struct_ref = { $col => $struct_ref };
    }

    print Dumper $struct_ref;
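The same reverse build can be wrapped in a small helper so the number of trailing fields is passed in rather than hard-coded. This is only a sketch: the name build_nested and its $n_fields parameter are made up for illustration, and it assumes the row has already been split into a list.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post above): nest one CSV row.
# $n_fields is the number of trailing 'field' columns; every column
# before them is treated as a 'title'.
sub build_nested {
    my ($n_fields, @columns) = @_;
    die "row has fewer than $n_fields columns\n" if @columns < $n_fields;

    # remove the trailing fields from @columns and keep them in an array ref
    my $struct_ref = [ splice @columns, @columns - $n_fields ];

    # wrap the array ref in one hash per title, innermost title first
    $struct_ref = { $_ => $struct_ref } for reverse @columns;

    return $struct_ref;
}

my $row = build_nested( 2, qw/ a b x y / );
print join( ',', @{ $row->{a}{b} } ), "\n";    # prints "x,y"
```

Note that each call returns a fresh structure for a single row; merging many rows into one tree needs a walk from the top instead.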

    HTH.


    Dave

Re: Creating Hash of hash of hash dynamically
by kvale (Monsignor) on Oct 11, 2004 at 15:20 UTC
    The first step in the process is to parse the CSV. This is best done with a module such as Text::xSV.

    Then, you need to figure out how to differentiate titles from other fields. It isn't apparent from your post, so I cannot help you there.

    Finally, populate the hash. A most simple-minded solution would be to create a case structure:

    my @titles = ...;
    if (@titles == 1) {
        $BIGLIST{$titles[0]} = [ @RESTOFFIELDS ];
    }
    elsif (@titles == 2) {
        $BIGLIST{$titles[0]}{$titles[1]} = [ @RESTOFFIELDS ];
    }
    ...etc.

    This works if you have some small upper bound on the number of titles. If not, you will need a loop that walks the @titles array (untested):

    my @titles = ...;
    my $ref = \%BIGLIST;
    while (@titles > 1) {
        my $title = shift @titles;
        $ref = $ref->{$title} ||= {};   # descend, creating the next level if needed
    }
    $ref->{$titles[0]} = [ @RESTOFFIELDS ];

    -Mark

Re: Creating Hash of hash of hash dynamically
by Roy Johnson (Monsignor) on Oct 11, 2004 at 15:24 UTC
Re: Creating Hash of hash of hash dynamically
by Joost (Canon) on Oct 11, 2004 at 15:23 UTC
    I wonder how you tell the folders and fields apart, why you would want a CSV file with a variable number of values per line, and what you are going to do with the %BIGLIST when you've created it. I think your data structure might be overly complex.

    But let's say that you REALLY want a tree structure for this kind of thing:

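    # Call as: $tree = add_line($tree, @line);
    sub add_line {
        my ($tree, @line) = @_;
        my $first_element = shift @line;
        if (is_a_folder_title($first_element)) {   # TODO: implement is_a_folder_title()
            $tree ||= {};   # create empty hashref if none exists, yet
            $tree->{$first_element} = add_line($tree->{$first_element}, @line);
            return $tree;
        }
        # Return the leaf so the caller stores it; assigning it to the
        # local $tree here would be lost when the sub returns.
        return [ $first_element, @line ];
    }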

    To get the values out again:

    sub get_elements {
        my ($tree, @folder_names) = @_;
        return $tree unless @folder_names;
        my $subtree = $tree->{ shift(@folder_names) };
        return get_elements($subtree, @folder_names);
    }

    If you have a separate list of fields and folders, it's a lot easier to use a "simple" hash of arrays instead:

    # store
    $directories{ join("/", @folder_names) } = \@fields;

    # retrieve
    my @fields = @{ $directories{ join("/", @folder_names) } };
Re: Creating Hash of hash of hash dynamically
by BrowserUk (Patriarch) on Oct 11, 2004 at 15:13 UTC

    The first question to answer is how do you know the difference between a title and a field?


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
Re: Creating Hash of hash of hash dynamically
by NetWallah (Canon) on Oct 11, 2004 at 15:24 UTC
    Due to the depth of your structure, I recommend putting it into a Tree::Simple.

        Earth first! (We'll rob the other planets later)

Re: Creating Hash of hash of hash dynamically
by rjc (Initiate) on Oct 11, 2004 at 16:31 UTC
    Thanks everyone, I know where I'm going now! -

    I was groping towards davido's approach - starting with deepest hash first - but hit my learning ceiling (I hail from land of JavaScript) so thanks for the lesson, Dave.

    Sorry I didn't give more background info:
    the csv file (maybe 300 or so lines/records) is posted by the client, who gets to decide himself:
    - how many levels of folder depth he wants on his website this month
    - how many fields display for each record

    So the first line of the parsed csv file gives me the FolderTitles list, annotated to tell the script how 'deep' the html site is going to be
    The first few fields on each subsequent line tell me the path where that record will live, and the leftover fields make up the record itself.
    Each line is always the same length.
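A minimal sketch of that layout, walking forward from the top so that many records share one tree. The sample rows are invented, $depth stands in for whatever the annotated header line yields, and a plain split is assumed to be safe (no quoted or embedded commas).

```perl
use strict;
use warnings;

# Assumptions for this sketch: $depth (the number of leading folder
# columns, at least 1) has already been read from the header line,
# and the rows below are made-up sample data.
my $depth = 2;
my @lines = (
    'News,2004,Title A,Body A',
    'News,2003,Title B,Body B',
    'About,Staff,Title C,Body C',
);

my %BIGLIST;
for my $line (@lines) {
    my @cols    = split /,/, $line;
    my @folders = splice @cols, 0, $depth;   # leading path columns
    my $last    = pop @folders;              # deepest folder name

    # walk down from the top, creating intermediate hashes as needed
    my $node = \%BIGLIST;
    $node = $node->{$_} ||= {} for @folders;

    # the leftover columns are the record itself
    $node->{$last} = [ @cols ];
}

print $BIGLIST{News}{2003}[0], "\n";    # prints "Title B"
```

Because each line has the same length and the same depth, a record can never land where a deeper folder is needed, so no collision check is included here.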

    I previously hacked up an undignified version of the site using arrays and it's been working fine (and fast) but hashes would be simpler

    Anyway, thanks again,
    rjc

      I wish I could claim that I came up with the "build it in reverse" technique myself. I saw it first in a great book published by O'Reilly & Associates, "Mastering Algorithms with Perl". The book discusses (among many other topics) building linked lists, and one method is surprisingly similar to the solution I provided to your problem. Your situation was a little different, but being familiar with the Algorithms book provided me with the basic concept. I highly recommend the book.

      You can also learn a great deal about complex data structures and Perl's references from the following PODs:

      • perlreftut - Tutorial on references.
      • perllol - Tutorial on building and manipulating lists of lists, and other similar datastructures using references.
      • perldsc - Perl Datastructure Cookbook... good stuff.
      • perlref - Of course, the definitive source.

      Happy hunting...


      Dave

Re: Creating Hash of hash of hash dynamically
by TheEnigma (Pilgrim) on Oct 11, 2004 at 15:16 UTC
    That's a rather scary looking structure! I think if you told us more exactly what you're trying to accomplish, maybe a simpler structure could be suggested.

    TheEnigma

Re: Creating Hash of hash of hash dynamically
by thospel (Hermit) on Oct 11, 2004 at 19:18 UTC
    I'll ignore the question of whether this is actually a good idea here (what happens if you want to put a value in a place that's also needed for a deeper part of the tree?).

    The core loop to do something like this is actually pretty simple. Let's say you have an array of level names, a target value and hash to start from:

    #! /usr/bin/perl -w
    # Set up some stuff for this example
    use strict;
    use Data::Dumper;

    my %hash;
    my @levels = qw(a b c);
    my $value = 5;

    # The next three lines are the essence:
    my $work = \\%hash;
    $work = \$$work->{$_} for @levels;
    $$work = $value;

    # Show that it worked
    print Dumper(\%hash);

    If the number of levels can be 0, the top level assign can be a scalar one, so you'd need to start with a scalar instead of a hash:

    my $hash;
    my $work = \$hash;
    $work = \$$work->{$_} for @levels;
    $$work = $value;

    The advantage of this forward-walking way of doing things is that you can apply it several times to put multiple level/value sets in there (as long as you take care that a value doesn't block a subtree).
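As a rough illustration of applying the walk repeatedly, the sketch below wraps the three essential lines in a sub and adds one possible way to notice a value blocking a subtree. The name set_path and the choice to die on a collision are assumptions made for this example, not part of the technique itself.

```perl
use strict;
use warnings;

# Hypothetical helper: store $value under the path @levels in %$tree,
# refusing to descend through a leaf value or to overwrite a subtree.
sub set_path {
    my ($tree, $value, @levels) = @_;
    my $work = \$tree;
    for my $level (@levels) {
        die "a value already sits on the path at '$level'\n"
            if defined($$work) && ref($$work) ne 'HASH';
        $work = \$$work->{$level};    # autovivifies missing levels
    }
    die "a subtree already exists at the target\n"
        if ref($$work) eq 'HASH';
    $$work = $value;
    return;
}

my %hash;
set_path( \%hash, 5, qw(a b c) );
set_path( \%hash, 6, qw(a b d) );

# 'c' already holds the value 5, so it cannot also hold a subtree
my $ok = eval { set_path( \%hash, 7, qw(a b c e) ); 1 };
print $ok ? "stored\n" : "blocked: $@";
```

Dying is only one policy; a caller could equally well skip the record or log it and continue.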

    However, if there is some character that cannot appear in the titles, it's probably easier to just join all titles using that character as separator and use that as hash key. You can use split if you need to split up such a key into titles again. This leads to a much easier to use and understand datastructure (if there is no such character, you can still get this idea working with some form of escaping or counted packs, but it gets a bit trickier and maybe not worth it anymore), e.g.:

    $hash{join(",", @levels)} = $value;
    In your case the input already has the comma separated titles as a substring, which you could even extract directly after reading a line.
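A small sketch of the joined-key idea, assuming the comma never appears inside a title. The sample titles and records are invented for illustration.

```perl
use strict;
use warnings;

# Assumption: "," never appears inside a title, so it is a safe separator.
my $SEP = ',';

my %flat;
$flat{ join( $SEP, qw(News 2004) ) }   = [ 'Title A', 'Body A' ];
$flat{ join( $SEP, qw(News 2003) ) }   = [ 'Title B', 'Body B' ];
$flat{ join( $SEP, qw(About Staff) ) } = [ 'Title C', 'Body C' ];

# look up one record by its full path
my @record = @{ $flat{ join( $SEP, qw(News 2003) ) } };

# recover the titles from a key; \Q...\E quotes the separator for the regex
my @titles = split /\Q$SEP\E/, 'About,Staff';

# list everything under one top-level folder by key prefix
my @news = sort { $a cmp $b }
           grep { index( $_, "News$SEP" ) == 0 } keys %flat;

print "$record[0]\n";    # prints "Title B"
print "@titles\n";       # prints "About Staff"
```

The trade-off is that listing the contents of one folder means scanning all keys, which is unlikely to matter for a few hundred records.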