Comment has asked for the wisdom of the Perl Monks concerning the following question:

Thank you all in advance for your help!

It seems that I have set up an overly complicated multidimensional associative array, and I was wondering if I could get some suggestions on how to simplify my data structure. If it turns out that I cannot simplify it, then I was hoping to get a few tips on how to work with this data. Here is a bit of background on the situation:

BACKGROUND- I have multiple data sets each containing points in 3D space. I have gone ahead and calculated all of the possible distances between any two points between and within each data set. For example (using set notation), let's say I have Set1 = {a,b,c}, Set2 = {d,e,f,g}, and Set3 = {h,i}. Unfortunately, in real life the points do not have identifiers that are unique across sets: the identifier for "a", "d", and "h" would all be "1", and the identifier for "b", "e", and "i" would all be "2". I believe I have set up a hash of hashes of hashes of hashes to hold my data. Therefore, the distance between "a" and "d" would be held in $matrix{set1}{a}{set2}{d}, the distance between "h" and "i" would be held in $matrix{set3}{h}{set3}{i}, and so on. I should also note that distances are not stored redundantly: I have stored the distance from "a" to "b" but not from "b" to "a". I have also not calculated the distance between a point and itself (i.e. from "a" to "a" or "b" to "b"). This was done to save memory and calculation time.
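For readers who want something concrete to experiment with, here is a minimal sketch of how a structure like the one described might be built. The sample coordinates, the %points layout, and the distance() helper are assumptions made up for illustration; the original calculation code was not posted.

```perl
use strict;
use warnings;

# Hypothetical sample data: set name => { point id => [x, y, z] }.
# Point ids repeat across sets, as described in the question.
my %points = (
    set1 => { 1 => [0, 0, 0], 2 => [3, 4, 0] },
    set2 => { 1 => [0, 0, 5], 2 => [1, 1, 1] },
);

# Hypothetical helper: Euclidean distance between two [x, y, z] array refs.
sub distance {
    my ($p, $q) = @_;
    return sqrt( ($p->[0] - $q->[0])**2
               + ($p->[1] - $q->[1])**2
               + ($p->[2] - $q->[2])**2 );
}

# Flatten to (set, point) pairs in a fixed order so that each
# unordered pair of points is visited exactly once.
my @all;
for my $set (sort keys %points) {
    push @all, [$set, $_] for sort { $a <=> $b } keys %{ $points{$set} };
}

my %matrix;
for my $i (0 .. $#all) {
    for my $j ($i + 1 .. $#all) {    # $j > $i: no self-distances, no reverse entries
        my ($s1, $p1) = @{ $all[$i] };
        my ($s2, $p2) = @{ $all[$j] };
        $matrix{$s1}{$p1}{$s2}{$p2}
            = distance($points{$s1}{$p1}, $points{$s2}{$p2});
    }
}

print "set1/1 -> set1/2: $matrix{set1}{1}{set1}{2}\n";
print "set1/1 -> set2/1: $matrix{set1}{1}{set2}{1}\n";
```

Because the pairs are generated in one fixed order, a distance between two different sets always lands under the set that sorts first, so $matrix{set2}{...}{set1} never gets populated in this sketch.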

GOAL- I would like to print out an Excel file that contains all of the distances between the points of any two sets. Therefore, I would like to have a file containing Set1 vs. Set2. Using set notation, Set1vSet2.xls would contain {AD,AE,AF,AG,BD,BE,BF,BG,CD,CE,CF,CG}, where AD represents the distance between point "a" and point "d". Ultimately, I would like to have every combination of sets: Set1vSet1.xls, Set1vSet2.xls, Set1vSet3.xls, Set2vSet2.xls, Set2vSet3.xls, and Set3vSet3.xls.

WHERE I AM NOW- Here is some pseudocode on how I think this could work:

    foreach firstSet {
        foreach firstPoint {
            foreach secondSet {
                foreach secondPoint {
                    print distance to firstSetVsecondSet.xls
                }
            }
        }
    }

I think this code would loop through the entire matrix and print each distance to the correct file. Unfortunately, I'm not sure how to implement this with my current data structure.

I hope I have made this as clear as possible. Please let me know if I can explain anything further. Many many thanks for your help!

Re: Hash of Hashes of Hashes of Hashes
by BrowserUk (Patriarch) on Sep 30, 2010 at 20:28 UTC
    I have gone ahead and calculated all of the possible distances between any two points between and within each data set. ...

    For example (using set notation), ...

    If you:

    1. posted your code for performing your calculations;

      Doing this shows your effort, and gives you the opportunity to do step 2.

      It may also be the case that it would be simpler and more economical to produce the required output as you generate it, rather than building your complicated data structure, and then having to iterate it a second time to produce the required output format.

    2. used Data::Dumper to display: some (very) simple inputs; and the resulting outputs;

      Displaying your inputs and outputs in this form will likely allow more monks to understand your requirements than your English description, couched in set notation.

    3. manually constructed CSV data showing how you would like the results above to be written for later import by Excel;

      Trying to write Excel format files directly just complicates matters, and excludes anyone who doesn't use Excel from testing any solution they might propose.

      Excel is perfectly happy to import CSV. CSV is very easy to write.

    As is, your question would require any interested monk to reproduce your combinatorial distance-calculating code before even attempting to answer your question.
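As a rough illustration of steps 2 and 3, the sketch below dumps a tiny hand-made structure with Data::Dumper and then writes the same data as CSV lines. The distances and the column names are invented for the example, not taken from the poster's data.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order makes dumps easy to compare
$Data::Dumper::Indent   = 1;    # more compact indentation

# Hypothetical, hand-made distances in the layout described in the question.
my %matrix = (
    set1 => {
        1 => { set1 => { 2 => 5.0 }, set2 => { 1 => 5.0, 2 => 1.7 } },
        2 => { set2 => { 1 => 7.1, 2 => 3.7 } },
    },
);

# Step 2: show the structure exactly as Perl sees it.
print Dumper(\%matrix);

# Step 3: one CSV row per stored distance (column names are assumptions).
my @rows = ("set_a,point_a,set_b,point_b,distance");
for my $s1 (sort keys %matrix) {
    for my $p1 (sort keys %{ $matrix{$s1} }) {
        for my $s2 (sort keys %{ $matrix{$s1}{$p1} }) {
            for my $p2 (sort keys %{ $matrix{$s1}{$p1}{$s2} }) {
                push @rows,
                    join ',', $s1, $p1, $s2, $p2, $matrix{$s1}{$p1}{$s2}{$p2};
            }
        }
    }
}
print "$_\n" for @rows;
```

A plain join is only safe while no field contains a comma, quote, or newline; for anything messier, a CSV module such as Text::CSV is the safer route.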


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Hash of Hashes of Hashes of Hashes
by muba (Priest) on Oct 01, 2010 at 01:18 UTC

    Something like this?

    Note: I don't know much about Excel, or its files, let alone how to programmatically create them. But there are modules out there for that. Check them out.

    # Iterate over all sets
    for my $set1_k (keys %matrix) {
        # Iterate over points in this set
        for my $point1_k (keys %{ $matrix{$set1_k} }) {
            # Iterate over sets this point's entry refers to
            for my $set2_k (keys %{ $matrix{$set1_k}->{$point1_k} }) {
                my $filename = join("", $set1_k, "v", $set2_k, ".xls");
                # TODO: Create the Excel file (only once: the same
                # filename recurs for every $point1_k in this set).

                # Iterate over points in that set
                for my $point2_k (keys %{ $matrix{$set1_k}->{$point1_k}->{$set2_k} }) {
                    # TODO: Populate the Excel file
                    # The data that you'd want to write to the Excel
                    # file is accessible under
                    #   $matrix{$set1_k}->{$point1_k}->{$set2_k}->{$point2_k}
                    # in this loop.
                }
                # TODO: if necessary (check module docs): close Excel file
            }
        }
    }
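One way to fill in the TODOs, offered as a sketch rather than a definitive solution: collect the rows per output file first, then write each file exactly once. It writes CSV rather than real Excel files, and the sample distances, the column names, and the use of a temporary directory are all assumptions for the sake of a runnable example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical distances in the layout from the question.
my %matrix = (
    set1 => {
        1 => { set1 => { 2 => 5 }, set2 => { 1 => 5, 2 => 2 } },
        2 => { set2 => { 1 => 7, 2 => 4 } },
    },
    set2 => {
        1 => { set2 => { 2 => 3 } },
    },
);

# Pass 1: group rows by output file, because the same set pair is
# reached once for every point in the first set.
my %rows_for;
for my $set1 (sort keys %matrix) {
    for my $point1 (sort keys %{ $matrix{$set1} }) {
        for my $set2 (sort keys %{ $matrix{$set1}{$point1} }) {
            my $filename = "${set1}v${set2}.csv";
            for my $point2 (sort keys %{ $matrix{$set1}{$point1}{$set2} }) {
                push @{ $rows_for{$filename} },
                    join ',', $point1, $point2,
                              $matrix{$set1}{$point1}{$set2}{$point2};
            }
        }
    }
}

# Pass 2: write each file once (into a throwaway directory here).
my $dir = tempdir(CLEANUP => 1);
for my $filename (sort keys %rows_for) {
    open my $fh, '>', "$dir/$filename" or die "Cannot write $filename: $!";
    print {$fh} "point_a,point_b,distance\n";
    print {$fh} "$_\n" for @{ $rows_for{$filename} };
    close $fh or die "Cannot close $filename: $!";
}

print "$_: ", scalar @{ $rows_for{$_} }, " rows\n" for sort keys %rows_for;
```

Grouping first avoids reopening, and possibly truncating, a file every time the same pair of sets comes around again. The same two-pass shape should carry over to a spreadsheet module by swapping out the second loop.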

      Thank you very much muba! That was exactly what I was looking for. I was mainly concerned with the concept of the hash of hashes and you certainly covered that. My plan is to use Spreadsheet::Write for creating and writing to the Excel sheet and I can follow the online documentation for that. I really appreciate your help.