This is a good start, but you are right that it does not extend easily beyond the first couple of layers of structure in your data.

The data is logically a hierarchy, but it is not stored in a proper hierarchic structure. If we view each stretch that starts with "Main" and ends with "---" as a "block", the first, second and third blocks form a nesting relation, but then the fourth block goes back up a level, to be a sibling of the second block.

So you need to build a structure as you read in the data, but to do this, you need to be able to jump around within the structure that you are building, based on the key strings provided in each block.
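To make the "jumping around" idea concrete, here is a minimal sketch (not the full solution) of holding a reference to an inner hash and using it later to attach children at that position. The %tree and %parent_of names and the sample keys are just illustrative assumptions.

```perl
use strict;
use warnings;

# Hypothetical miniature of the tree; names are for illustration only.
my %tree = ( Countries => { 'North America' => undef, 'Europe' => undef } );

# Remember the hash that *contains* the slot we will want to fill later,
# indexed by the abbreviation that a later block will use as its Name.
my %parent_of = ( NA => $tree{Countries} );

# Later: jump straight to that position and hang a child under it.
my $ref = $parent_of{NA};
$ref->{'North America'}{'United States'} = undef;   # autovivifies the inner hash

print join( ', ', sort keys %{ $tree{Countries}{'North America'} } ), "\n";
```

Because $ref and $tree{Countries} point at the same hash, the change made through $ref shows up in %tree without walking down from the top.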

The following approach reads the data one block at a time (as suggested in my other reply), and uses a hash ref to jump around within your main "linked_dsc" hash as each block is read in. Two other hashes (link_parent and link_text) are used to navigate over the main structure, and keep track of the key relations between abbreviations and full strings.
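As a rough illustration of the block-at-a-time reading on its own, the sketch below sets the input record separator so that each read returns one whole block. It reads from an in-memory string instead of your file, and the $sample data and @names array are assumptions made up for the example.

```perl
use strict;
use warnings;

# Hypothetical two-block sample, shaped like the data in this thread.
my $sample = "Main\nName = Countries\nEnd\n---\nMain\nName = NA\nEnd\n";

# Open a read handle on the string so the sketch needs no external file.
open my $fh, '<', \$sample or die "Can't open in-memory handle: $!";

my @names;
{
    local $/ = "---\n";    # one read now returns everything up to a "---" line
    while ( my $block = <$fh> ) {
        my ($name) = $block =~ /Name = (.*)/;   # "." stops at the newline
        push @names, $name;
    }
}
close $fh;

print "@names\n";
```

Using "local" confines the changed separator to that one block of code, which may matter if the rest of your script also reads line-oriented input.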

One slight complication in your data is that the "Name" value in the first block is used as a printable label, whereas the "Name" values of the other blocks are just linkage keys, which you don't want to print. So the handling of the first block is a special case, and in the other blocks, we have to replace the linkage key (the abbreviation) with its corresponding full string, after we've used the key to find the right position in the hash structure.

(I also made a couple of minor edits to the sample data in the OP, so I have included the edited data below as __DATA__.)

#! /usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

my ( %linked_dsc, $link_ref );
my ( %link_parent, %link_text );

# open (FILE, "./Data.txt") or die "Can't open input: $!";

$/ = "---\n";  # each iteration will read up to the next "---" line

while (<DATA>) {
    my ( $name ) = ( /Name = (.*)/ );
    my @actions  = ( /Action = ?(.*)/g );  # value may be empty
    my @texts    = ( /Text = (.*)/g );
    if ( $name eq 'Countries' ) {
        $linked_dsc{$name} = undef;
        $link_ref = \%linked_dsc;          # ref = top of structure
    }
    else {
        $link_ref = $link_parent{$name};   # ref = inner layer
        $name = $link_text{$name};         # swap abbreviation for full string
    }
    for my $i ( 0..$#actions ) {
        $link_ref->{$name}{$texts[$i]} = undef;
        if ( $actions[$i] =~ /Find: (.*)/ ) {
            $link_text{$1}   = $texts[$i];          # keep track of where
            $link_parent{$1} = $link_ref->{$name};  # we are
        }
    }
}
print Dumper ( \%linked_dsc );

__DATA__
Main
Name = Countries
End
Sub
Action = Find: NA
Text = North America
End
Sub
Action = Find: EU
Text = Europe
End
---
Main
Name = NA
End
Sub
Action = Find: US
Text = United States
End
Sub
Action = Find: CA
Text = Canada
End
Sub
Action = Find: MX
Text = Mexico
End
---
Main
Name = US
End
Sub
Action =
Text = Boston
End
Sub
Action =
Text = Atlanta
End
---
Main
Name = EU
End
Sub
Action =
Text = France
End
Sub
Action =
Text = Italy
End

When I run that, I get this output, which I think is pretty close to what you want (ignoring the order of hash keys, which is random):
$VAR1 = {
          'Countries' => {
                           'Europe' => {
                                         'France' => undef,
                                         'Italy' => undef
                                       },
                           'North America' => {
                                                'Canada' => undef,
                                                'United States' => {
                                                                     'Atlanta' => undef,
                                                                     'Boston' => undef
                                                                   },
                                                'Mexico' => undef
                                              }
                         }
        };
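If the random key order makes the dump hard to compare between runs, one option is to have Data::Dumper sort the keys. This is only a sketch: it uses a small hypothetical fragment of the tree rather than the full structure, and the $dump variable exists just for the example.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical fragment of the tree, just to show the effect.
my %linked_dsc = (
    Countries => { Europe => { Italy => undef, France => undef } },
);

$Data::Dumper::Sortkeys = 1;   # emit hash keys in sorted order
$Data::Dumper::Indent   = 1;   # fixed, more compact indentation

my $dump = Dumper( \%linked_dsc );
print $dump;
```

With Sortkeys set, 'France' is always printed before 'Italy', so two runs over the same data give identical output.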

In reply to Re^3: Building hash tree from a data file by graff
in thread Building hash tree from a data file by Anonymous Monk
