Building hash tree from a data file

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I can't figure out how to build a hash tree from the data file shown below. Any help or tips on how to approach this task would be appreciated. Thanks.

Main
  Name = Countries
End

Sub
  Action = Find: NA
  Text = North America
End

Sub
  Action = Find: EU
  Text = Europe
End
---
Main
  Name = NA
End

Sub
  Action = Find: US
  Text = United States
End

Sub
  Action = Find: CA
  Text = Canada
End

Sub
  Action = Find: MX
  Text = Mexico
End
---
Main
  Name = US
End

Sub
  Action = 
  Text = Boston
End

Sub
  Action = 
  Text = Atlanta
End
---
Main
  Name = EU
End

Sub
  Action = 
  Text = France
End

etc...

I have to read 'Main' and enter the value of 'Name' into a hash tree (like $ref_list{Countries} = undef), then continue to each 'Sub'. If the value of 'Action' contains 'Find:', I add the value of 'Text' to the hash tree (like $ref_list{Countries}{North America} = undef), look for the 'Main' whose 'Name' is 'NA', and continue building the tree from there. The final hash tree would look like this:

%ref_list = (
    'Countries' => {
        'North America' => {
            'United States' => {
                'Boston'  => undef,
                'Atlanta' => undef,
            },
            'Canada' => undef,
            'Mexico' => undef,
        },
        'Europe' => {
            'France' => undef,
        },
    },
);


Replies are listed 'Best First'.
Re: Building hash tree from a data file by GrandFather (Saint) on Jul 09, 2006 at 23:08 UTC
Where's the code you have tried? We can't help you with your coding problems without seeing where those problems are.

Given that you know what the final structure should look like, we presume that you have a fair understanding of what you want to achieve and how Perl works, so it is reasonable to assume that you have tried something and run into a wall or some such. Unless of course <shock><horror>this is homework</horror></shock>, in which case we really would like to see a little effort shown in any case.

DWIM is Perl's answer to Gödel
Re^2: Building hash tree from a data file by Anonymous Monk on Jul 11, 2006 at 04:09 UTC
Below is the code I have tried. I am sure it is not great, but I have given it a try. I am able to go two levels deep, but I am confused about how to make this more generic. The two issues I have are:

1) The way I build the data structure doesn't seem right. It produces the result, but it is not generic: `$linked_dsc{ $main_var }{ $contitent_hash->{ $link_word } }{ $ret_dsc->{ $key1 } } = undef;`

2) I have to call the subroutine again for each element in the array @new_actions, and I don't know how to proceed with that.

Please provide some hints or tips to move further.

Read more... (5 kB)
Re^3: Building hash tree from a data file by graff (Chancellor) on Jul 12, 2006 at 05:11 UTC
This is a good start, but you are right about the fact that it does not extend very easily past the first couple layers of structure in your data.

The data is logically a hierarchy, but it is not stored in a proper hierarchic structure. If we view each stretch that starts with "Main" and ends with "---" as a "block", the first, second and third blocks form a nesting relation, but then the fourth block goes back up a level, to be a sibling of the second block.

So you need to build a structure as you read in the data, but to do this, you need to be able to jump around within the structure that you are building, based on the key strings provided in each block.

The following approach reads the data one block at a time (as suggested in my other reply), and uses a hash ref to jump around within your main "linked_dsc" hash as each block is read in. Two other hashes (link_parent and link_text) are used to navigate over the main structure, and keep track of the key relations between abbreviations and full strings.

One slight complication in your data is that the "Name" value in the first block is used as a printable label, whereas the "Name" values of the other blocks are just linkage keys, which you don't want to print. So the handling of the first block is a special case, and in the other blocks, we have to replace the linkage key (the abbreviation) with its corresponding full string, after we've used the key to find the right position in the hash structure.

(I also made a couple minor edits to the sample data in the OP, so I have included that below as __DATA__.)

Read more... (2 kB)

When I run that, I get this output, which I think is pretty close to what you want (ignoring the order of hash keys, which is random): `$VAR1 = { 'Countries' => { 'Europe' => { 'France' => undef, 'Italy' => undef }, 'North America' => { 'Canada' => undef, 'United States' => { 'Atlanta' => undef, 'Boston' => undef }, 'Mexico' => undef } } };`
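The code behind "Read more" is not reproduced in this thread. As a separate illustration of the "jump around within the structure" idea, here is a minimal sketch that is not the code from the post above: instead of the link_parent and link_text hashes, it keeps a single hypothetical lookup, `%slot_for`, that maps each linkage key to a reference to the hash slot where that key's children belong. The `@blocks` layout and the name `build_tree` are assumptions made for the example, and the sketch assumes a parent block always appears in the data before its children.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical pre-parsed blocks: the "Name" of each Main, plus one
# [ linkage key (from "Find:", empty if none), text ] pair per Sub.
my @blocks = (
    { name => 'Countries', subs => [ [ 'NA', 'North America' ], [ 'EU', 'Europe' ] ] },
    { name => 'NA', subs => [ [ 'US', 'United States' ], [ 'CA', 'Canada' ], [ 'MX', 'Mexico' ] ] },
    { name => 'US', subs => [ [ '', 'Boston' ], [ '', 'Atlanta' ] ] },
    { name => 'EU', subs => [ [ '', 'France' ] ] },
);

sub build_tree {
    my @blocks = @_;
    my %tree;
    my %slot_for;    # linkage key => reference to the slot holding its children

    for my $i ( 0 .. $#blocks ) {
        my $block = $blocks[$i];
        my $slot;
        if ( $i == 0 ) {
            # Special case: the first block's Name is a printable label.
            $slot = \$tree{ $block->{name} };
        }
        else {
            # Other Names are linkage keys; skip blocks nobody pointed at.
            $slot = $slot_for{ $block->{name} } or next;
        }
        for my $sub ( @{ $block->{subs} } ) {
            my ( $key, $text ) = @$sub;
            $$slot->{$text} = undef;    # autovivifies the child hash on first use
            # Remember where a later block named $key must hang its entries.
            $slot_for{$key} = \$$slot->{$text} if defined $key && length $key;
        }
    }
    return \%tree;
}

$Data::Dumper::Sortkeys = 1;
print Dumper( build_tree(@blocks) );
```

Because each stored reference points at the slot itself rather than at a copy of its value, a leaf such as 'Canada' stays undef unless a later block named 'CA' fills it in, and the fourth block can attach to 'Europe' without any bookkeeping about nesting depth.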
Re: Building hash tree from a data file by graff (Chancellor) on Jul 10, 2006 at 03:49 UTC
Looks like a good case for playing with the input record separator variable ($/). Set that to "---\n", and every time you read with `<>` from that file, you'll get a multi-line string, containing the "Main" and all "Sub" blocks.

Then, on each whole record, you can look for the pieces you want via regex matches, e.g.: `my @name = ( /Name\s*=\s*(.*)/ ); my @find = ( /Find:\s+(.*)/g ); my @text = ( /Text\s*=\s*(.*)/g );` (update: added "g" modifier to get all matches)

In each case, the strings captured by the parens in the regex will be assigned in order of occurrence to the array. (Note that "." in the regex will not match "\n", so each of the "(.*)" captures will extend only to the end of the matched line.)

Then all you need to do is work out how to manage the loading of the hash; it looks like each successive record (delimited by "---") will be a successive layer in the hash. Try some things out, and if you have trouble, post what you have tried.
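As a rough sketch of the record-separator idea, the following reads "---"-delimited records from an in-memory filehandle and pulls out the Name, Find and Text values. The sub name `parse_records`, the returned hash layout, and the shortened sample data are assumptions made for the example. The patterns follow the `Name = value` form shown in the original post, and each Sub is matched separately so that a Sub with an empty Action still pairs with its own Text.

```perl
use strict;
use warnings;

# Shortened sample in the format from the original post.
my $data = <<'END_DATA';
Main
  Name = NA
End

Sub
  Action = Find: US
  Text = United States
End

Sub
  Action = Find: CA
  Text = Canada
End
---
Main
  Name = US
End

Sub
  Action = 
  Text = Boston
End
END_DATA

sub parse_records {
    my ($input) = @_;
    open my $fh, '<', \$input or die "Cannot open in-memory file: $!";
    local $/ = "---\n";    # one read now returns a whole Main-plus-Subs record
    my @records;
    while ( my $record = <$fh> ) {
        chomp $record;     # chomp strips the current $/, i.e. the "---\n"
        my ($name) = $record =~ /^[ \t]*Name[ \t]*=[ \t]*(.*?)[ \t]*$/m;
        next unless defined $name;
        my @subs;
        # Walk the Sub ... End stretches one at a time.
        while ( $record =~ /^Sub\n(.*?)^End$/msg ) {
            my $body = $1;
            my ($find)  = $body =~ /Find:[ \t]*(\S+)/;    # undef when Action is empty
            my ($label) = $body =~ /^[ \t]*Text[ \t]*=[ \t]*(.*?)[ \t]*$/m;
            push @subs, { find => $find, text => $label };
        }
        push @records, { name => $name, subs => \@subs };
    }
    return @records;
}

for my $rec ( parse_records($data) ) {
    print "$rec->{name}:\n";
    for my $sub ( @{ $rec->{subs} } ) {
        my $find = defined $sub->{find} ? $sub->{find} : '(none)';
        print "  $sub->{text} [Find: $find]\n";
    }
}
```

Collecting all Find values and all Text values into two flat lists would put them out of step whenever a Sub has no "Find:", which is why this sketch matches per Sub instead.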