Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I had posted the query earlier and was asked to do some ground work. Below is the code I have tried and now looking forward to your suggestions

Here is the script...

#! /usr/bin/perl use strict; use Data::Dumper; my ( %linked_dsc, $main_var, @actions ); my $maincnt = 0; my $found = 0; my ($t1, $t2); my $contitent_hash = { }; open (FILE, "./Data.txt") or die "Can't find file"; while (<FILE>) { # From start of 1st 'Main' to start of 2nd 'Main', read the lines and # store in appropriate variables # $contitent_hash has abbreviation as key and full name as value # @actions has search patterns to look for in the next pass # for example: 'NA' => 'North America' is in the hash variable and # 'Name = NA' is in the array if ( /^Main$/ ) { $maincnt++; } if ( $maincnt == 1 ) { if ( /^Main/../^End/) { if ( /^\s*(.*)\s=\s(.*)/ ) { if ( $1 eq "Name") { $main_var = $2; #$linked_dsc{ $main_var } = undef; } } } elsif ( /^Sub/../^End/) { if ( /^\s*(.*)\s=\s(.*)/ ) { if ( $1 eq "Action") { ($t1, $t2) = split ( /: /, $2 ); push @actions, ('Name = '. $t2 ); $contitent_hash->{ $t2 } = undef; } elsif ( $1 eq "Text") { $contitent_hash->{ $t2 } = $2; } } } } last if ( $maincnt > 1); } # Self checking... print "---------------------\n"; print "@actions\n"; print "---------------------\n"; foreach my $key ( keys %{ $contitent_hash } ) { print "$key = $contitent_hash->{ $key }\n"; #$linked_dsc{ $main_var }{ $contitent_hash->{ $key } } = undef; } print "---------------------\n"; # End of self check foreach ( @actions ) { print "SEARCH STRING: $_\n"; my (undef, $link_word) = split / = /; my ($ret_dsc, $ret_arr ) = process_next( $_ ); foreach my $key1 ( keys %{ $ret_dsc } ) { # Checking... print "$key1 = $ret_dsc->{ $key1 }\n"; $linked_dsc{ $main_var }{ $contitent_hash->{ $link_word } }{ $ +ret_dsc->{ $key1 } } = undef; } #Checking... #print Dumper ( $ret_dsc ); #print Dumper ( $ret_arr ); } print Dumper ( \%linked_dsc ); # Subroutine to search for the searchstring in the file, start reading + # the subsequent lines and store in appropriate variables until the # next 'Main' is encountered. sub process_next { my $searchstr = shift; my $new_href= { }; my ( @new_actions, $left_val, $right_val); seek ( FILE, 0, 0 ); $found = 0; while ( <FILE> ) { if ( ! $found ) { if ( ! /$searchstr/ ) { next; } else { $found = 1; } } if ( $found ) { if ( /^Sub/../^End/) { if ( /^\s*(.*)\s=\s(.*)/ ) { if ( $1 eq "Action") { ($left_val, $right_val ) = split ( /: /, $2 ); if ( defined $right_val ) { $new_href->{ $right_val } = undef; push @new_actions, ('Name = '. $right_val +); }; #else { print "Not defined\n" }; } elsif ( $1 eq "Text") { if (defined $right_val) { $new_href->{ $right_val } = $2; } else { $new_href->{ $2 } = $2 }; } } } last if ( /Main/ ); } } # Checking... # print "new actions : @new_actions\n"; return $new_href, \@new_actions; }

I am sure it is not great, but I have given a try....

I am able to go to 2 levels but confused as to how to make this more generic. Two issues I have are:-

1)The way I build the data structure doesn't seem to be right, although it produces the result (as it is not generic)

$linked_dsc{ $main_var }{ $contitent_hash->{ $link_word } }{ $ret_dsc->{ $key1 } } = undef;

The data file is not limited to what is shown and hence, this approach becomes clumpsy if the hash tree gets deeper

2) I have to call the subroutine again for each element in the array @new_actions. I don't how to proceed with that.

Please help

Here is the data file...

Line nos added for reference.. 1 Main 2 Name = Countries 3 End 4 5 Sub 6 Action = Find: NA 7 Text = North America 8 End 9 10 Sub 11 Action = Find: EU 12 Text = Europe 13 End 14 15 Main 16 Name = NA 17 End 18 19 Sub 20 Action = Find: US 21 Text = United States 22 End 23 24 Sub 25 Action = Find: CA 26 Text = Canada 27 End 28 29 Sub 30 Action = Find: MX 31 Text = Mexico 32 End 33 34 Main 35 Name = US 36 End 37 38 Sub 39 Action = 40 Text = Boston 41 End 42 43 Sub 44 Action = 45 Text = Atlanta 46 End 47 48 Main 49 Name = EU 50 End 51 52 Sub 53 Action = 54 Text = France 55 End 56 57 Sub 58 Action = 59 Text = Italy 60 End

Output looks as follows:-

--------------------- Name = NA Name = EU --------------------- NA = North America EU = Europe --------------------- SEARCH STRING: Name = NA US = United States CA = Canada MX = Mexico SEARCH STRING: Name = EU Italy = Italy France = France $VAR1 = { 'World' => { 'Europe' => { 'France' => undef, 'Italy' => undef }, 'North America' => { 'Mexico' => undef, 'United States' => undef, 'Canada' => undef } } };

Replies are listed 'Best First'.
Re: Building hash tree from data file -contd
by GrandFather (Saint) on Jul 11, 2006 at 19:47 UTC

    I suspect there is a little XY Problem going on here. However the following code builds a lookup hash that stores translations from codes to names and lists of names associated with names:

    use warnings; use strict; use Data::Dump::Streamer; my %lookup; my $main; my $action; my $text; while (<DATA>) { chomp; if (m/^Main\b/ .. m/^End\b/) { # Process Main/End blocks next if ! /Name = (\w+)/; $main = $1; } elsif (m/^Sub\b/ .. m/^End\b/) { # Process sub/end blocks $action = $1, next if /Action = Find: (\w+)/; $text = $1, next if /Text = (.*)/; next if ! /^End/ or ! defined $text; # Here if we have an End and $text is defined $lookup{$action} = $text if defined $action; my $key = $main; $key = $lookup{$key} if $key =~ /^\w\w$/; push @{$lookup{$key}}, $text; $action = undef; $text = undef; } } Dump (\%lookup);

    Prints:

    $HASH1 = { CA => 'Canada', Countries => [ 'North America', 'Europe' ], EU => 'Europe', Europe => [ 'France' ], MX => 'Mexico', NA => 'North America', "North America" => [ 'United States', 'Canada', 'Mexico' ], "United States" => [ 'Boston', 'Atlanta' ], US => 'United States' };

    DWIM is Perl's answer to Gödel

      Thanks for the responses

      I was also trying to use the recursive functions to read and print from the data file.

      The code doesn't have the logic to build the tree but I was trying to get the sequence correctly.

      Looking at your scripts, I think I should scrap this and look to build on what you provided.

      Below is the code:-

      my @arr = ( 'NA', 'EU' ); my $base = 'Countries'; process_full_list( $base, @arr ); sub process_full_list { my ($base1, @arr1 ) = @_; my $elem; print "############################\n"; print "In process full list: $base1 : @arr1\n"; print "############################\n"; foreach $elem ( @arr1 ) { process_next( $base1, $elem); } } sub process_next { my ( $r_base, $r_elem ) = @_; my ( $new_base, $new_elem, @new_arr); print "\tIn process next: $r_base : $r_elem\n"; # The if loop is temporary. It will be replaced by a # sub that will fetch the next array to process by # reading from the file based on the search string. if ( $r_elem eq 'NA' ) { @new_arr = ( 'US', 'CA' ); } elsif ( $r_elem eq 'EU') { @new_arr = ( 'FR', 'IT' ); } elsif ( $r_elem eq 'US') { @new_arr = ( 'NY', 'LA' ); } elsif ( $r_elem eq 'CA') { @new_arr = ( 'TO', 'VC' ); } if( defined ( $r_elem ) && (@new_arr) ) { print "\tnew_base = $r_elem: new_arr=@new_arr\n"; process_full_list( $r_elem, @new_arr ); } }

      The output for the above is

      ########################################### In process full list: Countries : NA EU ########################################### In process next: Countries : NA new_base = NA: new_arr=US CA ########################################### In process full list: NA : US CA ########################################### In process next: NA : US new_base = US: new_arr=NY LA ########################################### In process full list: US : NY LA ########################################### In process next: US : NY In process next: US : LA In process next: NA : CA new_base = CA: new_arr=TO VC ########################################### In process full list: CA : TO VC ########################################### In process next: CA : TO In process next: CA : VC In process next: Countries : EU new_base = EU: new_arr=FR IT ########################################### In process full list: EU : FR IT ########################################### In process next: EU : FR In process next: EU : IT
Re: Building hash tree from data file -contd
by bsdz (Friar) on Jul 11, 2006 at 17:21 UTC
    I would use a recursive function. First parse your data and keep it in a temporary hash table. Then use this recursively to substitute your "Find:" strings. Something like this: -
    use strict; use Data::Dumper; my %H; while (<DATA>) { my $rootkey; my @elements; if (/Main/) { while (<DATA>) { $rootkey = "$1" if /^\s*Name\s*=\s*(.*)$/; unshift(@elements, "$1") if /^\s*(?:Action|Text)\s*=\s*(.*)$/; last if /---/; } } $H{$rootkey} = {@elements}; } my %G = process(\%H, 'Countries'); print Dumper(\%G); sub process { my ($href, $key) = @_; my %G; foreach my $k (keys %{$href->{$key}}) { $G{$key}{$k} = $href->{$key}{$k} =~ /^Find:\s+(.*)$/ ? {process(\%H, $1)} : $href->{$key}{$k}; } return %G; } __DATA__ Main Name = Countries End ... The rest of your data..

      Thanks a lot for your help but I am still having trouble with the output

      The output from your code is

      $VAR1 = { 'Countries' => { 'Europe' => { 'EU' => { 'France' => '' } }, 'North America' => { 'NA' => { 'Mexico' => {}, 'United States' => { 'US' => { 'Atlanta' => '', 'Boston' => '' } }, + 'Canada' => {} } } } };

      The 'find' codes such as 'EU','NA', 'US', etc. are still geting added to the data structure. I tried to modify your code without much success. I am still trying but it would be of great help, if you can help..

Re: Building hash tree from data file -contd
by graff (Chancellor) on Jul 12, 2006 at 06:40 UTC
    (sigh)

    I posted some suggested code in your initial thread (I even tested it and it seemed to work -- it's nested down a bit, here). Now you have two places to look for answers to your one problem. Posting repeated content in different threads is a drag.

    Are you staying anonymous for some particular reason? If you could manage to make up a user-id for yourself, it would be easier for you to keep track of your posts and the replies that you get. You could also use the ChatBox to draw people's attention to posts where you haven't gotten the answer you need yet.

    Then there would be no need for you to keep posting the same code and the same data in multiple threads.

      Sorry about all the confusion

      I had updated the original link but as I didn't get any response, I thought no one will ever look at it. Hence, I created a new post with reference to the original. ...a bit of Panic...

      As suggested, I have created an id for my own.

      Thanks for your comments and code.. It works nicely..

      Update: Using 'Tie', I was able to get the output in the order it was fed into the data structure..

      use Tie::Autotie 'Tie::IxHash'; tie my %linked_dsc, 'Tie::IxHash';

      some progress...

      I want to try few other things by shuffling the blocks in the data file,etc..like shown below and modify the code as required.

      Update 2:Atlast, I managed to achieve (script might be clumpsy though compared to your standards) what I wanted- modified the code to read and build the tree irrespective of the order in which the block appears (except for the first one) and also print the hash in the order it was read inside..

      You guys are great!!! for the help, guidance and what not.. whatelse I can say!!! special thanks to graff (as I stole his code to create the structure and used that as a subroutine)

      Again.. thanks for your time and guidance