joule has asked for the wisdom of the Perl Monks concerning the following question:
Hi all,
I am attempting to create a multidimensional hash whose keys are the elements of a list. Simply put, I am parsing strings from a file and splitting them into key/val pairs on the '=' character. Then, I split the key itself on the ':' character, and would like to assign the result(s) as keys to a multidimensional hash.
Example:
The string/line read in from the file is 'a1:a2:a3=foo' - as a result, 'a1:a2:a3=foo' is assigned to $_
# $key = 'a1:a2:a3', $val = 'foo'
my ($key, $val) = split(/=/);
# find the number of ':' in string
my $num = map(/:/g, $key) + 1;
# create hash keys - split returns 'a1','a2', and 'a3'
# hash key creation NOT WORKING
%hash = split(/:/, $key, $num);
I'd like to create and assign $hash{'a1'}{'a2'}{'a3'} = 'foo'.
In case you're wondering, I count the ':' characters because this code runs in a loop, and the number of key parts may vary from line to line. The last split produces the keys; I just can't figure out how to create the nested hash and assign the value to it. I looked at map(), but have yet to come up with a solution.
Thanks for any help provided.
Re: List Values As Multidimensional Hash Keys
by diotalevi (Canon) on Mar 14, 2004 at 22:39 UTC
use List::Util 'reduce';
use Data::Dumper;
$_ = "a1:a2:a3=foo";
my ($key,$val) = split(/=/);
my @keys = split /:/, $key;
my $last = pop @keys;
my %hash;
( @keys
? reduce( sub { $a->{$b} ||= {} },
\%hash,
@keys )
: \%hash )->{ $last } = $val;
print Dumper( \%hash );
Re: List Values As Multidimensional Hash Keys
by graff (Chancellor) on Mar 14, 2004 at 23:30 UTC
Some useful solutions have been provided, but no matter which approach you choose, you still need to be very confident about the quality of your input data for anything to work as intended. In particular, think what will happen if your input includes any two records like the following:
a1:b2:c3=foo
a1:b2=bar
This would create a logical contradiction: node "b2" shows up as both a leaf node and a parent node (it's supposed to hold both a string and a hash ref). Actually, whichever of these two records happens to come second in the input would obliterate data for the one that came earlier.
Unless you have perfect confidence in the input (that is, you have already tested it for well-formedness), you will want to include sanity checks in your hash-creation logic -- don't assign a scalar value to a hash element if it already exists as a reference, and don't use a hash element as a reference if it already contains a scalar. It may be easiest to add this sort of checking to the recursive solution proposed above -- to wit:
use strict;
use warnings;
use Data::Dumper;
my $tree = {};
while (<DATA>) {
chomp;
my ( $key, $val ) = split /=/, $_, 2;
unless ( $key and $val ) {
warn "Skipped bad input at line $. -- $_\n";
next;
}
my $result = insert( $tree, $val, split( /:/, $key ));
warn "$result -- skipped line $. -- $_\n" if ( $result ne "ok" );
}
print Dumper( $tree );
sub insert {
my ( $tree, $val, @keys ) = @_;
my $key = shift @keys;
my $result;
if ( @keys and exists( $tree->{$key} )) {
if ( ref( $tree->{$key} ) eq 'HASH' ) {
$result = insert( $tree->{$key}, $val, @keys );
} else {
$result = "Tried to overwrite string value as hash ref";
}
}
elsif ( @keys ) {
$tree->{$key} = {};
$result = insert( $tree->{$key}, $val, @keys );
}
elsif ( exists( $tree->{$key} ) and ref( $tree->{$key} ) eq 'HASH' ) {
$result = "Tried to overwrite hash ref with string value";
}
else { # Note: a scalar can still overwrite a prev. scalar
$tree->{$key} = $val;
$result = "ok";
}
return $result;
}
__DATA__
a1:b1:c1=first data record
a1:b2=second data record
a1:b2:c2=third data record
a1:b3:c2=fourth data record
a1:b3:c2=fifth data record
a1:b3=sixth record
a2:b1:c1:d1:seventh data record
a2:b1:c1:d1=eigth data record
__OUTPUT__
Tried to overwrite string value as hash ref -- skipped line 3 -- a1:b2:c2=third data record
Tried to overwrite hash ref with string value -- skipped line 6 -- a1:b3=sixth record
Skipped bad input at line 7 -- a2:b1:c1:d1:seventh data record
$VAR1 = {
'a1' => {
'b3' => {
'c2' => 'fifth data record'
},
'b2' => 'second data record',
'b1' => {
'c1' => 'first data record'
}
},
'a2' => {
'b1' => {
'c1' => {
'd1' => 'eigth data record'
}
}
}
};
Re: List Values As Multidimensional Hash Keys
by matija (Priest) on Mar 14, 2004 at 21:18 UTC
#!/usr/bin/perl -w
my $tree={};
# warning: recursive subroutine
sub insert {
my ($tree,$val,@keys)=@_;
my $key=shift @keys;
unless (defined($tree->{$key})) {
$tree->{$key}={};
}
if (scalar @keys) {
insert($tree->{$key},$val,@keys);
} else {
$tree->{$key}=$val;
}
}
insert($tree,'1',qw(a b c d)); # test 1
insert($tree,'2',qw(a b d e)); # test 2
insert($tree,$val,split(':',$key)); # the call you were looking for.
|
I've got a canned implementation like this on CPAN, Data::DRef, which will also do the key splitting for you:
use Data::DRef qw( set_value_for_key );
$Data::DRef::Separator = ':';
my ($key, $val) = split /=/;
set_value_for_key( $hash, $key, $val );
Re: List Values As Multidimensional Hash Keys
by kappa (Chaplain) on Mar 14, 2004 at 21:47 UTC
This seemed like a rather interesting task to me :)
@a = qw/a b c d e f/;
$data = 'K';
$data = { pop @a => $data } while @a;
%hash = %$data;
|
Or, building top-down instead of bottom-up:
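use Data::Dumper;
$_ = 'a1:a2:a3=foo';
my ($key, $val) = split(/=/);
my @keys = split /:/, $key;
my %hash;
my $href = \%hash;
# the right-hand side is evaluated before the shift, so the last key gets $val
$href = $href->{shift @keys} = (@keys>1) ? {} : $val while @keys;
# pass a reference so Dumper shows one nested structure, not a flattened list
print Dumper(\%hash);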
The PerlMonk tr/// Advocate
Please don't use eval for this! (was Re: List Values As Multidimensional Hash Keys)
by merlyn (Sage) on Mar 14, 2004 at 23:43 UTC
As usual, this topic comes up every three to six months, and the same "eval" solutions get posted. As usual, I've downvoted any solution I've seen (or will see) in this thread that uses "eval". It's both unnecessarily inefficient, and a big security hole as well. Please use any other solution as a starter.
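For comparison, here is a minimal eval-free sketch of the same assignment, walking the hash with a reference. The helper name set_path is hypothetical (it is not from any post in this thread), and the sketch assumes well-formed input: under strict it dies if an intermediate key already holds a plain string, which is the collision case graff describes.

```perl
use strict;
use warnings;

# Hypothetical helper: store $val under the nested path given by @keys.
sub set_path {
    my ($root, $val, @keys) = @_;
    my $last = pop @keys;
    my $node = $root;
    # Create (or reuse) one hash level per leading key.
    $node = $node->{$_} ||= {} for @keys;
    $node->{$last} = $val;
    return $root;
}

my %hash;
my ($key, $val) = split /=/, 'a1:a2:a3=foo', 2;
set_path(\%hash, $val, split /:/, $key);
print $hash{a1}{a2}{a3}, "\n";    # prints "foo"
```

Nothing from the input is ever compiled as code here, so a hostile key can at worst create an odd-looking hash key.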
|
Being of the ornery sort, this (to me) begs the following question:
Efficiency aside, is there a *safe* way to utilise eval as a solution to this problem?
Not a "good" way, or even a "mediocre" way, just safe?
The intrinsic problem with eval is the possibility of hostile data being introduced into the evaluated string. So, is there a way of rendering the data safe?
The obvious way is via taint checking, and string sanitising with tr or s, but is there a better way?
Not that this should be construed as approval of the idea - the process startup overheads alone should be reason enough to do it any other way!
-R
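The sanitising idea could be sketched as a whitelist check on each key part. This is only an illustration: the helper name checked_parts is made up, and the rule that a key part consists solely of word characters is an assumption about the input, not something the original poster stated.

```perl
use strict;
use warnings;

# Hypothetical helper: return the key parts only if every part consists of
# word characters (assumed format); otherwise return an empty list.
sub checked_parts {
    my ($key) = @_;
    return unless defined $key and length $key;
    my @parts = split /:/, $key, -1;    # -1 keeps empty trailing fields
    for my $part (@parts) {
        return unless $part =~ /\A\w+\z/;
    }
    return @parts;
}

for my $key ('a1:a2:a3', '1};print "cracked";#a1:a2') {
    my @parts = checked_parts($key);
    print @parts ? "ok: @parts\n" : "rejected: $key\n";
}
```

A whitelist (accept only what is known to be harmless) tends to be easier to reason about than trying to strip out every dangerous character. Once the parts are validated they can be fed to any of the eval-free solutions in this thread.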
|
use strict;
use warnings;
my %hash;
my $a = '1};print "You have just been cracked!\n";#a1:a2:a3=foo';
my ($key, $val) = split /=/, $a, 2;
$key =~ s/:/}{/g;
eval "\$hash{$key}=\"$val\"";
__END__
You have just been cracked!
You would replace the $key =~ s/:/... line with
use Data::Dumper;
$Data::Dumper::Terse = 1;
$Data::Dumper::Useqq = 1;
$key = join '}{', Dumper split /:/, $key, -1;
Re: List Values As Multidimensional Hash Keys
by kvale (Monsignor) on Mar 14, 2004 at 21:16 UTC
QM's solution is clever. Here is a prosaic method that works for a known maximum number of keys:
my ($key, $val) = split /=/;
my @keys = split /:/, $key;
if (@keys == 1) {
$hash{ $keys[0] } = $val;
}
elsif (@keys == 2) {
$hash{ $keys[0] }{ $keys[1] } = $val;
}
# etc.
Re: List Values As Multidimensional Hash Keys
by BrowserUk (Patriarch) on Mar 15, 2004 at 02:00 UTC
use List::Util qw[ reduce ];
use Data::Dumper;
my $line = 'a1:a2:a3=key';
my $href = reduce{ my $r={}; $r->{$b}=$a; $r } reverse split /:|=/, $line;
print Dumper $href;
$VAR1 = {
'a1' => {
'a2' => {
'a3' => 'key'
}
}
};
Update: It struck me later that hash refs are a distinct advantage, as they avoid the collision problem graff brought up.
#! perl -slw
use strict;
use List::Util qw[ reduce ];
use Data::Dumper;
my @AoH;
while( <DATA> ) {
chomp;
push @AoH, reduce{
my $r={}; $r->{$b}=$a; $r;
} reverse split /:|=/, $_;
}
print Dumper \@AoH;
__DATA__
a1:a2:a3=key1
a1:a2=key2
b1:b2:b3:b4:b5:b6:b7:b8:b9=key3
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
Re: List Values As Multidimensional Hash Keys
by QM (Parson) on Mar 14, 2004 at 21:02 UTC
use strict;
use warnings;
my $a = 'a1:a2:a3=foo';
my ($key, $val) = split /=/, $a, 2;
$key =~ s/:/}{/g;
my %hash;
eval "\$hash{$key}=\"$val\"";
-QM
--
Quantum Mechanics: The dreams stuff is made of
|
Leaving aside the fact that this does not compile under strict as you don't declare %hash, this is a security hole just waiting for a cracker. The string form of eval is *dangerous*; don't use it until after you understand why. Here is a hint....
use strict;
use warnings;
my %hash;
my $a = '1};print "You have just been cracked!\n";#a1:a2:a3=foo';
my ($key, $val) = split /=/, $a, 2;
$key =~ s/:/}{/g;
eval "\$hash{$key}=\"$val\"";
__END__
You have just been cracked!
The print could be any arbitrary code. unlink, rm, shutdown....*any* code, running with the perms of whoever started the script.
|
Your post has been downvoted. Please don't malform internal links again (you see, manually editing the URI is just sooooo incredibly difficult). Of course I am entirely kidding.
Re: List Values As Multidimensional Hash Keys
by joule (Acolyte) on Mar 15, 2004 at 14:03 UTC
Wow! I never imagined I would receive so many well informed replies. You guys rule!
After working (unsuccessfully) with map(), I was attempting to code a recursive function, similar to what matija posted. The eval method is a hack, and a commonly disliked one, as some of the replies have shown.
My gratitude and appreciation go out to you all. Thanks again.
Re: List Values As Multidimensional Hash Keys
by dragonchild (Archbishop) on Mar 15, 2004 at 14:13 UTC
I'm curious - what's the need to have this as a tree? Are you planning on working with all the children at a specific level?
The reason I ask is that if all you want are straight lookups from the root node, it would be easier to have the path to the leaf as the entry itself. So, you can cut out the split on ':' and just have the hash key be "a1:a2:a3". (In other words, flatten the tree.)
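A rough sketch of the flattened approach, assuming the same 'path=value' line format as the original question (the sample lines are made up for illustration):

```perl
use strict;
use warnings;

my %flat;
for my $line ('a1:a2:a3=foo', 'a1:a2=bar', 'a1:b1=baz') {
    my ($path, $val) = split /=/, $line, 2;
    $flat{$path} = $val;              # the whole path is the key
}

print $flat{'a1:a2:a3'}, "\n";        # prints "foo"

# Everything below the 'a1:a2' node: keys that start with 'a1:a2:'.
my @below = grep { index($_, 'a1:a2:') == 0 } sort keys %flat;
print "@below\n";                     # prints "a1:a2:a3"
```

With flat keys, 'a1:a2' and 'a1:a2:a3' can hold values side by side, so the leaf-versus-parent collision does not arise. The cost is that visiting all children of a node means scanning the keys for a prefix.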
------
We are the carpenters and bricklayers of the Information Age.
Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.
|
The file I'm parsing contains strings which are used for a program's configuration files. The configuration files themselves are stored in various directories/sub-directories. I want to provide a logical grouping of the configuration files while also providing flexibility and extensibility with the program's configuration. Hope that makes sense, it's hard to describe with words... :)
Re: List Values As Multidimensional Hash Keys
by meredith (Friar) on Mar 15, 2004 at 18:10 UTC
Another solution is using (abusing?) list auto-stringification, or multidimensional emulation. (How long should the term be for something so simple?) Instead of all those cool hash-reference trees, you could use a flat hash with namespaced keys, just like your file. If you feed perl a list for a hash key, it will join the elements with a separator character (the default is "\034") to make a single string key. (There is a perlvar to change the separator. <looks> Oh, it's $;) After you build the hash this way, walking your tree is as simple as sort keys.
$; = ':';
my %Hash;
while (<>) {
chomp $_;
my ($key, $val) = split(/=/);
if ($key =~ /:/) {
my @keyparts = split(/$;/, $key);
$Hash{ join("$;", @keyparts) } = $val;
# I'm not sure how to get perl to not scalar-ize it in this case
# But it's probably better to reduce cargo-cultism ;)
} else {
$Hash{$key} = $val;
}
}
foreach (sort keys %Hash) {
print "$_ \t=> " . $Hash{$_} . "\n";
}
print $Hash{'a1','b1','c1'} . "\n";
mhoward - at - hattmoward.org