Trying to understand hashes (in general)

james28909 has asked for the wisdom of the Perl Monks concerning the following question:


Re: Trying to understand hashes (in general)
by GrandFather (Saint) on Dec 23, 2014 at 05:45 UTC

Arrays are good for doing array stuff and hashes are good for doing hash stuff. If you set aside how they work under the hood, arrays and hashes are nearly identical (in PHP they essentially are identical). The main difference between them is that arrays are indexed by numbers and hashes are indexed by strings.

Arrays are really good when you have a list of things you want to store and either they are naturally keyed by a number, or they have no key but may be ordered. It's really fast to access an element in an array by its index (position in the array). Perl arrays are also very efficient at adding and removing elements at the start and end of the array. Arrays tend to be a poor choice if there are large gaps where index values have been skipped.

Hashes are really good where you want to access values by name. Note that there is no reason the name can't be a number so in that sense hashes and arrays can do the same job. Although hashes are pretty quick at looking up values by name, they aren't as fast as arrays. The other major difference is that hashes don't remember the order that elements were inserted so they generally can't be used in a trivial fashion to store ordered data.
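
As a rough illustration of the points above, here is a minimal sketch contrasting the two containers. The variable names and data are made up for the example.

```perl
use strict;
use warnings;

# An array: values indexed by position (numbers starting at 0).
my @queue = ('alpha', 'beta', 'gamma');
push    @queue, 'delta';     # add at the end
unshift @queue, 'first';     # add at the start
my $second = $queue[1];      # 'alpha'

# A hash: values indexed by name (strings).
my %age = (alice => 31, bob => 27);
$age{carol} = 45;
my $bob_age = $age{bob};     # 27

# Hash keys come back in no particular order, so sort when order matters.
my @names = sort keys %age;  # alice, bob, carol

print "$second $bob_age @names\n";
```

If insertion order matters, one common approach is to keep an array of keys alongside the hash.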

Unless you are generating a set of unique values by taking advantage of the fact that hash keys are unique, it seldom makes sense to use a hash to store only keys. So, on the face of it, a hash doesn't make sense for storing a file system's structure, because the file system doesn't allow duplicated names in any case. Nested arrays are a much better fit for a file system's structure.
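
The "set of unique values" case mentioned above might look something like this. It is only a sketch with made-up data; the %seen idiom is a common Perl pattern rather than anything specific to the original script.

```perl
use strict;
use warnings;

my @words = qw(apple pear apple fig pear apple);

# Each word becomes a hash key; the value counts how often it was seen.
# grep keeps a word only the first time it appears, so order is preserved.
my %seen;
my @unique = grep { !$seen{$_}++ } @words;

print "@unique\n";                          # apple pear fig
print "apple appeared $seen{apple} times\n"; # 3
```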

Perl is the programming world's equivalent of English


Re^2: Trying to understand hashes (in general)

by james28909 (Deacon) on Dec 23, 2014 at 06:03 UTC


Re^3: Trying to understand hashes (in general)

by GrandFather (Saint) on Dec 23, 2014 at 06:42 UTC

What are you trying to achieve with the compare? Depending on the answer an array, a hash or a database may be a good answer, or maybe you don't need to store anything at all. In no case should you need nested loops that run across all combinations of element pairs however.
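
To show what avoiding the nested loops can look like, here is a small sketch. It assumes the compare is between two lists of names, which is a guess; the file names are invented.

```perl
use strict;
use warnings;

my @left  = qw(a.txt b.txt c.txt);
my @right = qw(b.txt c.txt d.txt);

# Build a lookup table from one list (one pass) ...
my %in_right = map { $_ => 1 } @right;

# ... then make a single pass over the other list, one hash lookup per item.
my @common    = grep {  $in_right{$_} } @left;
my @only_left = grep { !$in_right{$_} } @left;

print "common: @common\n";        # b.txt c.txt
print "only left: @only_left\n";  # a.txt
```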

There is no "one best solution" for all problems. Having a good understanding of what you are trying to achieve very often will point you toward the correct data structure and once you have the data structure right very often everything else just slots into place around it.



Re^4: Trying to understand hashes (in general)

by james28909 (Deacon) on Dec 23, 2014 at 07:51 UTC

Re^5: Trying to understand hashes (in general)

by hexcoder (Curate) on Dec 23, 2014 at 16:26 UTC

Re: Trying to understand hashes (in general)
by Athanasius (Cardinal) on Dec 23, 2014 at 06:04 UTC

Hello james28909,

A hash is an associative array, in which data is stored in key/value pairs. For example, in:

my %hash = (Fred => 'Wilma', Barney => 'Betty', Homer => 'Marge');

the keys are Fred, Barney, and Homer, and their corresponding values are Wilma, Betty, and Marge, respectively. Now, in your script, the line:

$dirs{$file} = $file;

adds a new key/value pair to the %dirs hash, and in this pair the key and the value are the same (viz., whatever is stored in $file). This is an unnecessary duplication of the data. It would be more normal in this case to set the value to undef (or possibly 1).
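
A short sketch of the two alternatives, using an invented list of file names in place of whatever the original script reads:

```perl
use strict;
use warnings;

my @files = qw(notes.txt todo.txt notes.txt);   # hypothetical input

# Keys only: the value is undef, so membership must be tested with exists.
my %dirs;
$dirs{$_} = undef for @files;
my $count = keys %dirs;                          # 2, duplicates collapse
print "todo.txt is present\n" if exists $dirs{'todo.txt'};

# Value of 1: a plain truth test is enough.
my %flag;
$flag{$_} = 1 for @files;
print "notes.txt is present\n" if $flag{'notes.txt'};
```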

If you do later convert this into a hash of hashes (but you would be better off following GrandFather’s advice and using an array of arrays), then each value will be a reference to an anonymous hash which you create on the fly:

$dirs{$file} = { ... };
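
For instance, a minimal sketch with made-up fields (size and type are assumptions for illustration, not something taken from the original script):

```perl
use strict;
use warnings;

my %dirs;
my $file = 'report.txt';                 # hypothetical file name

# The value is a reference to an anonymous hash created on the fly.
$dirs{$file} = { size => 1024, type => 'file' };

# The arrow between adjacent subscripts is optional.
print "$file is $dirs{$file}{size} bytes\n";

# Assigning to a nested key creates the inner hash if needed (autovivification).
$dirs{'new.txt'}{type} = 'file';

for my $name (sort keys %dirs) {
    printf "%s: %s\n", $name, $dirs{$name}{type};
}
```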

You should study the tutorial perldsc (“Perl Data Structures Cookbook”).

Hope that helps,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,


Re^2: Trying to understand hashes (in general)

by james28909 (Deacon) on Dec 23, 2014 at 06:36 UTC

Thanks for the info, and the link as well.


Re: Trying to understand hashes (in general)
by davido (Cardinal) on Dec 23, 2014 at 07:20 UTC

Just because a hash stores key/value pairs doesn't mean that you need to put anything useful in the value.

C++ comes with std::unordered_set and std::unordered_map. The first one is designed for just a set of keys. The second is a set of keys that map to values. The second one is more like Perl's hashes. But the fact that the two different containers exist is mostly a matter of memory efficiency and semantic purity.

Now back to Perl: Your hashes won't be growing too big, so memory efficiency probably isn't an issue, and we don't need to be too particular about semantic purity. Hashes often may be used where you might think in terms of sets.

my %heros;
@heros{ qw( thetick wonderwoman batman superman spiderman ) } = ();
print "Yes!\n" if exists $heros{thetick};

In the code above we're creating a hash called %heros that contains elements named for various superheros. But we don't explicitly assign a value to each of the elements. Their value is undefined, and it really doesn't matter because we never use it. Later we test to see if 'thetick' is among our set of superheros.
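
Extending the same idea a little, the usual set operations fall out of ordinary hash operations. This is just a sketch; the @candidates list is invented for the example.

```perl
use strict;
use warnings;

my %heros;
@heros{ qw( thetick batman superman ) } = ();   # start with three members

$heros{wonderwoman} = undef;    # add a member
delete $heros{superman};        # remove a member
my $size = keys %heros;         # number of members: 3

# Which of these candidates are in the set?
my @candidates = qw( batman joker thetick );
my @known = grep { exists $heros{$_} } @candidates;

print "$size members; known: @known\n";   # 3 members; known: batman thetick
```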

Dave


Re: Trying to understand hashes (in general)
by FloydATC (Deacon) on Dec 23, 2014 at 08:30 UTC

"I am trying my best to understand exactly how to add just a key without a value to a hash."

Although not actually very useful in the real world, this can be accomplished by simply assigning undef to a hash key. The key will then exist:

my %hash = ();
$hash{'foo'} = undef;

if (exists $hash{'foo'}) {
  print "The key 'foo' exists.\n";
} else {
  print "The key 'foo' does not exist.\n";
}

if (defined $hash{'foo'}) {
  print "The key 'foo' is defined.\n";
} else {
  print "The key 'foo' is undefined.\n";
}

if ($hash{'foo'}) {
  print "The key 'foo' is true.\n";
} else {
  print "The key 'foo' is false.\n";
}

"What I am going to eventually shoot for, is making a hash of hashes"

To accomplish this, you would make the value of your outer hash (of hashes) a reference to the inner hash, like so:

my %inner_hash = ( dir => '/tmp/foo/', filename => 'bar.txt' );
my %hash_of_hashes = ();

$hash_of_hashes{'baz'} = \%inner_hash; # Backslash = reference to

print "The filename associated with 'baz' is " . $hash_of_hashes{'baz'}->{'filename'} . "\n";

Or you could use references all the way to begin with: (Notice the curly brackets)

my $entry = { firstname => 'Ola', lastname => 'Nordmann' }; 
my $staff = {};

my $id = 123;
$staff->{$id} = $entry;

printf(
  "%d: %s, %s\n", 
  $id, 
  $staff->{$id}->{'lastname'}, 
  $staff->{$id}->{'firstname'}
);
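
Once several entries are stored this way, walking the whole structure is a loop over the outer keys. A small sketch follows; the second staff entry is made up for illustration.

```perl
use strict;
use warnings;

my $staff = {
    123 => { firstname => 'Ola',  lastname => 'Nordmann' },
    124 => { firstname => 'Kari', lastname => 'Nordmann' },   # hypothetical entry
};

# Dereference the outer hash with %$staff and sort the ids numerically.
my @lines;
for my $id (sort { $a <=> $b } keys %$staff) {
    my $person = $staff->{$id};          # reference to the inner hash
    push @lines, sprintf "%d: %s, %s", $id, $person->{lastname}, $person->{firstname};
}

print "$_\n" for @lines;
```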

-- FloydATC

Time flies when you don't know what you're doing


Re^2: Trying to understand hashes (in general)

by james28909 (Deacon) on Dec 23, 2014 at 11:30 UTC

Thanks for the thorough examples.


Re: Trying to understand hashes (in general)
by Rosema1 (Initiate) on Dec 26, 2014 at 01:41 UTC

$dirs{$file}=$file

$dirs{$file}=''
