Data structure question: Directory-in-memory ?

mr.nick has asked for the wisdom of the Perl Monks concerning the following question:

Hi folks!

I would like to ask my fellow brothers and sisters how best to handle this situation. I want to scan a directory of files and store them in a hash in such a way as to facilitate printing them out as nested HTML <ol> lists. I have since figured out a way to get the results I want during printing rather than during data collection, but I would still like to know how to do it this way.

The structure of the directories looks something like this:

  . 
  |
  |-- subdir 1
  |-+ subdir 2
  | |-- subsubdir 2.1
  | |-- subsubdir 2.2
  |-+ subdir 3
    |-+ subsubdir 3.1
      |-- subsubsubdir 3.1.1
      |-- subsubsubdir 3.1.2
      |-- subsubsubdir 3.1.3

(your standard directory tree). The problems I'm facing are twofold.

First of all, building the hash itself. I wanted the end result to be like this:

  $hash{'subdir 1'}
  $hash{'subdir 2'}{'subsubdir 2.1'}
  $hash{'subdir 2'}{'subsubdir 2.2'}
  $hash{'subdir 3'}{'subsubdir 3.1'}
  $hash{'subdir 3'}{'subsubdir 3.1'}{'subsubsubdir 3.1.1'}
  $hash{'subdir 3'}{'subsubdir 3.1'}{'subsubsubdir 3.1.2'}
  $hash{'subdir 3'}{'subsubdir 3.1'}{'subsubsubdir 3.1.3'}

So I thought I could code something along the lines of

  @parts=split /\//,$dirname;
  %hash{ @parts } = (); ## this doesn't work, of course, 
                        ## just an example of the idea

where it results in something like $hash{ $parts[0] }{ $parts[1] }{ $parts[n] }=(), where { $parts[n] } would be repeated for each element of @parts. I couldn't think of a way to do this without knowing n, the index of the last element. I suppose I could do it easily enough with eval, but I try to avoid using eval as much as possible.
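
One eval-free way to get this effect, as a rough sketch: keep a reference to the current level and step it down one path component at a time, so n never has to be known in advance. The helper name add_path and the sample paths are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: make sure a nested hash level exists for every
# component of $dirname, and return the innermost level.
sub add_path {
    my ($tree, $dirname) = @_;
    my $node = $tree;
    for my $part (grep { length } split m{/}, $dirname) {
        # Create the next level if it is missing, then descend into it.
        $node = $node->{$part} ||= {};
    }
    return $node;
}

my %hash;
add_path(\%hash, $_) for (
    'subdir 1',
    'subdir 2/subsubdir 2.1',
    'subdir 3/subsubdir 3.1/subsubsubdir 3.1.1',
);
```

Every directory ends up as a hash reference (empty for leaves), so a later call with a longer path can extend an existing branch without clobbering it.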

The second problem I have is having a hash value that can both be another hash AND contain values. E.g., "subdir 1" can contain both files and subdirectories. I know the following doesn't work:

  $hash{foo}=1;
  $hash{foo}{bar}=2;

You get a `Can't use string ("1") as a HASH ref while "strict refs" in use` barf from strict. But that's the effect I want.

Hm. On second thought, an anonymous array might work:

#!/usr/bin/perl

use strict;
use Data::Dumper;

my %hash;

$hash{foo}=[undef,1];
$hash{foo}[0]{bar}=[undef,2];
$hash{foo}[0]{bar}[0]{bink}=[undef,4];

print Dumper \%hash;

In fact, that does appear to be the effect that I want... (sorry, it came to me while I was writing this). Hm. Iterating through it won't be fun, though.
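
For what it's worth, iterating this layout is mostly a matter of recursion. A minimal sketch, assuming every entry is a two-element array of [children-or-undef, value] exactly as above; the helper name walk and the "path = value" output format are only illustrative.

```perl
use strict;
use warnings;

# Same layout as the example above: each entry is [ \%children or undef, $value ].
my %hash;
$hash{foo} = [undef, 1];
$hash{foo}[0]{bar} = [undef, 2];
$hash{foo}[0]{bar}[0]{bink} = [undef, 4];

# Hypothetical helper: return one "path = value" line per entry, depth first.
sub walk {
    my ($level, $prefix) = @_;
    my @lines;
    for my $name (sort keys %$level) {
        my ($children, $value) = @{ $level->{$name} };
        my $path = length($prefix) ? "$prefix/$name" : $name;
        push @lines, "$path = $value";
        # Slot 0 stays undef until a child is added, so only recurse when set.
        push @lines, walk($children, $path) if $children;
    }
    return @lines;
}

print "$_\n" for walk(\%hash, '');
```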

So, any thoughts on any of this? Comments about the klunky anonymous array solution?

mr.nick ...


Replies are listed 'Best First'.
Re: Data structure question: Directory-in-memory ? by stephen (Priest) on Jun 20, 2001 at 04:36 UTC
You could do the standard tree-thing and use a more standard linked_list solution. Something like:

  my $dir = {
      name     => '.',
      contents => [
          { name => 'subdir 1', contents => [...] },
          { name => 'subdir 2', contents => [...] },
      ],
  };

Note: untested

Then you'd have a real tree-structure, which you could iterate recursively. You don't get the ease of the quick solution, but you get some robustness and flexibility. Plus, you can wrap this in a class structure and write accessor methods.

Best of all, you could use Class::Tree, which provides a premade OO interface to treelike structures of this kind... Haven't used it myself, but the docs list methods for reading directory trees directly from the filesystem.

stephen
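
To illustrate the "iterate recursively" part, here is a minimal sketch that turns a name/contents tree into nested <ol> markup. The sample data and the helper name render are assumptions for illustration, and names are not HTML-escaped here.

```perl
use strict;
use warnings;

# Sample tree in the name/contents layout; leaves carry an empty contents list.
my $dir = {
    name     => '.',
    contents => [
        { name => 'subdir 1', contents => [] },
        { name => 'subdir 2', contents => [
            { name => 'subsubdir 2.1', contents => [] },
        ] },
    ],
};

# Hypothetical helper: return the <li> markup for one node and its descendants.
sub render {
    my ($node, $depth) = @_;
    my $pad  = '  ' x $depth;
    my $html = "$pad<li>$node->{name}";
    if (@{ $node->{contents} }) {
        # Children go into their own <ol> inside this node's <li>.
        $html .= "\n$pad  <ol>\n";
        $html .= render($_, $depth + 2) for @{ $node->{contents} };
        $html .= "$pad  </ol>\n$pad";
    }
    return "$html</li>\n";
}

print "<ol>\n", render($dir, 1), "</ol>\n";
```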
Re: Re: Data structure question: Directory-in-memory ? by mr.nick (Chaplain) on Jun 20, 2001 at 04:47 UTC
That's sort of the direction I was going with the anonymous array. I was trying to accomplish this using Perl internals; only relying on its own hashing and array functions. Using a btree would be too easy :)

mr.nick ...
Re: Data structure question: Directory-in-memory ? by Anonymous Monk on Jun 20, 2001 at 13:54 UTC
How about storing your value in the hash under the key '/'? '/' can never be a path component, because you split on it.

  $hash{foo}{'/'}=1;
  $hash{foo}{bar}{'/'}=1;

and so on...
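
A small sketch of how the '/' key might sit alongside the nested levels, assuming the path is split on '/' as in the original question. The helper name set_dir_data and the sample values are hypothetical.

```perl
use strict;
use warnings;

my %hash;

# Hypothetical helper: store $value for the directory at $path under the
# reserved '/' key, creating intermediate levels as needed.
sub set_dir_data {
    my ($tree, $path, $value) = @_;
    my $node = $tree;
    $node = $node->{$_} ||= {} for grep { length } split m{/}, $path;
    $node->{'/'} = $value;
    return;
}

set_dir_data(\%hash, 'foo', 1);
set_dir_data(\%hash, 'foo/bar', 2);

# Subdirectories are every key except the reserved '/'.
my @children = grep { $_ ne '/' } sort keys %{ $hash{foo} };
print "foo holds $hash{foo}{'/'} and has children: @children\n";
```

The cost of this layout is that every loop over a level has to skip the '/' key, as the grep above does.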
(tye)Re: Data structure question: Directory-in-memory ? by tye (Sage) on Jun 20, 2001 at 17:59 UTC
I like the idea of using '/' as the key for the data associated with the directory. You could also use the empty string as that key.

Are you using File::Find to recurse the directory tree? If so, my first instinct is "stop". Getting File::Find to play nice with this kind of thing is a pain but rewriting File::Find's functionality such that you can pass along the proper hash-to-be-filled in the recursive step is pretty trivial.

Or, if you want to impress your friends, read Re: How to map a directory tree to a perl hash tree and use its technique. (:

- tye (but my friends call me "Tye")
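
A rough sketch of the hand-rolled recursion described above, using only opendir/readdir and passing the hash-to-be-filled into each recursive call. The helper name scan_dir, the choice to collect plain files under the '/' key, and the decision to skip symlinks are all assumptions for illustration, not something taken from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: fill %$tree with one nested hash per subdirectory of
# $dir. Plain files are listed under the '/' key; symlinks are skipped so a
# link back up the tree cannot cause endless recursion.
sub scan_dir {
    my ($dir, $tree) = @_;
    opendir(my $dh, $dir) or do {
        warn "Cannot open $dir: $!\n";
        return $tree;
    };
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    for my $entry (sort @entries) {
        my $path = "$dir/$entry";
        next if -l $path;
        if (-d $path) {
            # Hand the child's (new, empty) hash to the recursive step.
            scan_dir($path, $tree->{$entry} = {});
        }
        elsif (-f $path) {
            push @{ $tree->{'/'} }, $entry;
        }
    }
    return $tree;
}

# Usage: perl scan.pl /some/dir
if (@ARGV) {
    my $tree    = scan_dir($ARGV[0], {});
    my @subdirs = grep { $_ ne '/' } sort keys %$tree;
    print "Top-level subdirectories: @subdirs\n";
}
```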