in reply to Trying to understand hashes (in general)

Arrays are good for doing array stuff and hashes are good for doing hash stuff. If you set aside how they work under the hood, arrays and hashes are nearly identical (in PHP they essentially are identical). The visible "difference" between arrays and hashes is that arrays are indexed by numbers and hashes are indexed by strings.

Arrays are really good when you have a list of things you want to store and either they are naturally keyed by a number, or they have no key but may be ordered. It's really fast to access an element in an array by its index (position in the array). Perl arrays are also very efficient at adding and removing elements at the start and end of the array. Arrays tend to be a poor choice if there are large gaps where index values have been skipped.
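A minimal sketch of those array operations, using made-up data (the variable names `@queue` and `@sparse` are only for illustration):

```perl
use strict;
use warnings;

my @queue = ('b', 'c');

push @queue, 'd';          # add to the end:        b c d
unshift @queue, 'a';       # add to the start:      a b c d
my $first = shift @queue;  # remove from the start: returns 'a'
my $last  = pop @queue;    # remove from the end:   returns 'd'

print "first=$first last=$last rest=@queue\n";
print "element 1 is $queue[1]\n";    # direct access by index

# Assigning far past the end leaves a gap of unused slots,
# which is why sparse indexes are a poor fit for an array.
my @sparse;
$sparse[1000] = 'x';
print scalar(@sparse), " slots allocated for one value\n";
```

If the indexes you care about are widely scattered like that, a hash keyed by the number is usually the less wasteful choice.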

Hashes are really good where you want to access values by name. Note that there is no reason the name can't be a number, so in that sense hashes and arrays can do the same job. Although hashes are pretty quick at looking up values by name, they aren't as fast as arrays. The other major difference is that hashes don't remember the order in which elements were inserted, so they generally can't be used in a trivial fashion to store ordered data.
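As a rough illustration of lookup by name and of the ordering caveat, here is a small sketch (the file names and sizes are invented for the example):

```perl
use strict;
use warnings;

# Hypothetical data: file name => size in bytes.
my %size_of = (
    'notes.txt' => 120,
    'photo.jpg' => 20480,
    'script.pl' => 845,
);

# Look a value up by name.
print "script.pl is $size_of{'script.pl'} bytes\n";

# exists tests for a key without creating it.
print "nope.txt is not in the hash\n" unless exists $size_of{'nope.txt'};

# A number works as a key too; it is simply treated as a string.
my %name_of = (42 => 'answer');
print "42 => $name_of{42}\n";

# keys returns the names in no particular order, so sort them
# whenever the output order matters.
my @sorted = sort keys %size_of;
print "$_ => $size_of{$_}\n" for @sorted;
```

If insertion order matters, one common workaround is to keep a separate array of the keys alongside the hash.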

Unless you are generating a set of unique values by taking advantage of the fact that hash keys are unique, it seldom makes sense to use a hash just to store keys. So, on the face of it, a hash doesn't make sense for storing a file system's structure, because the file system doesn't allow duplicated names in any case. Nested arrays are a much better fit for a file system's structure.
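For the unique-values case, the usual idiom looks something like this (the `%seen` and `@names` names are just for the sketch):

```perl
use strict;
use warnings;

my @names = qw(a.txt b.txt a.txt c.txt b.txt);

# $seen{$_}++ is false (0) the first time a name turns up and true
# after that, so grep keeps only the first occurrence of each name.
my %seen;
my @unique = grep { !$seen{$_}++ } @names;

print "@unique\n";    # prints: a.txt b.txt c.txt
```

Here the hash values are only counters; it is the uniqueness of the keys that does the work.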

Perl is the programming world's equivalent of English

Re^2: Trying to understand hashes (in general)
by james28909 (Deacon) on Dec 23, 2014 at 06:03 UTC
    It just seems that if I have a huge array of filenames and directories, and I want to compare it against another huge list of filenames and directories, it takes a long time with an array. Say for instance I have a list of 2000 elements, and I want to compare that to another array of 2000 elements: that's 4,000,000 iterations through the loops. Would hashes not be better for such a lookup method? Maybe I am just putting too much emphasis on it, and the way I did it before is fine.

    There is still a good bit that I do not understand about hashes, and seeing 10 different ways to build them kind of confuses me. ;) But the code I posted above, why is it adding $file to both keys and values of the hash?

    Thanks for commenting btw :)

      What are you trying to achieve with the compare? Depending on the answer, an array, a hash or a database may be a good fit, or maybe you don't need to store anything at all. In no case, however, should you need nested loops that run across all combinations of element pairs.
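      As a sketch only, assuming the goal is simply to find which names appear in both lists, a hash used as a lookup table replaces the nested loops with one pass over each list. The list contents and the names `@list_a`, `@list_b` and `%in_b` are invented for the example:

```perl
use strict;
use warnings;

# Hypothetical path lists standing in for the two real ones.
my @list_a = qw(docs/a.txt docs/b.txt src/main.pl);
my @list_b = qw(src/main.pl docs/b.txt img/logo.png);

# One pass over @list_b builds the lookup table.
my %in_b = map { $_ => 1 } @list_b;

# One pass over @list_a checks each name against the table.
my @common = grep {  $in_b{$_} } @list_a;
my @only_a = grep { !$in_b{$_} } @list_a;

print "in both:   @common\n";
print "only in a: @only_a\n";
```

      With 2000 names on each side that is on the order of 4000 hash operations rather than 4,000,000 comparisons.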

      There is no "one best solution" for all problems. Having a good understanding of what you are trying to achieve very often will point you toward the correct data structure and once you have the data structure right very often everything else just slots into place around it.

      Perl is the programming world's equivalent of English
        I have 20 folders in one directory (we will call it dir "a") that contain sub dirs with about 1880 files in them, and 20 folders in another directory (call this one "b") with close to the same number of sub dirs and files. I need to loop through both sets of directories (a and b) and see if any file/path name in (a) is in (b). That is where I am at. I am able to do it pretty easily with arrays, I just thought it would be faster to use hashes, I reckon.