in reply to Re: Re: Re: Re: Reading file into a numbered hash
in thread Reading file into a numbered hash

Now I understand KM's point in the more general example, but in the specific case that started this thread, I'd like to suggest that Fastolfe's use of a simple array could get you the same readability as a hash, as the keys you need are always natural numbers. The matter of the zero-based array is handled by tossing in an unshift(@array,'');, following the read, after which $array[5] really refers to line 5 of the original file (and I also note that despite its good looks, using '05' as a hash key isn't going to help future maintainability).
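
For what it's worth, here is a minimal sketch of that idea. The three-line string and the in-memory filehandle are only stand-ins for the real file, which I'm assuming is read line by line:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real file: an in-memory filehandle.
my $data = "alpha\nbeta\ngamma\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @array = <$fh>;     # one element per line, indexes start at 0
close $fh;

unshift @array, '';    # pad slot 0 so that index N is line N

my $last = $#array;    # highest index now equals the number of lines
print "line 2 is: $array[2]";
print "lines read: $last\n";
```

The empty string in slot 0 prints as nothing, so a later print @array still shows just the file's contents.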

Another benefit is simpler printing:

print @array;                             # gives the same result as:
print qq{$hash{$_}} for sort keys %hash;  # this
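
To make the comparison concrete, here is a rough sketch. The twelve generated lines are made up, and I'm assuming the hash keys are either zero-padded line numbers (the '05' style) or plain integers. The default sort compares strings, so with plain keys '10' lands before '2' and a numeric sort is needed to get file order back:

```perl
use strict;
use warnings;

my @lines = map { "line $_\n" } 1 .. 12;   # hypothetical file contents

my @array = @lines;
my (%padded, %plain);
for my $i (0 .. $#lines) {
    $padded{ sprintf '%02d', $i + 1 } = $lines[$i];   # keys '01' .. '12'
    $plain{ $i + 1 }                  = $lines[$i];   # keys 1 .. 12
}

my $from_array  = join '', @array;
my $from_padded = join '', map { $padded{$_} } sort keys %padded;
my $from_plain  = join '', map { $plain{$_} } sort keys %plain;   # string sort
my $from_number = join '', map { $plain{$_} } sort { $a <=> $b } keys %plain;

print "padded keys match array:  ", ($from_array eq $from_padded ? "yes" : "no"), "\n";
print "plain keys, default sort: ", ($from_array eq $from_plain  ? "yes" : "no"), "\n";
print "plain keys, numeric sort: ", ($from_array eq $from_number ? "yes" : "no"), "\n";
```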

My conclusion (for now ;-): arrays are easier when your keys are always going to be sequential integers.

Re: Re: Reading file into a numbered hash
by KM (Priest) on Nov 23, 2000 at 19:50 UTC
    Another benefit is simpler printing:

    print @array;                             # gives the same result as:
    print qq{$hash{$_}} for sort keys %hash;  # this

    No, it doesn't. You are printing an array, and printing a hash by first sorting its keys. They would not give the same result *except* in the case where the array is already in the order you want (I hope that if you read in a file, no one moved the elements around!). So they are the same only depending on how each was initialized... but I may be nitpicking :)

    The matter of the zero-based array is handled by tossing in an unshift(@array,'');, following the read

    Why would I want to do that? Now if I somewhere shift, pop, or otherwise alter the array (not the data set itself), things can still get out of whack.

    using '05' as a hash key isn't going to help future maintainability).

    No? If my file is a large data set (a flat-file db, maybe), then I think using '5' (remember, if you read my first post, the 0 padding was for nicer printing only) is very maintainable. Well, I don't see how it wouldn't be... or how using an array would be any more maintainable.

    I am not disputing that an array is easy to work with (I like hashes better), quick (hash lookups aren't too slow), or easy to read (well, sometimes they aren't intuitive)... but the original snippet was, again, AWTDI. Maybe at some point we can actually benchmark it with a large file. But, personally, I still like hashes in general for readability, maintainability, form, and function.

    Cheers,
    K "gobble gobble" M

      I often find a different tradeoff from you.

      Like you, I agree that if you want to start accessing things by index, a hash is better than an array. Named keys are easier to maintain and less susceptible to breakage than indexes. Indeed, just being able to give things meaningful names is a big enough win that I often dump @_ into a hash and process that instead of using arrays.
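
      As a small illustration of dumping @_ into a hash, here is a sketch. The sub name describe_line and its parameter names are hypothetical, chosen only for this example:

      ```perl
      use strict;
      use warnings;

      # Hypothetical sub: callers pass name => value pairs in any order.
      sub describe_line {
          my %args = (number => 0, text => '', @_);   # defaults first, then the caller's values
          return sprintf '%02d: %s', $args{number}, $args{text};
      }

      my $out = describe_line(text => 'hello', number => 5);
      print "$out\n";
      ```

      The caller never has to remember which position holds which value, which is the maintainability win I mean.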

      That said, I usually prefer to arrange things, through basic array operations (shift, unshift, pop and push), Perl's native looping constructs, and general list-oriented goodness, so that I never care what the index is. If that is plausible, then arrays win hands down in my book. They are faster. Easier to manipulate. Less room for error. Less typing.

      And for large files I am likely to see if there is a stream-oriented approach. Why have to read the whole thing in to start processing? (While I don't generally deal with terabytes of data, I have friends who do and they really care about this.)
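
      A stream-oriented version might look roughly like this. The in-memory string stands in for a large file, and the task (finding the longest line) is just an assumed example of processing that never needs the whole file in memory:

      ```perl
      use strict;
      use warnings;

      my $data = "one\nthree\ntwo\n";   # hypothetical stand-in for a large file
      open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

      my ($longest_no, $longest_len) = (0, -1);
      while (my $line = <$fh>) {
          chomp $line;
          # $. is the line number of the most recent read, so no index is kept by hand
          if (length($line) > $longest_len) {
              ($longest_no, $longest_len) = ($., length($line));
          }
      }
      close $fh;

      print "longest line is number $longest_no ($longest_len characters)\n";
      ```

      Only one line is held at a time, and $. supplies the line number that the array index or hash key was providing.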

      So I am a bit of a chameleon. There are problems which each is right for and I switch between them often and easily.