in reply to RE: Re: Efficiency and Large Arrays
in thread Efficiency and Large Arrays

"Also, what happens when you run the script the second time?"

Yes, that's a problem if the serial numbers need to be maintained. If this is a one-time-per-dataset operation, and the serial numbers exist only while manipulating the data, it doesn't really matter.

Anyway, his problem seems to lie in memory use more than in the numbering scheme.

But the reason he's keeping all the old records around is to make sure he doesn't reuse a number. If he uses a unique identifier (a Perl reference's string value is unique, automatically generated, and readily available), he doesn't have to keep all of the records in memory.
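For instance (a rough sketch; the record structure is invented), every anonymous reference stringifies to a distinct value for as long as it lives:

    my %record_for;
    for my $name (qw(alice bob carol)) {
        my $rec = { name => $name };   # fresh anonymous hash
        my $id  = "$rec";              # stringifies to e.g. "HASH(0x8101cc0)"
        $record_for{$id} = $rec;       # unique while $rec is alive
    }
    # Caveat: an address can be recycled after its referent is freed,
    # so treat the ID as valid only while the record exists.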

The thing that bothered me was using grep to look for already-used phone numbers. What if the phone numbers were the hash keys? Then checking whether one is already used is a single lookup instead of a scan of the whole array.
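Something like this, say (untested, and the record format is made up):

    my %seen;    # phone number => 1
    while (my $line = <>) {
        chomp $line;
        my ($name, $phone) = split /,/, $line;
        next if $seen{$phone};    # constant-time check instead of a grep scan
        $seen{$phone} = 1;
        # ... assign a serial number, process the record ...
    }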

RE: RE: RE: Re: Efficiency and Large Arrays
by fundflow (Chaplain) on Jul 23, 2000 at 05:19 UTC
    Having a hash instead of grepping is of course better (although it takes more memory).

    The idea of using the memory reference returned by Perl's internal heap mechanism is interesting, but I'm not sure it buys much here.

    Anyway, the original post is "walking on the edge" of usability. If his files are much bigger than the machine's memory, then the hash will not fit either, and there are better ways: using a database, doing multiple passes, etc. (or keeping the files clean in the first place...)
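    One cheap middle ground (a sketch only; the file name and record format are invented) is to tie the lookup hash to a DBM file, so the table lives on disk instead of in RAM:

        use Fcntl;
        use SDBM_File;

        # Tie the lookup hash to a disk file so it isn't bounded by RAM.
        # (SDBM creates seen_phones.dir and seen_phones.pag on disk.)
        tie(my %seen, 'SDBM_File', 'seen_phones', O_RDWR | O_CREAT, 0644)
            or die "Cannot tie seen_phones: $!";

        while (my $line = <>) {
            my ($name, $phone) = split /,/, $line;
            next if $seen{$phone};    # lookup goes to the DBM file
            $seen{$phone} = 1;
            print $line;              # keep first occurrence only
        }

        untie %seen;

    SDBM has small per-entry size limits, so DB_File or a real database scales further, but the idea is the same.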

    Cheers.