in reply to Efficiency and Large Arrays

Since any serial number could be duplicated, and you will change one of the duplicates to another arbitrary number anyway, it seems the actual number is not really important. Why not just reallocate the serial number on _every_ record as you read from top to bottom, starting from 1? This gets rid of the need to worry about duplicated serial numbers and allows a single pass from top to bottom.

To handle duplicated phone numbers, keep a hash with just the phone numbers as keys. When you read a record, if the phone number is already a key in the hash, don't print the record; otherwise, set the hash value for that phone number to 1 and print the record.
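
Here is a rough sketch of that single pass, in Python rather than Perl purely for illustration. The record layout (a pair of old serial and phone number) and the function name are my own assumptions, not anything from the original question. I have also assumed the new serial is handed out only to records that are actually printed, so the output is numbered 1, 2, 3 with no gaps. A Python set plays the role of the hash whose values are all 1.

```python
def renumber_and_dedupe(records):
    """Single pass: drop repeated phone numbers, renumber the rest from 1.

    `records` is assumed to be an iterable of (old_serial, phone) pairs;
    that layout is hypothetical.
    """
    seen = set()          # phone numbers already printed (the "hash")
    next_serial = 1
    for _old_serial, phone in records:
        if phone in seen:
            continue      # duplicate phone number: skip the record
        seen.add(phone)
        yield next_serial, phone   # old serial is discarded on purpose
        next_serial += 1


if __name__ == "__main__":
    sample = [(7, "555-0100"), (7, "555-0101"), (9, "555-0100"), (3, "555-0102")]
    for serial, phone in renumber_and_dedupe(sample):
        print(serial, phone)
```

Memory use grows with the number of distinct phone numbers, not with the number of records, since only the keys are kept.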

Resetting Serials
by gryng (Hermit) on Jul 23, 2000 at 18:18 UTC
    I agree with you; that's what I was getting at in my previous post. You can also reduce memory further if you presort the data on phone numbers. Then duplicates sit next to each other, so you only need to compare each record with the previous one and don't have to keep a hash at all.

    Ciao,
    Gryn
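
For the presorted variant gryng describes, a minimal sketch might look like the following. It is in Python for illustration only, and it assumes the records arrive already sorted by phone number (for example from an external sort). The (old_serial, phone) layout and the function name are hypothetical.

```python
_NOTHING_YET = object()   # sentinel that never equals a real phone number


def renumber_sorted(records):
    """Dedupe records presorted by phone, remembering only the last phone.

    `records` is assumed to be an iterable of (old_serial, phone) pairs,
    already sorted on phone; that layout is hypothetical.
    """
    previous_phone = _NOTHING_YET
    next_serial = 1
    for _old_serial, phone in records:
        if phone == previous_phone:
            continue      # same phone as the record just before: duplicate
        previous_phone = phone
        yield next_serial, phone
        next_serial += 1


if __name__ == "__main__":
    sample = [(7, "555-0100"), (9, "555-0100"), (7, "555-0101"), (3, "555-0102")]
    for serial, phone in renumber_sorted(sample):
        print(serial, phone)
```

The trade-off is that the output comes out in phone-number order rather than the original file order, and the sort itself has a cost of its own.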