in reply to Finding unique email addresses (was: Help!)

Since you told me in the Chatterbox that you have huge amounts of email addresses, I'd suggest you use some kind of database. DBI could be used to save your data to a "real" database. If that is overkill, you could use a flat-file database instead, via AnyDBM_File for example. As suggested above, you can use a hash as well, but then you might run out of memory.
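For what it's worth, here is a minimal sketch of the flat-file route: a hash tied to a DBM file through AnyDBM_File, so the "seen" set lives on disk instead of in RAM. The sub name unique_addresses, the temp-file path and the sample addresses are made up for illustration, and lowercasing the whole address before comparing is an assumption you may not want. The backend list is restricted to DB_File and SDBM_File because both accept the same tie arguments.

```perl
use strict;
use warnings;
use Fcntl;                      # exports O_RDWR and O_CREAT
use File::Temp qw(tempdir);

# Prefer DB_File, fall back to SDBM_File (bundled with Perl).
BEGIN { @AnyDBM_File::ISA = qw(DB_File SDBM_File) }
use AnyDBM_File;

# Hypothetical helper: returns the addresses not seen before,
# remembering every address in a DBM file at $dbm_path.
sub unique_addresses {
    my ($dbm_path, @addresses) = @_;

    tie my %seen, 'AnyDBM_File', $dbm_path, O_RDWR | O_CREAT, 0644
        or die "Cannot tie $dbm_path: $!";

    my @unique;
    for my $addr (@addresses) {
        my $key = lc $addr;         # assumption: compare case-insensitively
        next if defined $seen{$key};
        $seen{$key} = 1;
        push @unique, $addr;
    }

    untie %seen;
    return @unique;
}

my $dir = tempdir(CLEANUP => 1);
print "$_\n" for unique_addresses("$dir/seen",
    qw(a@foo.com b@foo.com A@foo.com a@foo.com));
# prints:
# a@foo.com
# b@foo.com
```

Because the DBM file persists between runs, pointing the next day's run at the same path would also skip addresses seen on earlier days.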

Regards,
-octo

Re: Re: Finding Unique E-Mail addresses
by talexb (Chancellor) on Jul 29, 2002 at 13:38 UTC
    Great idea. You could even do something clever: split each address at the '@' and, if you've seen that domain before, append the user to that domain's entry, so that the database stores
    • domain -> foo.com
    • user -> james.barr:mark.bazz
    It all depends on how much RAM you have, how much DB space, and so forth.
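A rough sketch of that domain split, in plain Perl with an in-memory hash (the same structure could sit behind a tied hash or a database table). The sub name group_by_domain and the extra ann@bar.org address are hypothetical, and lowercasing the domain is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref mapping each domain to a
# colon-joined list of the distinct users seen for it.
sub group_by_domain {
    my (@addresses) = @_;

    my %users_for;
    for my $addr (@addresses) {
        # Split at the last '@'; skip anything that is not user@domain.
        my ($user, $domain) = $addr =~ /\A(.+)\@([^\@]+)\z/ or next;
        $users_for{lc $domain}{$user} = 1;   # inner hash drops duplicates
    }

    # Flatten each inner hash to the "user1:user2" form shown above.
    my %joined;
    for my $domain (keys %users_for) {
        $joined{$domain} = join ':', sort keys %{ $users_for{$domain} };
    }
    return \%joined;
}

my $by_domain = group_by_domain(
    qw(james.barr@foo.com mark.bazz@foo.com james.barr@foo.com ann@bar.org));
print "$_ -> $by_domain->{$_}\n" for sort keys %$by_domain;
# prints:
# bar.org -> ann
# foo.com -> james.barr:mark.bazz
```

Storing each domain once can save space when many addresses share a few domains, at the cost of rewriting the user list on every new address for that domain.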

    --t. alex

    "Mud, mud, glorious mud. Nothing quite like it for cooling the blood!"
    --Michael Flanders and Donald Swann

    Update: It's debatable if "using a database is an overkill". The data file is 300M daily, which is getting large. You could use a temporary database table and have MySQL worry about optimization and memory management rather than rely on one of the DBD modules.

      I think using a database is an overkill. Why not simply use a hash tied to a DB file?
      It's much easier than installing a database and the DBI modules to suit.
      It'll most likely be faster, too, due to the simple nature of the data structure.

      --

      Brother Frankus.
