in reply to Memory Restrictions

In addition to the other monks' advice, please take into consideration that hashes are not designed to be "conservative" with regard to memory consumption. They are designed to be fast. Perhaps the exception to this is a relatively recent optimization where, IIRC, the text of shared hash keys is stored only once.

A simple algorithm would be something along the lines of this untested example:

use Digest::MD5 qw(md5_hex);

my @uniques = ();
my $md5;
while (my $string = <FILE>) {
    $md5 = md5_hex($string);
    if (grep { $md5 eq $_ } @uniques) {
        warn "$string is not unique\n";
        # or push() into another list...
    } else {
        push @uniques, $md5;
    }
}
# Now @uniques holds one MD5 digest per unique string
This should use less memory than I imagine your solution does, since only a 16-byte digest is kept per line rather than the line itself (though the grep makes it slower, because each lookup scans the whole list). Note that showing us some of your code can help us give better answers.
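For illustration only (and in Python rather than Perl), here is a sketch of the same digest idea, but with a set instead of the linear grep, so each membership test is O(1) while still storing only 16-byte digests rather than full lines:

```python
import hashlib

def unique_lines(lines):
    """Yield each line whose MD5 digest has not been seen before.

    Storing 16-byte digests instead of the full strings keeps the
    memory footprint bounded even when the input lines are long.
    """
    seen = set()  # set of raw 16-byte digests
    for line in lines:
        digest = hashlib.md5(line.encode()).digest()
        if digest in seen:
            print(f"{line!r} is not unique")
        else:
            seen.add(digest)
            yield line
```

The same trade-off applies in Perl: a hash keyed by the digests (rather than by the full strings) gives constant-time lookups for a modest per-entry overhead on top of the digest list above.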

Update: Ok, ok. I added the obligatory MD5 :).

Regards.

Replies are listed 'Best First'.
Re: Re: Memory Restrictions
by derby (Abbot) on Oct 24, 2002 at 12:46 UTC
    please take into consideration that hashes are not designed to be "conservative" with regard to memory consumption

quite true (and ++). perl will "over allocate" memory on the assumption you're always going to need more. If you know the size ahead of time (or can calculate it at run time), you can prevent the over-allocation by preallocating memory (check out perldata):

my @array;
$#array = 512;

# or

my %hash;
keys(%hash) = 512;

    but I don't think this is an issue with the original post.

    -derby