My guess is also that your memory consumption leads to excessive swapping.

The general idea is presorting, for instance by iterating over the file multiple times and processing only one IP range per pass.

This is expensive in IO but will create smaller data structures.
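A minimal sketch of that multi-pass idea, not a drop-in solution. I'm assuming each log line starts with an IPv4 address and that the first octet is a good enough "range"; the sub name `count_per_ip_by_pass` and the report format are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: reads the input once per first octet listed in
# @$octets and keeps only the IPs of that one range in memory per pass.
# $file is a path (or a reference to a string, handy for testing).
sub count_per_ip_by_pass {
    my ($file, $octets) = @_;
    my @report;
    for my $octet (@$octets) {
        my %count;    # small: holds one range only
        open my $fh, '<', $file or die "Cannot open input: $!";
        while (my $line = <$fh>) {
            # Assumption: the IP is the first field of the line
            my ($ip) = $line =~ /^(\d+\.\d+\.\d+\.\d+)\b/ or next;
            next unless $ip =~ /^\Q$octet\E\./;    # not this pass
            $count{$ip}++;
        }
        close $fh;
        push @report, map { "$_ $count{$_}" } sort keys %count;
        # %count goes out of scope here, so its memory can be reused
    }
    return \@report;
}
```

You would call it like `count_per_ip_by_pass('access.log', [10, 172, 192])`, where the file name is only a placeholder. The price is one full read of the file per range, which is exactly the IO cost mentioned above.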

Only if you really need all the data present in memory at once, consider breaking up the ranges into a tree of nested data structures and processing them in linear order.

For example, use $hash->{'192'}{'168'}{'101'}{'208'} or $hash->{'192.168'}{'101.208'} instead of $hash->{'192.168.101.208'}. °

If you now process all IPs in order, Perl (well, the OS) will be able to swap out all memory pages holding unrelated sub-hashes. This will be cheap because the sorting minimizes the number of swaps. (See also Re: Small Hash a Gateway to Large Hash?)
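Here is a rough sketch of the two-level variant and of walking it in sorted order. The sub names (`add_ip`, `sort_dotted`, `walk_in_order`) and the "list of values per IP" payload are assumptions of mine, since I don't know what you actually store per address.

```perl
use strict;
use warnings;

# Hypothetical helper: file a value under $tree->{'192.168'}{'101.208'}
sub add_ip {
    my ($tree, $ip, $value) = @_;
    my ($net, $host) = $ip =~ /^(\d+\.\d+)\.(\d+\.\d+)$/
        or die "Not an IPv4 address: $ip";
    push @{ $tree->{$net}{$host} }, $value;
}

# Sort dotted numbers numerically: pack each part into one byte, so that
# plain string order on the packed form equals numeric order.
sub sort_dotted {
    my @packed = sort map { pack('C*', split /\./, $_) } @_;
    return map { join '.', unpack('C*', $_) } @packed;
}

# Visit all IPs in order; only one sub-hash is touched at a time.
sub walk_in_order {
    my ($tree, $callback) = @_;
    for my $net (sort_dotted(keys %$tree)) {
        my $hosts = $tree->{$net};
        for my $host (sort_dotted(keys %$hosts)) {
            $callback->("$net.$host", $hosts->{$host});
        }
    }
}
```

Fill the tree with `add_ip(\%tree, $ip, $value)` and then run `walk_in_order(\%tree, sub { my ($ip, $values) = @_; ... })`. While the callback works on one prefix, the sub-hashes of all other prefixes are left alone.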

Another approach is to use more compact data structures. Hashes are efficient for sparse data, but if your keys are IP octets ranging from 0 to 255, an array is certainly more efficient.
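To illustrate the array variant, a small sketch that counts hits per first octet. Which octet you index by and the sub name `count_first_octet` are my assumptions; the point is only that a dense 0..255 key needs no hash.

```perl
use strict;
use warnings;

# Hypothetical helper: $counts is a reference to an array with one slot
# per possible octet value (indices 0..255).
sub count_first_octet {
    my ($counts, $ip) = @_;
    my ($octet) = $ip =~ /^(\d+)\./ or return;    # skip non-IP input
    return if $octet > 255;                       # not a valid octet
    $counts->[$octet]++;
    return;
}
```

Initialise the counters with `my @counts = (0) x 256;` and call `count_first_octet(\@counts, $ip)` for every address. The array stores no key strings at all, only the 256 counters.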

Furthermore, there is no point in repeating URLs like "logmeinrescue.com" in your array; counting them is more memory efficient.
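Counting could look roughly like this. The layout `$urls_for->{$ip}{$url}` and the sub name `add_url` are hypothetical; adapt them to whatever your records really look like.

```perl
use strict;
use warnings;

# Hypothetical layout: $urls_for->{$ip}{$url} = number of occurrences
sub add_url {
    my ($urls_for, $ip, $url) = @_;
    # One key per distinct URL, no matter how often it repeats
    $urls_for->{$ip}{$url}++;
    return;
}
```

After calling `add_url(\%urls_for, $ip, $url)` for every line, an IP that hit the same URL thousands of times still costs only one hash entry plus a counter.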

Anyway, 800k lines of input doesn't sound heavy, so I'm not sure we have the full picture (???)

edit

As choroba already said, preloading the input completely into memory sounds like a waste of resources; you should check how much that costs.

OTOH, if you decide to implement my initial idea of processing one IP range after the other, preloading will reduce IO if (and only if) everything fits into memory.

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

°) I'm aware that 192.168.*.* is very common


In reply to Re: Hash Search is VERY slow by LanX
in thread Hash Search is VERY slow by rtjensen
