NYTAllergy has asked for the wisdom of the Perl Monks concerning the following question:

I'm running a pretty general 'log processing' Perl script, where I open log files one at a time, read the lines, parse them and update counters, then close and read the next log file. The hash I'm storing data in can grow over time, as new users are found and added to the tally list. The number of users will tend to grow quickly at first, then level out as the same users are seen over and over again.
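
For reference, a stripped-down, hypothetical version of the loop (the real script parses more fields and keeps more counters; the file names and log format here are made up):

#!/usr/bin/perl
use strict;
use warnings;

# Simplified, hypothetical sketch of the tallying loop.
my %tally;

for my $logfile (glob 'logs/*.log') {                 # illustrative file layout
    open my $fh, '<', $logfile or die "Can't open $logfile: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my ($timestamp, $user, $bytes) = split ' ', $line;   # made-up format
        next unless defined $user;
        $tally{$user}{hits}++;                        # hash grows as new users appear
        $tally{$user}{bytes} += $bytes || 0;
    }
    close $fh;
}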

The problem is that I seem to be hitting a limit where the script (nearly) grinds to a halt. It's still going... but very slowly. I'm almost certain that I'm looking at an OS or ActivePerl issue, but I'm querying here to try and make sure I'm not missing anything.

Consistently when I get to ~2GB worth of data read, the loop 'ticker' slows down to intermittent activity... nothing for minutes at a time, a burst of a few hundred lines read and processed, then nothing for a few more minutes.

This is a dual-CPU Windows box. CPU utilization on processors 1 and 2 is an almost *exact* mirror image, reflected around 50%. In other words, if one is at 75% the other is at 25%; if one is at 80% the other is at 20%; etc. This keeps average CPU utilization almost exactly at 50% (average 51.499%, max 100%). Over a period of 10 seconds a single CPU might jump from 85% to 25% to 70% to 40% to 60%... up/down pretty consistently.

At the same time, context switches/sec hold pretty consistently around 4000. But the max is ~7000.

Now, all the while this is going on, I can see that the memory usage for the perl process is consistently growing. Sometimes 4 KB/sec, sometimes nothing for a few seconds, sometimes 100 KB/sec.

I can also see (by the ticker that I have in the script) that the activity in the script (i.e. the loop that reads and processes log lines) is very sporadic... it sits for several minutes at a time doing nothing, then processes a couple hundred lines, then does nothing for a few more minutes.

The Perl process itself... interesting. % Processor Time is pretty well pegged at 99 to 100%. Which makes sense: it's a dual-CPU box, and the overall CPU usage is pretty firmly averaging 50%, so the process' % Processor Time is going to be 100%. Handle count is solidly at 22. IO writes is 0. IO reads hover between 2 and 5, averaging 4. Page file bytes is pinned at 173,815,876... with an occasional jump to a new maximum, where it then stays pinned. Virtual bytes is pinned at 311,595,008, again with an occasional jump to a new maximum, where it then stays pinned. Virtual bytes max (perhaps obviously) is the same as Virtual bytes, likewise Page file bytes max. Pool paged bytes and Pool nonpaged bytes are almost identical at ~1,332,000.

Now then... every once in a while (sometimes once every 2 minutes, sometimes twice in 30 seconds) there is a peak where context switching and CPU utilization on both CPUs max out. I cannot draw a direct correlation between that peak and subsequent process memory utilization or script activity; in fact, I've observed instances where the script kicks in just after, and instances where it just sits there. There doesn't seem to be a direct relationship.

Is this an obvious paging issue, or problem with ActiveState Perl? Or is this an obscure (or not so obscure, I don't claim to be a Perl expert) Perl issue? Either way, any suggestions on what to do about it (besides switching to a database backend)?

Any and all suggestions and comments are greatly appreciated.

Re: Hitting memory limit? (ActiveState Perl)
by davido (Cardinal) on Jan 22, 2004 at 05:25 UTC
    It's possible that you're encountering the practical limits of using flat files and in-memory hashes. Perhaps it's time for a database approach. With a proper (and simple) database and some redesign of how you're doing things, you would never need to hold all users in memory at the same time, and wouldn't have to worry about running out of memory even as the user list grows.

    Without seeing actual code, it's difficult to give any more specific tips, but in general, when you start running out of memory it's time to reconsider your design philosophy. Often this means taking that step into the world of databases.

    The DBI module, and DBD::SQLite can help to ease the pain of such a transition.

    Update: Corrected typo where DBD::SQLite was referred to as DBM::SQLite


    Dave

      In terms of moving to a database, but still staying within Perl, are the aforementioned DBI and DBD::SQLite the "best" options? Are there others that I should look at as well, to make sure that I'm using the interface / database that will work best for my particular use?
        A database solution allows for fast "random" read/write access to data. This can assist you in overcoming the need to build large in-memory hashes. This may be only one solution, but can be a powerful and highly expandable one.

        Your logfiles could even be written to databases, though that's probably less important than being careful not to slurp too much in at a time from your logfiles as you parse them. The key is to parse and write out to the database without building up gigantic hashes of usernames and other data.

        To this end, DBI and DBD::SQLite are very good options. They will require that you learn some SQL. But DBD::SQLite is such a powerful (albeit lightweight) database that you are likely not to need anything heavier (such as Oracle) for a long time. DBD::SQLite is the database driver and the database engine wrapped into one module. DBI.pm is the "Database Interface", which allows you to write code in a database-independent way rather than getting down to the nitty-gritty of dealing directly with a low-level database driver. You use the two together: DBI as the API to DBD::SQLite. I do consider them to be very good options if you're looking for ways to scale your project beyond its current capacity.
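
        As a rough sketch of what that might look like (the table, column names, and log format below are invented for illustration, not taken from your script):

        use strict;
        use warnings;
        use DBI;

        # Hypothetical example: tally hits per user in SQLite instead of a hash.
        my $dbh = DBI->connect( 'dbi:SQLite:dbname=tally.db', '', '',
                                { RaiseError => 1, AutoCommit => 0 } );

        $dbh->do( q{
            CREATE TABLE IF NOT EXISTS user_tally (
                user TEXT PRIMARY KEY,
                hits INTEGER NOT NULL DEFAULT 0
            )
        } );

        my $ins = $dbh->prepare( 'INSERT OR IGNORE INTO user_tally (user, hits) VALUES (?, 0)' );
        my $upd = $dbh->prepare( 'UPDATE user_tally SET hits = hits + 1 WHERE user = ?' );

        while ( my $line = <> ) {              # log file names on the command line
            my( $user ) = split ' ', $line;    # made-up log format
            next unless defined $user;
            $ins->execute( $user );
            $upd->execute( $user );
        }

        $dbh->commit;                          # in real use, commit in batches for speed
        $dbh->disconnect;

        Because user is the primary key, the INSERT OR IGNORE / UPDATE pair keeps exactly one row per user, so memory stays flat no matter how many users turn up.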


        Dave

Re: Hitting memory limit? (ActiveState Perl)
by BrowserUk (Patriarch) on Jan 22, 2004 at 06:56 UTC

    You've given minimal information about your script, but it sounds like you're hitting the swapper.

    You neglect to mention how much memory is in use when the slow-down occurs, which is probably as important as the CPU utilisation. The "Physical Memory" and "Commit Charge" figures from the Performance tab of the Task Manager are your best indicators.

    Alternatively, if you have Win32::API::Prototype installed, you could use this script to log memory use alongside your ticker, and get an indication of the memory used -v- performance.

    #! perl -slw
    use strict;
    use Win32::API::Prototype;

    my $memoryStatus = AllocMemory( 32 );

    ApiLink( 'Kernel32', 'GlobalMemoryStatus', 'P', 'V' );
    GlobalMemoryStatus( $memoryStatus );

    my @keys = qw[
        length             PercentageLoad
        TotalPhysical      AvailablePhysical
        TotalPaged         AvailablePaged
        TotalVirtualMemory AvailableVirtualMemory
    ];

    my %mStatus;
    @mStatus{ @keys } = unpack 'V8', $memoryStatus;

    printf "%30s : %.2f\n", $_, $mStatus{ $_ } / 2**20 for @keys[ 1 .. $#keys ];

    __END__
    P:\test>gMem
                    PercentageLoad : 0.00
                     TotalPhysical : 510.98
                 AvailablePhysical : 360.27
                        TotalPaged : 1250.05
                    AvailablePaged : 908.16
                TotalVirtualMemory : 2047.88
            AvailableVirtualMemory : 2018.61

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    Timing (and a little luck) are everything!

      On one test box (this is my laptop, I took a copy home with me):

      Physical Memory
      -----
      Total: 261560
      Available: 7000 (and dropping)
      System Cache: 58088 (and climbing)


      Commit Charge (K)
      -----
      Total: 381188 (pretty static)
      Limit: 630652 (very static)
      Peak: 395832 (very static)

      Kernel Memory (K)
      -----
      Total: 43932 (very static)
      Paged: 35876 (very static)
      Nonpaged: 8056 (very static)

      I watched it until Physical Memory:Available got all the way down to ~3000. At that point Physical Memory:Available jumped to ~20000, and at the same moment Physical Memory:System Cache jumped by ~1000. Then they both started dropping/climbing as before.

      I will run with your Win32::API::Prototype script and post results at the first available opportunity (Friday, likely).

        Your total Commit Charge far exceeds your total Physical Memory--by about 50%. You've been into swapping for a considerable time.

        Although the commit charge is fairly static, suggesting that the hash isn't growing much, each time you access a key within the existing hash it's quite possible that perl has to cycle through the entire hash structure to locate the value, which in turn could mean the OS having to swap in a huge portion of the process' image.

        You probably had trouble hearing your mp3 over the sound of the disc thrashing--your neighbours probably had the same problem :)

        Perl's hashes are pretty memory hungry, and if you are nesting them, filling memory doesn't take too much effort. I induced this machine, with 512MB of RAM, into swapping in 80 seconds with a hash containing a little under 6_000_000 keys with empty (undef) values. If your values are simple scalars you'll get there much quicker. If they are themselves hashes or arrays, much quicker still.

        sub rndStr{ join '', @_[ map{ rand @_ } 0 .. shift ] };

        $| = 1;
        ( $_ % 1000 or printf "\r$_ : %s ", times )
            and $h{ rndStr 8, 'a' .. 'z', 'A' .. 'Z', 0 .. 9 } = undef
                for 0 .. 10_000_000;

        __END__
        5858000 : 80.703

        You could find out how many keys it takes to induce swapping by disabling swapping and logging scalar keys %your_main_hash periodically as you fill it. If the last number displayed before you get perl's "Out of memory" error is not too far short of your expected final size, then it might be worth looking at how you could use less memory. If it is a long way short, then that effort is almost certainly not worth it.
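
        Something as simple as this would do for the logging (the 10_000-line interval and the names here are arbitrary stand-ins for whatever your script actually uses):

        use strict;
        use warnings;

        my %tally;                                 # stands in for your main hash
        my $lines = 0;

        while ( my $line = <> ) {                  # your existing read loop
            my( $user ) = split ' ', $line;        # your existing parsing
            $tally{ $user }++ if defined $user;

            printf STDERR "%10d lines : %10d keys\n", $lines, scalar keys %tally
                unless ++$lines % 10_000;
        }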

        There are some alternatives to hashes which can save memory, depending on the nature of your keys and what use you are making of the hash (see A (memory) poor man's hash lookup table for one possible, partial solution), but if you are regularly dealing with these volumes of data, a DB of some form is probably your easiest option.
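
        By way of illustration only--this assumes your keys are, or can cheaply be mapped to, smallish integers (numeric user ids, say), which they may well not be--a single string addressed with vec() costs a few bytes per slot rather than a full hash entry per key:

        use strict;
        use warnings;

        my $counters = '';                            # auto-extends as higher ids are seen

        while ( my $line = <DATA> ) {
            my( $id ) = $line =~ /^(\d+)/ or next;    # made-up log format: numeric id first
            vec( $counters, $id, 32 )++;              # one 32-bit counter per id
        }

        printf "user %d seen %d times\n", $_, vec( $counters, $_, 32 ) for 0 .. 4;

        __DATA__
        3 some log line
        3 another line
        1 yet another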


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail
        Timing (and a little luck) are everything!

Re: Hitting memory limit? (ActiveState Perl)
by Vautrin (Hermit) on Jan 22, 2004 at 12:58 UTC

    You're processing 2GB of data, and it seems like you're storing whatever you're looking for in memory. Even assuming you only keep 1/10th of the data you process, you're still looking at 200MB+ in memory. Does your computer have that much free?

    Dan
      The initial test computer has 1GB RAM. The additional test computer (my laptop) has 256MB. Curiously, both start thrashing at approximately the same point in processing.

      I'm definitely going to follow up on the previous suggestions to try and narrow down the scope of the problem, and reduce the size of the overall data collection to make sure the basic script structure isn't a problem.
Re: Hitting memory limit? (ActiveState Perl)
by neuroball (Pilgrim) on Jan 22, 2004 at 04:52 UTC

    You gave a nice description of how the computer reacts to your script. One problem though: to help you, we need to take a look at your code.

    Please edit your node and add the script between <code> tags, so that the monks might help you with their knowledge, not their ability to guess what might be wrong or what might be rewritten more efficiently.

    /oliver/

      I hesitated to post code before because it's several hundred thousand lines (well, that and the fact that I wrote some of it on no sleep and I know it blows chunks, aesthetics-wise ;). Would the preferred Monk-friendly approach be:

      (a) Post a URL
      (b) Post a snippet showing just the loop control and what I think are relevant portions (which may not, in fact, be relevant)
      (c) Something else entirely

      I am, as soon as I have a system available to me, testing on a Linux box as well for comparison purposes. I will also try paring down the tallying piece and re-testing to confirm that the problem does not occur unless the hash is growing, and then add bits back in a bit at a time to see whether I can identify anything specifically problematic.