Okay. If anyone with perl v5.8.2 (AS 808) running under XP (or a similar configuration) is following this discussion, could they please run the following code under these conditions?

  1. Download the code below and save as "buk.pl".
  2. Create a datafile of, say, 30MB. It doesn't matter what it contains.
  3. Start the task manager and configure it with:
    1. Click the "Performance" tab and note how much Available Physical Memory your system has.
    2. Click the "Processes" tab.
    3. View->Select Columns...

      Ensure that "Memory usage", "Memory Usage Delta" & "Virtual Memory Size" columns are all checked.

    4. Ensure that all 3 columns are visible (preferably next to each other, by temporarily unchecking any intermediate ones).
    5. Check View->Update speed->High.
    6. Check Options->Always on top.
    7. Adjust the task manager window to a convenient size and position so that you can monitor it whilst running the code.
    8. Click the "CPU" column header a couple of times to ensure that the display is sorted by CPU usage in descending order.
  4. Switch to a command line and run the program.

    buk datafile
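
For step 2, a throwaway datafile can be produced with something along these lines. This is only a sketch and not part of the test itself: the script name "mkdata.pl", the default output name "datafile", the 79-character line width and the make_line helper are all my own assumptions. Since the contents don't matter, it simply repeats one line of random letters until the file reaches the requested size.

```perl
#! perl -w
use strict;

# Hypothetical helper script (an assumption, not from the original post):
# writes a file of roughly the requested size made of identical short lines.
my $name  = shift || 'datafile';   # output file name (assumed default)
my $megs  = shift || 30;           # approximate size in MB
my $width = 79;                    # characters per line, before the newline

# Build one line of $len random lowercase letters plus a newline.
sub make_line {
    my( $len ) = @_;
    return join( '', map { ( 'a' .. 'z' )[ rand 26 ] } 1 .. $len ) . "\n";
}

open( OUT, '>', $name ) or die "Cannot open $name: $!";
binmode OUT;                       # no CRLF translation, so sizes are exact

my $line    = make_line( $width );
my $target  = $megs * 1024 * 1024;
my $written = 0;

# Repeat the same line until at least $target bytes have been written.
while( $written < $target ) {
    print OUT $line;
    $written += length $line;
}

close OUT or die "Cannot close $name: $!";
print "Wrote $written bytes to $name\n";
```

Run it as "perl mkdata.pl datafile 30" before starting the test; any other file of similar size should serve just as well.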

Watch the 3 memory columns for perl.exe as the program runs. (It should become the top item if you followed the above directions and don't have any other CPU-intensive processes running.)

Watch carefully, and note how the "Mem Usage" figure steadily rises for a short period before suddenly dropping back.

The "Mem Delta" figure will become negative (the value is displayed in parentheses) each time the "Mem Usage" figure falls back.

Note that the "VM Size" value tracks the "Mem Usage" closely whilst being slightly larger, and grows steadily for a short period before falling back in step with "Mem Usage".

Note that each time it falls back it doesn't fall as far as it grew, resulting in an overall steady increase in the memory usage.

Note that the fallbacks seem to grow larger and more frequent with time.

Once you have seen enough, ^C the program.

Don't allow the "Mem Usage" value to approach the "Available Physical Memory" figure. By then you will have moved into swapping, and the picture becomes confused: the OS starts swapping memory from other processes to disk, and all the "Mem Delta" figures start showing (negative) decreases.

I'd be really grateful if at least one other person could confirm that they too see the behaviour described.

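    #! perl -slw
    use strict;

    my @cache;

    open( FH, '< :raw', $ARGV[ 0 ] ) or die $!;

    while( <FH> ) {
        # Append every character of the current line to the cache.
        push @cache, split '', $_;

        # Shift characters off the front, pairing each with the one
        # 500 positions ahead; the last 500 stay cached for the next line.
        my $pair = shift( @cache ) . $cache[ 499 ] for 0 .. $#cache - 500;
    }

    close FH;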

Assuming that this behaviour isn't a figment of my imagination and is confirmed by others: if anyone has a better explanation for the (temporary, but often substantial) reductions in perl.exe's memory usage than Perl periodically freeing heap memory back to the OS as part of some "garbage collection like" process, I'm ready to eat my hat and apologise for misleading the monks.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

In reply to Re: Re: Re: Re: Optimising processing for large data files. by BrowserUk
in thread Optimising processing for large data files. by BrowserUk
