Re: What could cause excessive page faults? (A fix)
by BrowserUk (Patriarch) on Mar 11, 2010 at 12:10 UTC
To whom it may concern.
Commenting out line 25 of win32\VMem.h fixes the problem of wildly excessive page faults that cause a quadratic slowdown of memory allocations under some circumstances.
/* vmem.h
 *
 * (c) 1999 Microsoft Corporation. All rights reserved.
 * Portions (c) 1999 ActiveState Tool Corp, http://www.ActiveState.com/
 *
 * You may distribute under the terms of either the GNU General Public
 * License or the Artistic License, as specified in the README file.
 *
 * Options:
 *
 * Defining _USE_MSVCRT_MEM_ALLOC will cause all memory allocations
 * to be forwarded to MSVCRT.DLL. Defining _USE_LINKED_LIST as well will
 * track all allocations in a doubly linked list, so that the host can
 * free all memory allocated when it goes away.
 * If _USE_MSVCRT_MEM_ALLOC is not defined then Knuth's boundary tag algorithm
 * is used; defining _USE_BUDDY_BLOCKS will use Knuth's algorithm R
 * (Buddy system reservation)
 *
 */
#ifndef ___VMEM_H_INC___
#define ___VMEM_H_INC___
#ifndef UNDER_CE
//#define _USE_MSVCRT_MEM_ALLOC // <<<<<<<<<< HERE
#endif
With this fix, the OP's snippet that took 32+ seconds to run now takes just 0.4 seconds:
Update: I should have mentioned: 37,000 page faults instead of 5.7 million. Memory consumption is the same in both cases.
.\perl -MTime::HiRes=time -wE"my $t=time; my @a=map $_,map $_,1..1e6;say time-$t;<>"
0.392155885696411
I've tried to reason about the possible consequences of this change, but get lost in the layers upon layers of conditional redefinition, redirection and misdirection in the perl sources. It doesn't appear to cause any additional test suite failures, but then I seriously doubt if the appropriate circumstances are being tested anywhere.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Somewhat related (or not) to using a different memory allocator than MSVCRT: Patch to make string-append on Win32 100 times faster. This change makes Perl grow/realloc strings geometrically instead of by a fixed amount, and hence showed a speedup, at least on Windows, by avoiding calls to realloc(). On Linux, at least with certain allocators, a slowdown was found. For BSD, Perl already uses its own allocator.
Many thanks for that. It looks like I'll be upgrading from 5.10 to 5.14 when it comes, in the hope that this patch addresses the OP problem.
Though I do wonder if my one-line patch wouldn't have fixed that too?
Re: What could cause excessive page faults?
by GrandFather (Saint) on Mar 11, 2010 at 08:26 UTC
From outside the box it looks like Perl is creating an intermediate list between the two maps and somewhere before 5e5 elements the system starts swapping. Before the swapping starts the page faults happen when more memory is needed to extend the lists giving about 3500 bytes per fault - close enough to a 4K page size perhaps. At some point the lists get too big to stay resident and the system starts swapping with the resultant increase in page faults (now against non-resident pages) and consequent increase in time (due to the page fetches from disk).
What seems very odd is that it is happening on a 64-bit system with (I presume) plenty of RAM! However, I get very similar results with a 32-bit Windows system and 5.10.1, btw.
True laziness is hard work
There is no swapping involved. I have 4 GB of RAM, and well over 2 GB of that was available at the point the scripts reached maximum usage.
Page faults can also occur due to Windows' 2-stage virtual memory allocation scheme. Virtual memory can be 'reserved' by a process without being 'committed'. Reservation means that space is set aside within the process address space and page tables within the OS internal structures, but no actual physical memory is yet assigned to back that reservation up. When access is first attempted to a 'reserved' page of virtual memory, a page fault occurs, and the page (or many) must be 'committed' before the memory access completes.
The way this normally works is that, in the executable header, there are 2 values used to reserve stack and heap space for the process when it is loaded. There are 2 other values which define how much of the reservation gets committed each time a reserved-page page fault occurs:
C:\test>dumpbin /headers \perl64\bin\perl.exe
Microsoft (R) COFF/PE Dumper Version 9.00.21022.08
Copyright (C) Microsoft Corporation. All rights reserved.
Dump of file \perl64\bin\perl.exe
PE signature found
File Type: EXECUTABLE IMAGE
FILE HEADER VALUES
8664 machine (x64)
5 number of sections
4B60BA96 time date stamp Wed Jan 27 22:13:42 2010
0 file pointer to symbol table
0 number of symbols
F0 size of optional header
23 characteristics
Relocations stripped
Executable
Application can handle large (>2GB) addresses
OPTIONAL HEADER VALUES
...
1000000 size of stack reserve
100000 size of stack commit
1000000 size of heap reserve
100000 size of heap commit
...
Note: the above values are non-standard, as I have changed them in an attempt to track this down. The problem is, the above settings have no effect upon the outcome.
I think I've tracked the problem to <perlsources>\win32\vmem.h. Specifically,
// line 134
VMem::VMem()
{
    m_lRefCount = 1;
    InitializeCriticalSection(&m_cs);
#ifdef _USE_LINKED_LIST
    m_Dummy.pNext = m_Dummy.pPrev = &m_Dummy;
    m_Dummy.owner = this;
#endif
    m_hLib = LoadLibrary("msvcrt.dll");
    if (m_hLib) {
        m_pfree = (LPFREE)GetProcAddress(m_hLib, "free");
        m_pmalloc = (LPMALLOC)GetProcAddress(m_hLib, "malloc");
        m_prealloc = (LPREALLOC)GetProcAddress(m_hLib, "realloc");
    }
}
And I think this in the makefile is a contributory factor:
LIBC = msvcrt.lib
No proof yet. (I'm on my 3rd Perl build, and geez, they take a long time!) Just gut feel at this point. What I can say is that the problem still manifests itself when Perl and all its libraries are built with the same compiler (use the same CRT), and that using PERL_MALLOC doesn't change the situation.
However I get very similar results with a 32 bit Windows system and 5.10.1 btw.
Thanks for that. I don't suppose you have an AS 5.8.something install kicking around that you could try this on?
Re: What could cause excessive page faults?
by Marshall (Canon) on Mar 11, 2010 at 03:11 UTC
I find this syntax for map confusing and always use the curly-brace syntax, map {...}.
perl -MTime::HiRes=time -wE"my $t=time; my @a=1..1e6;say time-$t;"
is way faster than:
perl -MTime::HiRes=time -wE"my $t=time; my @a=map $_,1..1e6;say time-$t;"
because the map generates an anon array that is an "extra step" and that array is copied to @a.
In your later code, you have a map within a map which is similar to a foreach within a foreach. So it is going to run like 1 million times slower.
On my machine...
C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=(1..1e6);say time-$t;"
0.0974979400634766

C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=map $_,1..1e6;say time-$t;"
0.337559938430786

C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=map {$_}1..1e6;say time-$t;"
0.339046001434326
because the map generates an anon array that is an "extra step" and that array is copied to @a.
No array is created. One million scalars are created, but the question wasn't about that snippet. It was provided as a baseline.
In your later code, you have a map within a map which is similar to a foreach within a foreach. So it is going to run like 1 million times slower.
No, it's not multiplicative like a foreach in a foreach. It's additive like a foreach after a foreach.
my @array = map A, map B, LIST;
is functionally similar to
my @list1;
for (LIST) {
    push @list1, B;
}

my @list2;
for (@list1) {
    push @list2, A;
}

my @array = @list2;
Snippet three should take about 0.21 + (0.21-0.06) = 0.36, but it's taking 31.48 due to excessive paging. Why is it paging so much?
On my machine for this, I get:
C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=map{$_}map{$_}1..1e6;say time-$t;"
12.703125
which I think is similar to: my @array = map A, map B, LIST;
I am actually surprised that on a large 64-bit machine running some kind of *nix there are any page faults at all. I mean, why does the "simple" version page fault? Sorry, I don't know.
Update: oh, I see that this is Windows. I presume Win 7 rather than Vista? There are a bunch of versions of this OS; that might matter.
More tests on my 32-bit Win XP Pro machine with 2 GB memory, AS 5.10.1:
C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=map{$_}map{$_}1..100e3;say time-$t;"
0.193822860717773

C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=map{$_}map{$_}1..200e3;say time-$t;"
0.609375

C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=map{$_}map{$_}1..300e3;say time-$t;"
1.28125

C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=map{$_}map{$_}1..100e3;say time-$t;"
0.18930196762085

C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=map{$_}map{$_}1..250e3;say time-$t;"
0.921875

C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=map{$_}map{$_}1..500e3;say time-$t;"
3.375

C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=map{$_}map{$_}1..1000e3;say time-$t;"
13.15625

C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=map{$_}map{$_}1..2000e3;say time-$t;"
50.140625
Another "benchmark update":
C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=1..16000e3;say time-$t;"
1.5

C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=map{$_}1..16000e3;say time-$t;"
5.546875

C:\Projects>perl -MTime::HiRes=time -wE"my $t=time; my @a=map{$_}map{$_}1..16000e3;say time-$t;"
3173.625
The maps above essentially don't do anything useful at all, but this shows the quadratic increase in execution time on my 32-bit Win machine (doubling the list size roughly quadruples the runtime). So I don't think this is specific to 64-bit machines. Apparently map uses more memory than one would think for a "do nothing" operation, and Perl winds up accessing this extra memory in a way that causes a lot of page faults, which would indicate that Perl is not cycling through sequential memory locations. Why that is and how that works, I don't know yet. But at least I can say this happens on 32-bit machines also.