
This takes 5.25 seconds on my machine (4GB, 4 cores, Linux).

By "This", I assume you mean this:

    use 5.010;
    use Time::HiRes qw(time);

    my $data = 'x' x 500 * 1024**2;
    my $t = time;
    for my $n ( 1 .. 100 ) {
        unless (fork) {
            substr( $data, 4096 * $_ + $n, 1 ) |= 1 for 0 .. 124;
            exit;
        }
    }
    waitall;
    say time - $t;

Sounds good, but it isn't doing what you think it's doing, nor what I intended, because of uncorrected typos in the (untested) example that you've "ignored". So, let's correct the deficiencies in the forked code:

I bet if you put back the strict and warnings, and fix the errors so that it is actually doing something:

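    use strict;
    use warnings;
    use 5.010;
    use Time::HiRes qw(time);

    # Parenthesised: 'x' x 500 * 1024**2 parses as ('x' x 500) * 1024**2
    my $data = 'x' x ( 500 * 1024**2 );
    my $t = time;
    for my $n ( 1 .. 100 ) {
        unless (fork) {
            # Child: touch one byte in each of 125 blocks of 4096 bytes
            substr( $data, 4096 * $_ + $n, 1 ) |= chr(1) for 0 .. 124;
            exit;
        }
    }
    1 while wait() != -1;    # reap all children; perl has no built-in waitall
    say time - $t;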

it will take a little longer. (I'll have to sort out and fire up my Ubuntu VM to be sure.)
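One of the slips in the original one-liner is operator precedence: x and * sit at the same precedence level and associate left to right, so 'x' x 500 * 1024**2 repeats first and multiplies afterwards. A minimal sketch with small numbers; the variable names $n, $k, $wrong and $right are made up for illustration:

```perl
use strict;
use warnings;

my ( $n, $k ) = ( 5, 2 );

# Parsed as ('x' x $n) * $k: the string 'xxxxx' numifies to 0, so the result is 0.
my $wrong = do { no warnings 'numeric'; 'x' x $n * $k };

# Parenthesised: the count is computed first, then the string is repeated.
my $right = 'x' x ( $n * $k );

printf "without parens: '%s' (length %d)\n", $wrong, length($wrong);
printf "with parens:    '%s' (length %d)\n", $right, length($right);
```

So without the parentheses, $data ends up as the number 0 rather than a 500MB string, and the children have almost nothing to copy.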

The respective threads versions take: 164.842848062515 172.88907790184

Well yeah! If you use threads like fork, it will take a while.

For a start, you're running each of the 100 threads serially.

Second, you're building a new copy of the 500MB string in each thread, modifying 125 bits within it, and then discarding it.

The point of the exercise was to modify the same string, not 100 * 500MB copies. That's 100 times what the forked version is sharing! You're surprised that it is slow?
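To illustrate the copying described above, here is a minimal sketch, assuming a perl built with ithreads. The sub name write_in_thread and the 8-byte string are made up for illustration; this is not the code from the earlier post.

```perl
use strict;
use warnings;
use Config;

# Returns ( parent's string, thread's string ) after a thread writes to $data.
sub write_in_thread {
    require threads;                     # only loads on a perl built with ithreads
    my $data = 'x' x 8;
    my $child_view = threads->create( sub {
        substr( $data, 0, 1 ) = 'y';     # modifies the thread's private clone
        return $data;
    } )->join;
    return ( $data, $child_view );
}

if ( $Config{useithreads} ) {
    my ( $parent, $child ) = write_in_thread();
    print "parent sees $parent, thread saw $child\n";
}
else {
    print "this perl was built without ithreads\n";
}
```

The parent's string is unchanged because a non-shared variable is cloned into each new thread. With a 500MB string and 100 threads, that cloning is where the time goes.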

So, let's address those deficiencies of your threaded version:

    #! perl -slw
    use strict;
    use 5.010;
    use threads ( stack_size => 4096 );
    use Thread::Queue;
    use Time::HiRes qw(time);

    use constant DATA_SIZE => (500 * 1024**2);

    sub thread {
        my( $Q, $n ) = @_;
        $Q->enqueue( 4096 * $_ + $n ) for 0 .. 124;
        $Q->enqueue( undef );
    }

    my $t = time;
    my $Q = new Thread::Queue;
    my @t = map threads->create( \&thread, $Q, $_ ), 1 .. 100;

    my $data = 'x' x DATA_SIZE ;

    for(1..100) {
        substr( $data, $_, 1 ) |= chr(1) while $_ = $Q->dequeue;
    }

    $_->join for @t;

    say time - $t;
    printf "%d bits changed\n", DATA_SIZE - ( $data =~ tr[x][] );

    __END__
    c:\test>junk44
    2.03600001335144
    12500 bits changed

So, 100 threads calculating modifications to 1/2GB of data, and 2.03 seconds.

Oh. And when you work out how to print the last line of output, we can try for the real range of 0 .. 124_999 :)


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP an inspiration; A true Folk's Guy

Re^25: Strange memory leak using just threads (forks.pm)
by Anonymous Monk on Sep 23, 2010 at 23:27 UTC
    By "This", I assume you mean this: (...)
    No. Of course, I fixed the obvious issues first - in particular the precedence issue in my $data = 'x' x 500 * 1024**2; Also, my perl doesn't have a waitall, so I wrote wait for 1..100. For the other, irrelevant warnings (caused by 'x' |= 1) I just disabled warnings, because fixing this properly with |= chr(1) doesn't cause any significant runtime difference (it still takes ~5 secs). Just for the record.
      because fixing this properly with |= chr(1) doesn't cause any significant runtime difference

      Hm. Seems strange. Without the fix, there should be no changes made to the COW memory. With it, 125 pages of 4096 bytes should be replicated in each of the 100 forks (roughly 50MB in total).

      It's not a huge amount, but I'd expect to see some difference unless there is some other reason for it not to happen.

      Guess I'll just have to reinstall my *nix VM.


        ...there should be no changes made to the COW memory
        Why not? Both versions modify the string - just differently:

            $ perl -le '$data = "xxx"; substr($data, 1, 1) |= 1; print $data'
            x1x
            $ perl -le '$data = "xxx"; substr($data, 1, 1) |= chr(1); print $data'
            xyx
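The two one-liners differ because | picks its mode from its operands: with a number on one side it is a numeric OR, with two strings it is a bytewise string OR. A small sketch of just that step, assuming the classic (pre-"bitwise" feature) semantics; the variable names are made up for illustration:

```perl
use strict;
use warnings;

my $byte = 'x';    # ord 0x78

# One operand is a number, so this is a numeric OR: 'x' numifies to 0, 0 | 1 is 1.
my $numeric_or = do { no warnings 'numeric'; $byte | 1 };

# Both operands are strings, so this is a bytewise OR: 0x78 | 0x01 is 0x79, 'y'.
my $string_or = $byte | chr(1);

print "numeric: $numeric_or\n";
print "string:  $string_or\n";
```

Either way a byte of the string is overwritten, so both versions write to the page that holds it.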