in reply to Re^5: Consumes memory then crashs
in thread Consumes memory then crashs
This is actually a good illustration of the point I made in my earlier post on threads. In a way, threads are like Gatling guns: if you're the Terminator and can handle them, they can be very effective; for most people, however, they provide a million opportunities to shoot themselves in the foot. Unlike with a Gatling gun, though, the holes may be rather subtle and may appear only after a long period of seemingly successful use: the typical heisenbugs that show up once in a while, but never while you are looking closely.
The problem here is that print is not atomic; in fact, most of stdio is taboo in threaded code without further protective measures. A thread may be preempted after writing only part of a buffer and then resume after another thread has written to the same file. In your example, which waits a long time between printing lines, the probability of this happening is very small, but that doesn't mean it can't happen, even to the first two lines of output. Here's a script that provokes it:
    use strict;
    use threads;

    open my $fh, '>', 'outfile' or die $!;
    my $th = 0;
    my @threads = map {
        $th++;
        async( sub {
            sleep(1);
            for (1 .. 30_000) {
                print $fh "Thread $th\n"
            }
        } );
    } (1 .. 500);
    $_->join foreach @threads;
    close $fh;
Sample output snippet:
Thread 3Thread 349 Threead 85 Thread Thread 333 Thre59 ad 349 Thread 3ad 333 Thread 3Thread 359 Thre49 33 ad 295 ThreThread 333 Thread 8Thread 338 ThreThread 350
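For comparison, here is a minimal sketch of one possible "protective measure": serialize the writes with a lock and flush after every print, so that each line leaves the process in one piece while the lock is held. The variable name $print_lock and the smaller thread and line counts are my own choices for illustration, and this assumes a perl built with ithreads; treat it as one way to do it, not the only one.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use IO::Handle;

my $path = 'outfile';
open my $fh, '>', $path or die "Cannot open $path: $!";

# Hypothetical name: this shared variable exists only to be locked.
my $print_lock :shared;

my @threads = map {
    my $id = $_;    # per-thread copy of the thread number
    async {
        # Each thread works on its own clone of the handle, so turn on
        # autoflush for that clone: nothing stays behind in a buffer.
        $fh->autoflush(1);
        for (1 .. 1_000) {
            # The lock is released at the end of each loop iteration.
            lock($print_lock);
            print {$fh} "Thread $id\n";
        }
    };
} 1 .. 20;

$_->join for @threads;
close $fh or die "Cannot close $path: $!";
```

The price is that every line now costs a lock round trip and a write call, which is exactly the kind of overhead and bookkeeping that makes threaded I/O less attractive than it first looks.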
For an application like retrieving a large number of web pages, where waiting for the other side is the major cause of delay (so spreading the work over multiple cores has no significant advantage), the solution of choice is the state machine. Event-based programming may look like a lot to wrap one's head around, but in the end it is easier to understand than threads once you consider all the low-level race conditions and other synchronization issues you have to think through to write threaded code that always works, not just most of the time.
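To make the state machine idea concrete, here is a rough single-threaded sketch using only core modules. Pipes pre-filled with data stand in for slow remote servers (that substitution, and the names %state and %done, are mine), and a select loop reads from whichever handle is ready, keeping per-handle state in a hash. Real HTTP would need connect, request, and parse states on top of this.

```perl
use strict;
use warnings;
use IO::Select;
use IO::Handle;

my %state;                      # per-handle state, keyed by file descriptor
my %done;                       # finished "responses", keyed by name
my $select = IO::Select->new;

# Stand-ins for remote servers: pipes that already contain their data.
for my $name (qw(alpha beta gamma)) {
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    print {$writer} "payload from $name\n" x 3;
    close $writer or die "close failed: $!";   # EOF marks "response complete"
    $reader->blocking(0);
    $select->add($reader);
    $state{ fileno $reader } = { name => $name, buffer => '' };
}

# The event loop: wait until something is readable, advance that handle's state.
while ($select->count) {
    for my $fh ($select->can_read) {
        my $st  = $state{ fileno $fh };
        # Read a small chunk and append it to this handle's buffer.
        my $got = sysread($fh, $st->{buffer}, 16, length $st->{buffer});
        if (!defined $got) {
            next if $!{EAGAIN} || $!{EWOULDBLOCK};   # nothing there yet
            die "read error on $st->{name}: $!";
        }
        if ($got == 0) {                             # EOF: this one is finished
            $done{ $st->{name} } = $st->{buffer};
            $select->remove($fh);
            close $fh;
        }
    }
}

printf "%s: %d bytes\n", $_, length $done{$_} for sort keys %done;
```

Only one piece of code runs at any time, so there is nothing to lock: the buffers cannot be touched by anyone else between two statements.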
Regarding modules to facilitate the implementation of said state machine, one I found easy to use (actually the only one I've ever used in production code) is POE::Component::Client::HTTP. (Edited: I first wrote POE::Component::Client::UserAgent; it's been a while, but that name didn't sound quite right.) POE is rather heavyweight, though (not that it mattered much here), so AnyEvent::Curl::Multi might be worth a look too.
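For orientation, a fetch loop with POE::Component::Client::HTTP looks roughly like the sketch below, following the pattern from the module's synopsis. The URLs, the 'ua' alias, the timeout, and the got_response event name are placeholders I picked; it needs POE, POE::Component::Client::HTTP, and HTTP::Request installed, and I have not tuned it for a large crawl.

```perl
use strict;
use warnings;
use POE qw(Component::Client::HTTP);
use HTTP::Request;

# Hypothetical URL list; replace with the pages you actually need.
my @urls = map { "http://example.com/page$_" } 1 .. 3;

# Spawn the HTTP client component under an alias we can post to.
POE::Component::Client::HTTP->spawn(
    Alias   => 'ua',
    Timeout => 30,      # seconds
);

POE::Session->create(
    inline_states => {
        # Queue all requests up front; none of these calls block.
        _start => sub {
            my $kernel = $_[KERNEL];
            $kernel->post('ua', 'request', 'got_response',
                          HTTP::Request->new(GET => $_)) for @urls;
        },
        # Called once per finished request, in whatever order they complete.
        got_response => sub {
            my ($request_packet, $response_packet) = @_[ARG0, ARG1];
            my $request  = $request_packet->[0];    # HTTP::Request
            my $response = $response_packet->[0];   # HTTP::Response
            printf "%s -> %s\n", $request->uri, $response->status_line;
        },
    },
);

POE::Kernel->run;
```

All handlers run in one thread, one at a time, so a plain print in got_response cannot be interleaved with another handler's output.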