"Unless you've shared @allips, then that array will be copied into the new thread.
That's pure overhead."

That isn't very accurate.

As can be easily seen, the overhead eliminated by using threads::shared isn't actually the lion's share of overhead in copying a large array to a new thread:

 $ alias tperl="time perl -w -Mthreads -Mthreads::shared"

# No data to copy:
 $ tperl -e'threads->create(sub{})->join() for 1..100'
CPU 0.680 secs (CPU 0.256 secs)

# Load the data late so also no data is copied:
 $ tperl -e'BEGIN{threads->create(sub{})->join() for 1..100}my @x=(1..35_000)'
CPU 0.728 secs (CPU 0.268 secs)
 $ tperl -e'BEGIN{threads->create(sub{})->join() for 1..100}my @x;share(@x);@x=(1..35_000)'
CPU 0.840 secs (CPU 0.364 secs)

# Overhead of copying, whether shared() or not:
 $ tperl -e'my @x;share(@x);@x=(1..35_000);threads->create(sub{})->join() for 1..100'
CPU 2.272 secs (CPU 1.316 secs)
 $ tperl -e'my @x=(1..35_000);threads->create(sub{})->join() for 1..100'
CPU 3.304 secs (CPU 3.384 secs)

What threads::shared prevents from being copied to each new thread at thread creation time is the particular data for each element of the array. But you can see that it still includes much of the overhead (between 1/3 and 1/2 in the examples above) because it must copy the essentially-tied array (that uses C-based 'get' and 'set' accessors rather than the Perl accessors of actually-tied variables).

But if you actually expect the threads to make use of the shared data, then you can see the dramatic overhead that threads::shared adds by copying the data to each thread one element at a time, with locking:

 $ tperl -e'my @x=(1..35_000);threads->create(sub{for(@x){my$y=$_}})->join() for 1..100'
CPU 4.324 secs
 $ tperl -e'my @x;share(@x);@x=(1..35_000);threads->create(sub{for(@x){my$y=$_}})->join() for 1..100'
CPU 17.281 secs

"starting a new thread--a few milliseconds at most"

If I didn't know better, I might suspect that this is propaganda. It could certainly mislead somebody. As can be seen above, creating a new iThreads instance can trivially take ten times that long.
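
If you would rather get a per-thread figure than divide the totals above by 100, something like the following rough sketch should do. It assumes a perl built with ithreads; the variable names ($n, $per_thread_ms) are mine, and it measures wall-clock time via Time::HiRes, so the numbers will not line up exactly with the CPU figures above.

```perl
use strict;
use warnings;
use threads;
use Time::HiRes qw(time);

# Unshared data loaded before any thread starts, so every new thread clones it.
my @x = (1 .. 35_000);
my $n = 100;    # same number of create/join cycles as the one-liners

my $start = time;
threads->create(sub { })->join() for 1 .. $n;
my $per_thread_ms = 1000 * (time - $start) / $n;

printf "%.2f ms per create+join with %d elements loaded\n",
    $per_thread_ms, scalar @x;
```

Comment out the @x assignment and run it again to see how much of that per-thread time is just copying data.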

Just for comparison, here is how extra data impacts the performance of fork():

 $ tperl -e'fork && exit for 1..100'
CPU 0.044 secs (CPU 0.020 secs)
 $ tperl -e'my @x=(1..35_000);share(@x);fork && exit for 1..100'
CPU 0.072 secs (CPU 0.072 secs)
 $ tperl -e'my @x=(1..35_000);for(1..100){if(fork){for(@x){my$y=$_};exit}}'
CPU 0.084 secs
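
The fork one-liners above chain processes rather than waiting for children, which is fine for a quick timing but not how you would normally write it. A more conventional loop might look like this sketch. It assumes a Unix-like system with a native fork (on Windows, perl emulates fork using threads, so the comparison would not mean much there), and it reports wall-clock time per fork, not CPU time; $n and $per_fork_ms are my own names.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Data loaded before forking; the child gets it via the OS, with no
# per-element copying done by perl.
my @x = (1 .. 35_000);
my $n = 100;

my $start = time;
for (1 .. $n) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: read every element, as the threaded examples do, then exit.
        for (@x) { my $y = $_ }
        exit 0;
    }
    waitpid($pid, 0);    # parent: reap the child before starting the next
}
my $per_fork_ms = 1000 * (time - $start) / $n;

printf "%.2f ms per fork+wait with %d elements loaded\n",
    $per_fork_ms, scalar @x;
```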

And, as has always happened, every time I touch iThreads, here are some examples of how easy it is to run into stupid things:

# Try to load the data late but mostly fail:
 $ tperl -e'threads->create(sub{})->join() for 1..100;my @x=(1..35_000)'
CPU 2.048 secs (CPU 1.204 secs)

# Note how easy it is to do it wrong and not share data but get the overhead:
 $ tperl -e'my @x=(1..35_000);share(@x);print "($x[0])\n";threads->create(sub{})->join() for 1..100'
Use of uninitialized value in concatenation (.) or string at -e line 1.
()
CPU 3.380 secs (CPU 2.128 secs)
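
The empty "()" in that last example comes from calling share() on an array that already holds data, which discards the contents. Here is a small sketch of the two orderings side by side, assuming a perl built with ithreads; the array names @lost and @kept are made up for illustration.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Wrong order: share() on an already-populated array discards its contents.
my @lost = (1 .. 5);
share(@lost);
print "after late share: ", scalar(@lost), " elements\n";

# Right order: share the empty array first, then fill it.
my @kept;
share(@kept);
@kept = (1 .. 5);
print "after early share: ", scalar(@kept), " elements\n";

# A write made in a worker thread is visible in the parent, which shows
# that the array really is shared rather than copied.
threads->create(sub { $kept[0] = 'changed' })->join();
print "kept[0] is now $kept[0]\n";
```

Sharing first and populating afterwards is the order the working one-liners above use.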

Update: There are a lot of numbers there. A nice summing-up is that sharing the data between multiple instances and actually using it in each instance (in the above example) is 200x (20,000%) slower using iThreads and threads::shared than when using native fork.

- tye

In reply to Re^2: Is Using Threads Slower Than Not Using Threads? (copying) by tye
in thread Is Using Threads Slower Than Not Using Threads? by Dru
