in reply to Re: How to split file for threading?
in thread How to split file for threading?

I seriously doubt that threading would do anything positive for you.

That's the trouble with not having the skills or capability to do your own research; you're always left regurgitating someone else's knowledge, even when it is out-of-date.

Here are the results of multi-threading a search of an 8GB file on a 4-core machine:

[20:39:41.83] C:\test>1131634.pl -T=1 s:\1GBx8.bin lAxc
Found 10 'lAxc' lines Took 207.819706917 secs cpu(162.296 15.906 0 0)
[20:43:09.82] C:\test>1131634.pl -T=2 s:\1GBx8.bin lAxc
Found 10 'lAxc' lines Took 169.678593159 secs cpu(164.203 18.968 0 0)
[20:46:00.25] C:\test>1131634.pl -T=3 s:\1GBx8.bin lAxc
Found 10 'lAxc' lines Took 156.497795820 secs cpu(164.375 18.656 0 0)
[20:48:37.34] C:\test>1131634.pl -T=4 s:\1GBx8.bin lAxc
Found 10 'lAxc' lines Took 127.086592913 secs cpu(161.843 19.546 0 0)
[20:50:45.04] C:\test>1131634.pl -T=5 s:\1GBx8.bin lAxc
Found 10 'lAxc' lines Took 115.240671158 secs cpu(161.89 19.109 0 0)
[20:52:40.87] C:\test>1131634.pl -T=6 s:\1GBx8.bin lAxc
Found 10 'lAxc' lines Took 129.716307163 secs cpu(161.781 21.859 0 0)
[20:54:51.19] C:\test>1131634.pl -T=7 s:\1GBx8.bin lAxc
Found 10 'lAxc' lines Took 167.007865906 secs cpu(162.328 22.171 0 0)
[20:57:38.78] C:\test>1131634.pl -T=8 s:\1GBx8.bin lAxc
Found 10 'lAxc' lines Took 179.142831087 secs cpu(164.171 25.546 0 0)

[Hand-drawn ASCII chart: elapsed seconds (left axis, 10 to 210) and CPU seconds (right axis, 161.2 to 165.0) against THREADS (1 to 8)]

With 5 threads, it almost halves the time and even reduces the overall CPU usage slightly.
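
For anyone who wants to check that against the raw figures, here is a minimal sketch of the arithmetic, using only the numbers from the -T=1 and -T=5 runs. It assumes the first two cpu(...) fields are user and system seconds in the order Perl's times() returns them; the post does not actually say so, and the variable names are purely illustrative.

```perl
use strict;
use warnings;

# Figures copied from the -T=1 and -T=5 runs above.
# ASSUMPTION: the first two cpu(...) fields are user and system seconds,
# in the order Perl's times() returns them; the post does not say.
my %run = (
    1 => { elapsed => 207.819706917, cpu => [ 162.296, 15.906 ] },
    5 => { elapsed => 115.240671158, cpu => [ 161.89,  19.109 ] },
);

# Speedup: single-thread elapsed time divided by the 5-thread elapsed time.
my $speedup = $run{1}{elapsed} / $run{5}{elapsed};

# Fraction of wall-clock time saved relative to one thread.
my $saved = 1 - $run{5}{elapsed} / $run{1}{elapsed};

# Change in each cpu() figure (negative means fewer seconds were used).
my $cpu1_delta = $run{5}{cpu}[0] - $run{1}{cpu}[0];
my $cpu2_delta = $run{5}{cpu}[1] - $run{1}{cpu}[1];

printf "speedup:            %.2fx\n",      $speedup;
printf "elapsed time saved: %.1f%%\n",     100 * $saved;
printf "first cpu() field:  %+.3f secs\n", $cpu1_delta;
printf "second cpu() field: %+.3f secs\n", $cpu2_delta;
```

The first cpu() field is the one plotted on the chart's right-hand axis.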

The only question left is why do you bother to continue to play the Village Idiot™, and I think finally the answer to that conundrum is becoming self-evident.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!

Re^3: How to split file for threading?
by graff (Chancellor) on Jul 04, 2015 at 13:14 UTC
    I'll admit to being a bit of a dinosaur myself on matters of multi-core machines, but even without understanding how such machines actually work, I always love a good benchmark test.

    Would you be able to share some more detail about how that chart was produced?

      Would you be able to share some more detail about how that chart was produced?

      By hand. I wanted to convey my point :)

      It's a style of hand-made chart I've used since I was at college, when "graphics" meant ascii-art.

      I guess it would be an interesting exercise to write a program to generate them; but with the other options available I doubt it would see much use.
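
      As a rough idea of what such a generator might look like, here is a minimal sketch. It draws a single series only (elapsed seconds, rounded from the runs quoted in this thread), not the two-axis layout of the hand-made chart. The sub name ascii_chart and the 20-second band height are made up for illustration.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Elapsed seconds for -T=1 .. -T=8, rounded from the runs quoted in this thread.
my @elapsed = (208, 170, 156, 127, 115, 130, 167, 179);

# Hypothetical helper: one 5-character column per value,
# one text row per $step units on the y-axis.
sub ascii_chart {
    my ($values, $step) = @_;

    # Round the largest value up to a multiple of $step for the top row.
    my $top = $step * int( (max(@$values) + $step - 1) / $step );

    my @lines;
    for (my $level = $top; $level >= $step; $level -= $step) {
        my $row = sprintf '%4d ', $level;
        for my $v (@$values) {
            # Fill the cell when the value reaches into this band.
            $row .= $v > $level - $step ? ' ### ' : '     ';
        }
        push @lines, $row;
    }

    # X-axis labels: 1-based column numbers (thread counts here).
    push @lines, '     ' . join('', map { sprintf(' %2d  ', $_) } 1 .. @$values);
    return join "\n", @lines;
}

print ascii_chart(\@elapsed, 20), "\n";
```

      Adding a second series with its own right-hand axis would mean scaling each series separately and merging the two into the same cells, which is left out here.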


        The layout, and the patience needed to render it via the keyboard, are impressive. I presume, though, that the numeric values came from a reasonably well-built benchmarking script; if you could share (at least an outline of) that, it would be very helpful. (I think so because, given my current ignorance about multi-core environments, I assume that different conditions may yield different points at which adding more threads degrades performance.)

        As for writing a program to produce that sort of chart (two data sets with distinct y-axes but a common x-axis), I expect that's already been done at least a few times (and I suppose most people would just load the data into MS-Excel to draw it in any number of different styles).

        UPDATE: Sorry... I see that you put some command lines above the chart, along with their numeric outputs, and I just now took the time to relate those outputs to the chart. So, just to clarify (because my brain isn't working all that well today)... are those command lines just running (a slightly modified version of) the OP script? Thanks.