PerlMonks  

Re^4: Threads From Hell #2: How To Parse A Very Huge File

by BrowserUk (Patriarch)
on May 25, 2015 at 01:00 UTC


in reply to Re^3: Threads From Hell #2: How To Parse A Very Huge File
in thread Threads From Hell #2: How To Search A Very Huge File [SOLVED]

For the grep example 60+60+42=162 which is 2m42s. But user+sys (20+5) is 0m25s. What do i miss?

My interpretation of those numbers is that the difference of 137 seconds (162 - 25) is elapsed time during which the processor is doing nothing for this process, because the process is sitting in the scheduler queue in an IO wait state, waiting for the disk.
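For anyone who wants to redo that arithmetic, here is a minimal sketch in Python. The helper names are made up for illustration; the figures are the ones quoted above (real 2m42s, user+sys roughly 20s + 5s), and the subtraction only means "time off the CPU" for a process running on a single core.

```python
# Sketch: how much of the elapsed time was spent off the CPU?
# Figures are the ones quoted in the post: real 2m42s, user ~20s, sys ~5s.

def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair, as printed by `time`, to seconds."""
    return minutes * 60 + seconds

def off_cpu_seconds(real, user, sys_):
    """Elapsed time not accounted for by CPU time (single-core case only)."""
    return real - (user + sys_)

real = to_seconds(2, 42)   # 60 + 60 + 42 = 162
user = 20
sys_ = 5
waiting = off_cpu_seconds(real, user, sys_)

print(f"real={real}s cpu={user + sys_}s off-cpu={waiting}s "
      f"({waiting / real:.0%} of elapsed)")
```

With these inputs the process was off the CPU for 137 of 162 seconds, roughly 85% of the elapsed time.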

That's when the opportunities for saving through multiprocessing simply don't exist. (As I detailed in my first reply above.)

It's also where marioroy's examples defy my analysis, because his system has the fastest IO rate I've ever seen. When my customer was planning to install PCIe SSDs in his server farm -- which seems like last week, but looking back was over 20 months ago -- the fastest commodity-priced cards available (he needed lots of them) were Fusion-IO ioXtreme Pro 4-lane cards, which were capable of something like 2.1Gbits/s (remember to divide by at least 8 for GBytes/s). I guess that (somewhat) justifies the premium prices you pay for Apple hardware.

Interpreting those real/user/sys numbers gets further complicated when the elapsed time is less than the combined user+sys time. That comes about when multiple cores are processing concurrently, so the process racks up as much as 4 (the number of cores) seconds of cpu for every one second of elapsed time.

Then the waters get really muddy when the IO waits on 4 cores start to balance out the 4 seconds/second of cpu accumulated while there is processing to be done. You end up with numbers that make it look like your sometimes-IO-bound/sometimes-CPU-bound process is doing 1 for 1, cpu to real seconds, when it actually requires the use of 4 cores to achieve it.
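A small sketch of why the ratio alone can mislead. Both runs below are hypothetical (the timings are invented for illustration, not measured), and `cpu_per_real` is just a name chosen for this example.

```python
def cpu_per_real(real, user, sys_):
    """CPU seconds accumulated per elapsed second, from `time`-style figures."""
    return (user + sys_) / real

# Hypothetical run A: 4 cores busy for the whole 10 s of elapsed time.
a = cpu_per_real(real=10.0, user=38.0, sys_=2.0)

# Hypothetical run B: 4 cores again, but each one is busy only a quarter
# of the time (the rest is IO wait), over 40 s of elapsed time.
cores = 4
busy_fraction = 0.25
elapsed = 40.0
cpu_b = cores * busy_fraction * elapsed   # 40 CPU-seconds in total
b = cpu_per_real(real=elapsed, user=cpu_b, sys_=0.0)

print(f"run A: {a:.1f} cpu-seconds per elapsed second")
print(f"run B: {b:.1f} cpu-seconds per elapsed second")
```

Run B comes out at 1 for 1 and looks like a single busy core, even though the work was spread across 4 cores that each spent most of their time waiting.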

Do you remember when I said a few days ago that it was very hard to draw generic conclusions about multi-threading ... :)


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

Re^5: Threads From Hell #2: How To Parse A Very Huge File
by karlgoethebier (Abbot) on May 26, 2015 at 10:35 UTC
    "...the elapsed time when the processor is doing nothing..."

    Yes. I asked the oracle:

    "The term "real time" in this context refers to elapsed "wall clock" time, like using a stop watch. The total CPU time (user time + sys time) may be more or less than that value. Because a program may spend some time waiting and not executing at all (whether in user mode or system mode) the real time may be greater than the total CPU time. Because a program may fork children whose CPU times (both user and sys) are added to the values reported by the time command, but on a multicore system these tasks are run in parallel, the total CPU time may be greater than the real time."
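The effect described in that quote can be observed directly. The following is only a sketch, assuming a Unix-like system and Python 3; the busy-loop workload and the function name `time_children` are invented for illustration. It starts some CPU-bound child processes in parallel and compares wall-clock time with the children's accumulated user+sys time.

```python
import os
import subprocess
import sys
import time

# Hypothetical CPU-bound child: a short busy loop in a fresh interpreter.
BUSY = "sum(i * i for i in range(2_000_000))"

def time_children(n):
    """Run n busy children in parallel; return (real, cpu) in seconds.

    cpu is the user+sys time of the children only, which os.times()
    reports once they have been waited for (Unix; on Windows those
    fields are always 0).
    """
    before = os.times()
    start = time.perf_counter()
    procs = [subprocess.Popen([sys.executable, "-c", BUSY]) for _ in range(n)]
    for p in procs:
        p.wait()
    real = time.perf_counter() - start
    after = os.times()
    cpu = ((after.children_user - before.children_user)
           + (after.children_system - before.children_system))
    return real, cpu

if __name__ == "__main__":
    for n in (1, 4):
        real, cpu = time_children(n)
        print(f"{n} child(ren): real={real:.2f}s cpu={cpu:.2f}s "
              f"cpu/real={cpu / real:.2f}")
```

On a machine with several idle cores the 4-child run would be expected to show a cpu/real ratio above 1; the exact figures depend on the hardware and on whatever else is running.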
    "...justifies the premium prices you pay for Apple hardware."

    Did you already order your first Mac ;-)

    "...Do you remember when I said a few days ago..."

    Indeed.

    Best regards, Karl

    Edit: Fixed typo.

    P.S.: I'm thinking already about episode #3 of "Threads From Hell".

    «The Crux of the Biscuit is the Apostrophe»

      Did you already order your first Mac

      I did say "somewhat".

      And no. There is nothing in this world that would persuade me to buy an unrepairable, unexpandable, fashion accessory computer with a 100%+ idiot tax.

      P.S.: I'm thinking already about episode #4 of "Threads From Hell".

      Did I miss the 3rd installment?


