 
PerlMonks  

Re^3: Threads From Hell #2: How To Parse A Very Huge File

by karlgoethebier (Abbot)
on May 25, 2015 at 00:07 UTC ( [id://1127629]=note )


in reply to Re^2: Threads From Hell #2: How To Parse A Very Huge File
in thread Threads From Hell #2: How To Search A Very Huge File [SOLVED]

"...User code time... + system code time... = real time..."

Yes:

From the docs:

"The time utility executes and times the specified utility. After the utility finishes, time writes to the standard error stream, (in seconds): the total time elapsed, the time used to execute the utility process and the time consumed by system overhead."

Some observations:

karls-mac-mini:monks karl$ ls -hl very_huge10GB.file
-rw-r--r-- 1 karl karl 10G 25 Mai 00:53 very_huge10GB.file
karls-mac-mini:monks karl$ time grep karl very_huge10GB.file
nose cuke karl
nose cuke karl
nose cuke karl
nose cuke karl
nose cuke karl

real 2m42.126s
user 0m20.437s
sys 0m5.645s

 

karls-mac-mini:monks karl$ ./mce_loop.pl
nose cuke karl
nose cuke karl
nose cuke karl
nose cuke karl
nose cuke karl
Took 150.555 seconds

 

#!/usr/bin/env perl
use Time::HiRes qw( time );
use feature qw(say);
my $start = time;
say qx (grep karl very_huge10GB.file);
printf "Took %.3f seconds\n", time - $start;
__END__
karls-mac-mini:monks karl$ ./wrap.pl
nose cuke karl
nose cuke karl
nose cuke karl
nose cuke karl
nose cuke karl
Took 157.265 seconds
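
The wrapper above reports only wall-clock time. A minimal sketch of a variant that also reports the CPU time charged to the child process, using Perl's built-in times, might look like this. The helper name time_command and the sleeping stand-in workload are assumptions for illustration and not part of the original post; substitute the grep command to repeat the measurement.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Time::HiRes qw( time );

# Run a command and return its wall-clock time plus the CPU time
# charged to it. time_command is a hypothetical helper name.
sub time_command {
    my @cmd    = @_;
    my @before = times;    # (user, sys, child user, child sys)
    my $start  = time;
    system(@cmd);          # blocks until the child has been reaped
    my $real   = time - $start;
    my @after  = times;
    my $user   = $after[2] - $before[2];
    my $sys    = $after[3] - $before[3];
    return {
        real    => $real,
        user    => $user,
        sys     => $sys,
        off_cpu => $real - $user - $sys,    # elapsed time not spent computing
    };
}

# Stand-in workload that only waits; replace the arguments with
# qw( grep karl very_huge10GB.file ) for the real measurement.
my $t = time_command( $^X, '-e', 'sleep 1' );
printf "real %.3fs  user %.3fs  sys %.3fs  off-CPU %.3fs\n",
    @{$t}{qw( real user sys off_cpu )};
```

For a single-process, IO-bound command the off-CPU figure approximates the time spent waiting. It can turn negative when the command keeps several cores busy at once.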

For the grep example 60+60+42=162, which is 2m42s. But user+sys (20+5) is 0m25s. What am I missing?

Perhaps it's too late tonight. Or too early in the morning?

Best regards, Karl

«The Crux of the Biscuit is the Apostrophe»

Replies are listed 'Best First'.
Re^4: Threads From Hell #2: How To Parse A Very Huge File
by marioroy (Prior) on May 25, 2015 at 00:28 UTC

    For the grep example 60+60+42=162, which is 2m42s. But user+sys (20+5) is 0m25s. What am I missing?

    The wait time is not counted in the user and sys figures of the time output. This demonstration is IO bound: the CPUs sit idle while waiting for IO.
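
The arithmetic behind this can be sketched directly from the three figures karl posted. The helper to_seconds is a hypothetical name, and the sketch assumes the "NmS.SSSs" format shown in the grep run above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Convert a time(1)-style figure such as "2m42.126s" into seconds.
# to_seconds is a hypothetical helper, not part of any module.
sub to_seconds {
    my ($figure) = @_;
    my ( $min, $sec ) = $figure =~ /^(\d+)m([\d.]+)s$/
        or die "unexpected format: $figure";
    return $min * 60 + $sec;
}

# The figures from the grep run in the post.
my %posted  = ( real => '2m42.126s', user => '0m20.437s', sys => '0m5.645s' );
my %seconds = map { $_ => to_seconds( $posted{$_} ) } keys %posted;

my $cpu     = $seconds{user} + $seconds{sys};    # time actually spent computing
my $off_cpu = $seconds{real} - $cpu;             # elapsed time spent waiting

printf "elapsed  %.3f s\n", $seconds{real};
printf "on CPU   %.3f s\n", $cpu;
printf "waiting  %.3f s (%.0f%% of elapsed)\n",
    $off_cpu, 100 * $off_cpu / $seconds{real};
```

With the unrounded figures the gap comes to roughly 136 seconds, slightly less than the 137 obtained from the rounded 162 - 25.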

Re^4: Threads From Hell #2: How To Parse A Very Huge File
by BrowserUk (Patriarch) on May 25, 2015 at 01:00 UTC
    For the grep example 60+60+42=162, which is 2m42s. But user+sys (20+5) is 0m25s. What am I missing?

    My interpretation of those numbers is that the difference of 137 seconds is the elapsed time during which the processor is doing nothing (for this process) because the process is in the scheduler queue in an IO wait state, waiting for the disk.

    That's when the opportunities for saving through multiprocessing simply don't exist. (As I detailed in my first reply above.)

    It's also where marioroy's examples defy my analysis, because his system has the fastest IO rate I've ever seen. When my customer was planning to install PCIe SSDs in his server farm -- which seems like last week, but looking back was over 20 months ago -- the fastest commodity-priced cards available (he needed lots of them) were Fusion-IO ioXtreme Pro 4-lane cards, which were capable of something like 2.1 Gbits/s (remember to divide by at least 8 for GBytes/s). I guess that (somewhat) justifies the premium prices you pay for Apple hardware.

    Interpreting those real/user/sys numbers gets further complicated when the elapsed time is less than the combined user+sys time, which comes about when multiple cores are processing concurrently, thus the process is racking up 4 (number of cores) seconds of cpu for every one second of elapsed time.

    Then the waters get really muddy when the IO waits on 4 cores start balancing out the 4 seconds/second of cpu accumulated while there is processing to be done. You end up with numbers that make it look like your sometimes-IO-bound/sometimes-CPU-bound process is doing 1 for 1, cpu to real seconds, when it actually requires the use of 4 cores to achieve that.
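
The case where user+sys exceeds real can be reproduced with a small sketch that forks CPU-bound children and compares their accumulated CPU time with the elapsed time. The names burn_cpu and measure_parallel, the worker count, and the busy loop are all assumptions for illustration; none of this is code from the thread.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Time::HiRes qw( time );

# Spin until this process has used roughly $cpu_seconds of user CPU time.
sub burn_cpu {
    my ($cpu_seconds) = @_;
    my $start = (times)[0];    # user CPU time of this process so far
    my $x     = 0;
    while ( (times)[0] - $start < $cpu_seconds ) {
        $x += sqrt($_) for 1 .. 10_000;
    }
    return $x;
}

# Fork $workers CPU-bound children; return (elapsed seconds, child CPU seconds).
sub measure_parallel {
    my ( $workers, $cpu_seconds ) = @_;
    my @before = times;        # (user, sys, child user, child sys)
    my $start  = time;
    my @pids;
    for ( 1 .. $workers ) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ( $pid == 0 ) {     # child: burn CPU, then leave
            burn_cpu($cpu_seconds);
            exit 0;
        }
        push @pids, $pid;
    }
    waitpid( $_, 0 ) for @pids;    # child times are credited once reaped
    my $real  = time - $start;
    my @after = times;
    my $cpu   = ( $after[2] - $before[2] ) + ( $after[3] - $before[3] );
    return ( $real, $cpu );
}

my ( $real, $cpu ) = measure_parallel( 4, 0.5 );
printf "real %.2fs, child user+sys %.2fs\n", $real, $cpu;
```

On a machine with at least 4 free cores the child CPU total should come out at several times the elapsed time. On a single core the two figures should be roughly equal.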

    Do you remember when I said a few days ago that it was very hard to draw generic conclusions about multi-threading ... :)


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
    In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked
      "...the elapsed time when the processor is doing nothing..."

      Yes. I asked the oracle:

      "The term "real time" in this context refers to elapsed "wall clock" time, like using a stop watch. The total CPU time (user time + sys time) may be more or less than that value. Because a program may spend some time waiting and not executing at all (whether in user mode or system mode) the real time may be greater than the total CPU time. Because a program may fork children whose CPU times (both user and sys) are added to the values reported by the time command, but on a multicore system these tasks are run in parallel, the total CPU time may be greater than the real time."
      "...justifies the premium prices you pay for Apple hardware."

      Did you already order your first Mac ;-)

      "...Do you remember when I said a few days ago..."

      Indeed.

      Best regards, Karl

      Edit: Fixed typo.

      P.S.: I'm thinking already about episode #3 of "Threads From Hell".

      «The Crux of the Biscuit is the Apostrophe»

        Did you already order your first Mac

        I did say "somewhat".

        And no. There is nothing in this world that would persuade me to buy an unrepairable, unexpandable, fashion accessory computer with a 100%+ idiot tax.

        P.S.: I'm thinking already about episode #4 of "Threads From Hell".

        Did I miss the 3rd installment?


