in reply to Re: Faster push and shift
in thread Faster push and shift

Actually, each of those active lines costs substantially. This is just 1M records matching the OP's data as simply as possible:

    c:\test>junk91 junk.dat     Bare loop ## baseline
    0.322973012924194
    0.358 0.078 0 0

    c:\test>junk91 junk.dat     ## +400% Add back: regex;
    1.40799999237061
    1.466 0.046 0 0

    c:\test>junk91 junk.dat     ## +80% Add back: regex; first push;
    1.67599987983704
    1.731 0.046 0 0

    c:\test>junk91 junk.dat     ## +400% Add back: regex; first push; second push;
    2.94299983978271
    2.932 0.124 0 0

    c:\test>junk91 junk.dat     ## +150% Add back: regex; first push; second push; shifts
    3.2759997844696
    3.369 0.015 0 0
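For anyone wanting to repeat this kind of measurement, here is a minimal sketch of the "add back one operation at a time" approach. It is not the junk91 script, which isn't shown in this thread; the record layout, the regex, the window size of 10 and the `timed` helper are all assumptions made up for illustration, and it works on in-memory records rather than a file.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical record layout; the OP's real data format is not shown here.
my $N       = 100_000;
my @records = map { "id$_ " . ( $_ * 7 ) . " " . ( $_ * 13 ) } 1 .. $N;

# Hypothetical helper: run a code block once, return elapsed wall-clock seconds.
sub timed {
    my ($code) = @_;
    my $start = time;
    $code->();
    return time - $start;
}

my %elapsed;

# Baseline: the bare loop with nothing in it.
$elapsed{bare} = timed( sub {
    for my $rec (@records) { }
} );

# Add back: regex.
$elapsed{regex} = timed( sub {
    for my $rec (@records) {
        my ( $id, $x, $y ) = $rec =~ /^(\S+)\s+(\d+)\s+(\d+)/;
    }
} );

# Add back: regex; push.
$elapsed{regex_push} = timed( sub {
    my @window;
    for my $rec (@records) {
        my ( $id, $x, $y ) = $rec =~ /^(\S+)\s+(\d+)\s+(\d+)/;
        push @window, $x;
    }
} );

# Add back: regex; push; shift (window capped at an assumed 10 elements).
$elapsed{regex_push_shift} = timed( sub {
    my @window;
    for my $rec (@records) {
        my ( $id, $x, $y ) = $rec =~ /^(\S+)\s+(\d+)\s+(\d+)/;
        push @window, $x;
        shift @window if @window > 10;
    }
} );

printf "%-18s %.3fs\n", $_, $elapsed{$_}
    for qw(bare regex regex_push regex_push_shift);
```

Single runs like this are noisy, so treat the differences between steps as rough indications and repeat the runs before drawing conclusions.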

The explanation is that no matter how little time something takes, if you do it a million times, it adds up.

In the OP's case, where he must be processing somewhere in the region of 2 or 3 billion records, it adds up big.
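To put a rough number on "adds up big", the per-million timings can be scaled to 2 and 3 billion records. This is only a back-of-envelope sketch: it assumes the cost stays linear in the record count, which real runs at that size may not honour, and `project_seconds` is a name made up for the example.

```perl
use strict;
use warnings;

# Project a per-million-records timing onto a larger record count,
# assuming (hypothetically) that cost stays linear in the number of records.
sub project_seconds {
    my ( $secs_per_million, $records ) = @_;
    return $secs_per_million * $records / 1_000_000;
}

# Timings copied from the runs above: bare loop and full loop, per 1M records.
my %per_million = (
    'bare loop' => 0.322973012924194,
    'full loop' => 3.2759997844696,
);

for my $label ( sort keys %per_million ) {
    for my $records ( 2e9, 3e9 ) {
        my $secs = project_seconds( $per_million{$label}, $records );
        printf "%-9s %.0e records: %6.0f s (%.2f h)\n",
            $label, $records, $secs, $secs / 3600;
    }
}
```

On those figures the full loop projects to roughly 1.8 hours for 2 billion records and 2.7 hours for 3 billion, against about 11 and 16 minutes for the bare loop.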


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Re^3: Faster push and shift
by rovf (Priest) on Feb 16, 2012 at 12:29 UTC
    Indeed ...

    And I find it interesting that reading the data does not take that much time compared to the other operations. I wouldn't have expected this, even if we take buffering into account.

    -- 
    Ronald Fischer <ynnor@mm.st>

      Quite obviously, BrowserUK very routinely processes gigantic datasets during the course of his work day. He is quite the expert on those (what are to many of us...) edge cases. Upvoted.