in reply to Faster push and shift

Try this. It should complete much more quickly unless all the required records are at the beginning of the file:

#! perl -slw
use strict;
use Data::Dump qw[ pp ];
use Time::HiRes qw[ time ];
use File::ReadBackwards;

# Read the file from the end, so the last qualifying records are seen first.
tie *FH, 'File::ReadBackwards', $ARGV[0] or die $!;

my $start = time;
my( $last, @a, @b );
while( <FH> ) {
    /\d+\t(\d+)/;
    if( $1 > 3 ) {
        unshift @b, $1;
        unshift @a, $last;
        last if @b == 7;    # stop as soon as the last 7 matches are in hand
    }
    $last = $1;
}

print time() - $start;
print join ' ', times;
pp \@a, \@b;

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Re^2: Faster push and shift
by rovf (Priest) on Feb 16, 2012 at 14:05 UTC
    I don't think your solution is equivalent to the original program. Your idea is to stop reading the file as soon as @b "has enough elements", i.e. you are interested in the first set of occurrences of "suitable elements" in the file.

    The OP, however, throws out elements from @b (he is basically treating @b as a queue, pushing onto one end and shifting off the other), so he is interested in the last set of occurrences. That's why he always has to process the whole file.
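
    A minimal sketch of the queue-style approach being described here, assuming the same two-column, tab-separated input as the code above; the threshold of 3 and the window of 7 mirror that code, and the loop itself is illustrative rather than the OP's actual program:

    #! perl -slw
    use strict;

    # Forward scan: keep only the most recent 7 qualifying values by
    # pushing onto one end of @b and shifting off the other once full.
    my @b;
    while( <> ) {
        /\d+\t(\d+)/ or next;
        if( $1 > 3 ) {
            push @b, $1;
            shift @b if @b > 7;   # discard the oldest match
        }
    }
    # Only after the whole file has been read does @b hold the last 7 matches.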

    -- 
    Ronald Fischer <ynnor@mm.st>
      That's why use File::ReadBackwards;

        Oops, I skipped this part. Great!!

        -- 
        Ronald Fischer <ynnor@mm.st>
Re^2: Faster push and shift
by sundialsvc4 (Abbot) on Feb 16, 2012 at 13:57 UTC

    What’s the difference here, BrowserUK? (Really, that’s quite the sincere question.) Kindly enlighten us all: What is the key change from the OP that will make it faster, and how much faster do you predict it now will be? ... And under what governing assumptions?

      how much faster do you predict it now will be?

      Doing it the OP's way on 1 million records takes 3.3 seconds:

      c:\test>junk91 junk.dat
      3.25699996948242
      ([1, 3, 4, 4, 1, 3, 2], [4, 4, 4, 4, 4, 4, 4])

      The same file, done my way, produces the same results in half a millisecond:

      c:\test>junk91-2 junk.dat
      0.000479936599731445
      ([1, 3, 4, 4, 1, 3, 2], [4, 4, 4, 4, 4, 4, 4])

      I make that 6,600 times faster. The OP's mileage may vary.

