in reply to quickness is not so obvious
If I write some code and find that it runs much slower than I expected, I first ask myself "What the heck did I do wrong?". I then look at the code to see if I did something dumb.
If the code and algorithm look OK, I normally then check my assumptions and write a simple program to "look at" the data without doing any processing, just to see what the fastest possible time could be. (So in this case, I'd have it simply open and read the 10,000 files, but not really look at the contents or do any writing.) Perhaps getting the data really does take that long. (It may be trawling through a remote filesystem with significant network latency, or perhaps some of the data is archived on a tape system and it takes time for the robot to swap tapes to make the data available, etc.)
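A minimal sketch of such a read-only baseline might look like the following. The sub name read_only_pass and the idea of passing the file names on the command line are just assumptions for illustration; adapt them to however your real program finds its input.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Time::HiRes qw( gettimeofday tv_interval );

# Read every file completely, but do nothing with the contents.
# Returns (number of files read, total bytes read).
sub read_only_pass {
    my @files = @_;
    my ($count, $bytes) = (0, 0);
    for my $file (@files) {
        my $fh;
        unless (open $fh, '<', $file) {
            warn "Skipping $file: $!\n";
            next;
        }
        binmode $fh;
        # Pull the data in 64 KB chunks and throw it away
        while (read($fh, my $buf, 65536)) {
            $bytes += length $buf;
        }
        close $fh;
        $count++;
    }
    return ($count, $bytes);
}

# Hypothetical usage: perl baseline.pl file1 file2 ...
if (@ARGV) {
    my $start = [gettimeofday];
    my ($count, $bytes) = read_only_pass(@ARGV);
    my $elapsed = tv_interval($start, [gettimeofday]);
    print "Read $count files, $bytes bytes in $elapsed sec\n";
}
```

If this takes nearly as long as the real program, the time is going into fetching the data rather than into your processing. Keep in mind that a second run may be faster simply because the OS has cached the files.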
So if there's little time difference between simply reading the data and doing the full processing, I'll ask the operations team about improving performance. If the difference is *huge*, then it's time to profile the code and find out where it's spending all its time. Sometimes you'll find yourself doing something silly (oops! I'm opening and scanning the contents of file X inside of a loop where I'm processing file Y line-by-line). Sometimes you might find a regex that's performing poorly for some odd data. Or you may find that the algorithm you used is even less performant than expected, and you need to use/invent a better one.
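To illustrate the "file X inside the file Y loop" case, here is one way the fix could look. This is only a sketch under an assumption of mine: that the job is to find lines of Y that also appear in X. The sub names load_lookup and matching_lines are hypothetical.

```perl
use strict;
use warnings;

# Read file X exactly once and remember its lines in a hash.
sub load_lookup {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    my %seen;
    while (my $line = <$fh>) {
        chomp $line;
        $seen{$line} = 1;
    }
    close $fh;
    return \%seen;
}

# Return the lines of file Y that also appear in file X.
sub matching_lines {
    my ($x_path, $y_path) = @_;
    my $lookup = load_lookup($x_path);    # one pass over X, outside the loop
    open my $fh, '<', $y_path or die "Can't open $y_path: $!";
    my @hits;
    while (my $line = <$fh>) {
        chomp $line;
        # A hash lookup per line of Y instead of a rescan of X
        push @hits, $line if $lookup->{$line};
    }
    close $fh;
    return @hits;
}
```

The trade-off is memory: the hash holds all of X, which is fine for modest files but may need a different approach for very large ones.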
As an example of the latter, I was surprised a few years ago when I found that the recursive method for computing Fibonacci numbers was as slow as it is:
#!/usr/bin/env perl
use strict;
use warnings;
use Time::HiRes qw( gettimeofday tv_interval );

for my $i (1 .. 100) {
    my $start = [gettimeofday];
    my $fibonacci_number = fibo($i);
    my $interval = tv_interval( $start, [gettimeofday]);
    print "fib($i)=$fibonacci_number ($interval sec)\n";
}

sub fibo {
    my $num = shift;
    return 1 if $num<3;
    return fibo($num-1) + fibo($num-2);
}
I knew that this method wasn't terribly efficient, but I was still amazed at how slow it actually was:
$ perl fibo.pl
fib(1)=1 (6e-06 sec)
fib(2)=1 (4e-06 sec)
fib(3)=2 (7e-06 sec)
fib(4)=3 (6e-06 sec)
fib(5)=5 (6e-06 sec)
fib(6)=8 (1e-05 sec)
fib(7)=13 (1.6e-05 sec)
fib(8)=21 (2.4e-05 sec)
fib(9)=34 (3.5e-05 sec)
fib(10)=55 (5.6e-05 sec)
fib(11)=89 (8.9e-05 sec)
fib(12)=144 (0.000145 sec)
fib(13)=233 (0.000291 sec)
fib(14)=377 (0.000379 sec)
fib(15)=610 (0.000609 sec)
fib(16)=987 (0.000981 sec)
fib(17)=1597 (0.001583 sec)
fib(18)=2584 (0.002644 sec)
fib(19)=4181 (0.00414 sec)
fib(20)=6765 (0.005748 sec)
fib(21)=10946 (0.004466 sec)
fib(22)=17711 (0.007214 sec)
fib(23)=28657 (0.011702 sec)
fib(24)=46368 (0.018114 sec)
fib(25)=75025 (0.027751 sec)
fib(26)=121393 (0.045355 sec)
fib(27)=196418 (0.072849 sec)
fib(28)=317811 (0.117512 sec)
fib(29)=514229 (0.193195 sec)
fib(30)=832040 (0.307089 sec)
fib(31)=1346269 (0.505429 sec)
fib(32)=2178309 (0.796903 sec)
fib(33)=3524578 (1.312573 sec)
fib(34)=5702887 (2.093858 sec)
fib(35)=9227465 (3.422112 sec)
fib(36)=14930352 (5.594139 sec)
fib(37)=24157817 (8.969398 sec)
fib(38)=39088169 (14.46981 sec)
fib(39)=63245986 (23.423682 sec)
fib(40)=102334155 (38.201329 sec)
fib(41)=165580141 (62.30559 sec)
^C
Yeah, like I'm going to sit through it computing the first 100 numbers that way. That's when I looked up Memoize and started using it. Which, by the way, is a very nice way to get a performance boost for some programs with little modification. I simply added the following two lines to the top of the program (just after the other 'use' statements):
use Memoize;
memoize('fibo');
To get much better performance:
$ perl fibo.pl
fib(1)=1 (8e-06 sec)
fib(2)=1 (5e-06 sec)
fib(3)=2 (1.2e-05 sec)
fib(4)=3 (7e-06 sec)
fib(5)=5 (6e-06 sec)
fib(6)=8 (6e-06 sec)
fib(7)=13 (6e-06 sec)
fib(8)=21 (6e-06 sec)
fib(9)=34 (5e-06 sec)
fib(10)=55 (6e-06 sec)
. . . <snip> . . .
fib(98)=1.35301852344707e+20 (6e-06 sec)
fib(99)=2.18922995834555e+20 (6e-06 sec)
fib(100)=3.54224848179262e+20 (6e-06 sec)
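If you're curious what caching buys you here, the same idea can be approximated by hand with a hash. This is a simplified sketch, not how the Memoize module is actually implemented; the name fibo_cached and the %cache hash are mine.

```perl
use strict;
use warnings;

my %cache;    # argument => previously computed result

sub fibo_cached {
    my $num = shift;
    return 1 if $num < 3;
    # Compute each value only once; later calls are just a hash lookup.
    $cache{$num} //= fibo_cached($num - 1) + fibo_cached($num - 2);
    return $cache{$num};
}

print "fib($_)=", fibo_cached($_), "\n" for (10, 41, 50);
```

Each value is now computed once, so the work grows linearly with the argument instead of exponentially. Note that the last few results in the output are printed in floating-point notation because they no longer fit in a native 64-bit integer; caching doesn't change that.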
Problem solved. Yeah, when every call branches into two more calls at each level of a deep binary tree, it's gonna get slow... *very* *very* *slow*. Lesson learned, moving on to the next thing...
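To put a number on "slow", you can count how many times the un-memoized function gets called. The following is a sketch with a hypothetical counter ($calls) bolted onto a copy of fibo; the names fibo_counted and count_calls are made up for this example.

```perl
use strict;
use warnings;

my $calls = 0;    # how many times fibo_counted has been entered

sub fibo_counted {
    my $num = shift;
    $calls++;
    return 1 if $num < 3;
    return fibo_counted($num - 1) + fibo_counted($num - 2);
}

# Returns (fib(n), number of calls it took to compute it)
sub count_calls {
    my $n = shift;
    $calls = 0;
    my $value = fibo_counted($n);
    return ($value, $calls);
}

for my $n (10, 20, 25) {
    my ($value, $count) = count_calls($n);
    print "fib($n)=$value took $count calls\n";
}
```

With this base case the call count works out to 2*fib(n) - 1, so it grows just as fast as the Fibonacci numbers themselves. That matches the timings above, where each step takes roughly 1.6 times as long as the one before.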
...roboticus
When your only tool is a hammer, all problems look like your thumb.