in reply to Re^2: Optimising a search of several thousand files
in thread Optimising a search of several thousand files

Urk! What is that seek doing in there? You already have the line in $data and the start point in $pos. (split /\t/, substr $data, $pos)[0,3] ought to do the job. It may be faster to constrain split to just finding the first 4 elements, but I'd have to benchmark that.
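
Roughly like this; a sketch only, since the file loop, the search string ($name), and the tab-separated field layout (user in field 0, level in field 3) are assumptions based on your earlier post:

    # Sketch: @files and $name are stand-ins for whatever the real
    # script iterates over and searches for.
    FILE:
    for my $file (@files) {
        open my $in, '<', $file or die "$file: $!";
        my $data = do { local $/; <$in> };    # slurp the whole file
        close $in;

        my $pos = index $data, $name;
        next FILE if $pos == -1;

        # No seek or second read needed; split the tail of the
        # already-slurped data directly.
        my ($user, $level) = (split /\t/, substr $data, $pos)[0, 3];
        # ... use $user and $level here
    }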


DWIM is Perl's answer to Gödel

Re^4: Optimising a search of several thousand files
by McDarren (Abbot) on Jan 29, 2007 at 09:33 UTC
    Heh, well I did say that my implementation may have been a bit wonky.

    Anyway, I re-worked it as you suggested, and interestingly it is now significantly slower...

    $ time ./gfather.pl dump.1167332700 McDarren 71
    Processed 9098 files (total files:57912)
           56.07 real          4.10 user          0.97 sys

    $ time ./gfather.pl dump.1167332700 McDarren 71
    Processed 9098 files (total files:57912)
           51.58 real          4.15 user          0.91 sys
    The re-worked section of the code looks like so:
    ...
    undef $/;
    my $data = <IN>;
    my $pos = index($data, 'McDarren');
    $/ = "\n";
    next FILE if $pos == -1;
    # seek(IN, $pos, 0);
    # chomp(my $line = <IN>);
    my ($user, $level) = (split /\t/, substr $data, $pos)[0,3];
    ...
    Adding a limit to the split seems to improve things slightly (without the limit, split has to break up everything from $pos to the end of the slurped data, rather than stopping after the first few fields):
    my ($user, $level) = (split /\t/, (substr $data, $pos), 5)[0,3];

    $ time ./gfather.pl dump.1167332700 McDarren 71
    Processed 9100 files (total files:57914)
           47.50 real          0.79 user          0.80 sys
    Not a proper benchmark, I realise. Actually, how would I go about benchmarking this?
      Not a proper benchmark, I realise. Actually, how would I go about benchmarking this?

      See Benchmark and Devel::DProf.
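
      For instance, a quick Benchmark comparison of the two split variants might look like this (the sample record is made up; substitute a real line from one of the dump files):

      use strict;
      use warnings;
      use Benchmark qw(cmpthese);

      # Made-up sample record: a matching line followed by a long tail,
      # standing in for the rest of a slurped dump file.
      my $data = join "\t", 'McDarren', 'f1', 'f2', '71', ('filler') x 10_000;
      my $pos  = index $data, 'McDarren';

      cmpthese( -3, {
          no_limit => sub {
              my ($user, $level) = (split /\t/, substr $data, $pos)[0, 3];
          },
          limit_5  => sub {
              my ($user, $level) = (split /\t/, (substr $data, $pos), 5)[0, 3];
          },
      });

      A negative count tells cmpthese to run each sub for roughly that many CPU seconds and print a comparison table. Benchmark is for comparing isolated snippets like these; Devel::DProf profiles the whole script and shows where the time is really going.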

      Alceu Rodrigues de Freitas Junior
      ---------------------------------
      "You have enemies? Good. That means you've stood up for something, sometime in your life." - Sir Winston Churchill