Re: Arrays manipulation

You can also assign into the 2D array directly :-

 my $array2d = [];
 my @list = (split /,/,join (",",@array));
 $array2d->[$_%3][int($_/3)]=$list[$_] for (0..$#list);

or even

 my @array2d = ();
 my $c=0;
 $array2d[$c%3][int($c++/3)]=$_ for (split /,/, join (",",@array));
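To make the index arithmetic concrete, here is a minimal sketch that runs the first form on the same sample data used in the benchmark below and prints the result. It assumes every record has exactly three comma-separated fields, which is what the hard-coded 3 relies on.

```perl
use strict;
use warnings;

# Same sample data as the benchmark script in this post.
my @array = ('nfs,7,rw', 'afp,12,rro', 'cifs,32,ro', 'dns,5,rw');

# Flatten every record into one long list of fields.
my @list = map { split /,/, $_ } @array;

# Field number $i belongs to column $i % 3 and came from record int($i / 3).
# Assumption: exactly three fields per record.
my @array2d;
$array2d[ $_ % 3 ][ int( $_ / 3 ) ] = $list[$_] for 0 .. $#list;

# Print one line per column.
print join( ' ', @$_ ), "\n" for @array2d;
# nfs afp cifs dns
# 7 12 32 5
# rw rro ro rw
```

Each inner array holds one column of the original data, so the structure is effectively the transpose of the input records.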

And let us compare all four versions :-

#!/usr/bin/perl -w
use strict;

use Devel::Timer;
my $runcount=500;
my $t = Devel::Timer->new();

my @array = ('nfs,7,rw',
             'afp,12,rro',
             'cifs,32,ro',
             'dns,5,rw',
);

$t->mark('V1');
for my $runs (1..$runcount) {
 my $cols = [];
 foreach my $row (0..$#array) {
         my @cols = split /,/, $array[$row];
         map {$cols->[$_]->[$row] = $cols[$_]} (0..$#cols);
 }
}

$t->mark('V2');

for my $runs (1..$runcount) {
 my @splitted_up = ();
 my $cnt = 0;
 push @{ $splitted_up[($cnt ++) % 3] }, $_ 
    foreach (split (/,/, join (",",@array)));
}

$t->mark('V3');

for my $runs (1..$runcount) {
 my $array2d = [];
 my @list = map {split /,/,$_}(@array);
 $array2d->[$_%3][int($_/3)]=$list[$_] for (0..$#list);
}
$t->mark('V4');

for my $runs (1..$runcount) {
 my @array2d = ();
 my $c=0;
 $array2d[$c%3][int($c++/3)]=$_ for (split /,/, join (",",@array));
}
$t->mark('V4 end');

$t->report();

And the Winner is :-

Devel::Timer Report -- Total time: 0.1663 secs
Interval  Time    Percent
----------------------------------------------
01 -> 02  0.0530  31.87%  V1 -> V2
03 -> 04  0.0443  26.65%  V3 -> V4
04 -> 05  0.0370  22.24%  V4 -> V4 end
02 -> 03  0.0319  19.15%  V2 -> V3
00 -> 01  0.0002   0.09%  INIT -> V1

The second suggested snippet (V2) performed best on my hardware configuration, but always run speed tests on the target hardware, as the OS and other factors can affect performance.

  push @{ $splitted_up[($cnt ++) % 3] }, $_ 
    foreach (split (/,/, join (",",@array)));

Hope this helps
UnderMine

Re: Re: Arrays manipulation by sauoq (Abbot) on May 01, 2003 at 21:20 UTC
"The second suggested code posted performed best for my given hardware config, but always try speed tests on the correct hardware as OS's etc can affect performance."

So does the current load on the machine... which is one reason why a "benchmark" such as the one you supplied is essentially useless. Not one of the methods you tested ran for longer than 6 hundredths of a second. That's simply not adequate. You will need a much larger dataset before you'll get any performance data that is even remotely meaningful.

Many around here are happy to give advice to those obsessed with the performance of their code. That advice will almost certainly include statements like: "consider how long it takes to write as well as how long it takes to run" and "if you were really interested in performance you probably wouldn't be using perl in the first place."

The upshot is that micro-optimizations simply aren't worth it. You are usually better off saving your time (or the maintainer's) by writing clean, straightforward code that is easy to read. Often enough, that approach leads to efficient code as well. When you really need better performance, you'll know it.

After all, consider that the code in question probably won't spend more time running in the next 5 years than you've already spent benchmarking it...

-sauoq
"My two cents aren't worth a dime.";
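For anyone who does want numbers over a longer run and a larger dataset, a minimal sketch using Perl's core Benchmark module might look like the following. The 10_000 generated records and the sub names by_index and by_push are made up for illustration; they are not from the original posts. A negative count tells cmpthese to run each sub for at least that many CPU seconds rather than a fixed number of iterations.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical larger dataset: 10_000 three-field records (illustration only).
my @array = map { "svc$_,$_,rw" } 1 .. 10_000;

# Index-arithmetic version (V3 in the post above).
sub by_index {
    my @list = map { split /,/, $_ } @array;
    my @array2d;
    $array2d[ $_ % 3 ][ int( $_ / 3 ) ] = $list[$_] for 0 .. $#list;
    return \@array2d;
}

# Round-robin push version (V2 in the post above).
sub by_push {
    my @array2d;
    my $cnt = 0;
    push @{ $array2d[ $cnt++ % 3 ] }, $_ for split /,/, join( ',', @array );
    return \@array2d;
}

# Negative count: run each sub for at least 2 CPU seconds, then print a
# comparison table of rates.
cmpthese( -2, { by_index => \&by_index, by_push => \&by_push } );
```

The results still depend on machine load and hardware, so treat the table as a hint rather than a verdict.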
Re: Re: Re: Arrays manipulation by UnderMine (Friar) on May 06, 2003 at 14:30 UTC
I agree that the benchmark did not run for long enough. For a more accurate estimate it should take longer (say one minute plus) and be run at an appropriate time. In the past I have run such tests as a scheduled job (say hourly) over a week to give an indication of when the system copes best.

All benchmarks are indicative and never give the whole story, and you are right to point that out. But when we have no other way of testing the code (i.e. use black box X or Y), they do give us hints as to which methods to investigate further. Benchmarking pseudo-random samples is also useful when the whole dataset is massive.

Hope it helps
UnderMine
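As a rough sketch of the pseudo-random sampling idea, the snippet below draws a fixed number of records at random (with replacement) from a larger array, so that only the sample needs to be fed to a benchmark. The helper name sample_rows, the generated data, and the sample size are assumptions for illustration, and it presumes the records are already held in memory.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a massive dataset (illustration only).
my @array = map { "svc$_,$_,rw" } 1 .. 100_000;

# Return $n records chosen pseudo-randomly, with replacement, from @$rows.
sub sample_rows {
    my ( $rows, $n ) = @_;
    return map { $rows->[ int rand @$rows ] } 1 .. $n;
}

my @sample = sample_rows( \@array, 1_000 );
printf "sampled %d of %d records\n", scalar @sample, scalar @array;
```

Calling srand with a fixed seed before sampling would make the selection repeatable between runs, which helps when comparing methods on the same sample.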