SoftwareGoddess has asked for the wisdom of the Perl Monks concerning the following question:

Greetings & Salutations:

Here is my first attempt at a Perl program, and I need guidance. I am trying to perform bandwidth analysis using multiple data files. I am trying to place all the data points for a given sample set size (the first column of the sample files) into a single row; the columns would be indexed to the data set size. BTW, the second column in the sample file is the sum of all the data points in that line.

Within the code, you will see my feeble attempts to fill the data cube and read it back. Thanks

The desired output given the two example files would be:

3.011583 2.943297 2.940214 2.955738 2.968009 3.083707 2.958022 2.957737 2.982671 2.990788

6.249463 6.118992 6.124449 6.112282 6.133605 6.384345 6.079698 6.073674 6.108037 6.134851

12.254996 12.263543 12.277047 12.291713 12.196748 12.492582 12.137764 12.164201 12.074652 12.107912

24.024475 23.347609 23.463421 23.294582 23.336542 24.010324 23.535210 23.562157 23.596123 23.675554

Sample file 1:

8 14.701072 3.011583 2.943297 2.940214 2.955738 2.968009

16 30.561410 6.249463 6.118992 6.124449 6.112282 6.133605

32 60.983738 12.254996 12.263543 12.277047 12.291713 12.196748

64 116.472912 24.024475 23.347609 23.463421 23.294582 23.336542

Sample file 2:

8 14.788687 3.083707 2.958022 2.957737 2.982671 2.990788

16 30.368369 6.384345 6.079698 6.073674 6.108037 6.134851

32 60.373260 12.492582 12.137764 12.164201 12.074652 12.107912

64 117.676048 24.010324 23.535210 23.562157 23.596123 23.675554

#!/usr/bin/perl -w
use strict;
use v5.10;
use List::Util qw(min max sum);

my @procBrdNum;
my $numProcBrds;
my $dataSetCnt;
my @minArray = ();
my @maxArray = ();
my @avgArray = ();
my @completeDataPts = ();
my $numDataSets;
my @dataSetSizes;
my (@totalMin, @totalMax, @totalAvg, @totalThroughputArray);

open my $ChassisConfigFile, "<ChassisTest.Cfg" or die ("Unable to open file \n");

#Get slot numbers"
while (my $slotLine = <$ChassisConfigFile>)
{
    push(@procBrdNum,$slotLine);
}
close $ChassisConfigFile;

$numProcBrds = @procBrdNum;
printf("numProcBrds: $numProcBrds \n");

#Find and store the local minimum & maximum throughput per board & then the lowest min and max per board
for(my $brdCnt = 0 ; $brdCnt < $numProcBrds; $brdCnt++)
{
    my $T = $procBrdNum[$brdCnt] ;
    $T =~ s/\R//g; #Remove line break - not sure why there is a line break?
    my $TestFileName = "Slot". $T ."Simplex.cfg.data";
    printf("Board $procBrdNum[$brdCnt] File Name: $TestFileName \n");
    open my $dataFile, "<$TestFileName" or die ("Unable to open file \n");
    $dataSetCnt = 0;
    while (my $dataLine = <$dataFile>)
    {
        $dataLine =~ /^#/ and next;
        my @dataPts = split(/\s+/, $dataLine);
        my $dataSetSize = shift(@dataPts);
        if($brdCnt == 0)
        {
            push(@dataSetSizes, $dataSetSize);
        }
        my $totalThroughput = shift(@dataPts);
        my $min = min(@dataPts);
        my $max = max(@dataPts);
        my $avg = scalar @dataPts ? (sum(@dataPts) / (scalar @dataPts)) : 0;
        push @{ $totalThroughputArray[$dataSetCnt]}, $totalThroughput;
        push @{ $minArray[$dataSetCnt]}, $min;
        push @{ $maxArray[$dataSetCnt]}, $max;
        push @{ $avgArray[$dataSetCnt]}, $avg;

        #BAD CODE HERE!
        #@{ $completeDataPts[$dataSetCnt]} = (@{ $completeDataPts[$dataSetCnt]}, @dataPts);
        #push @{ $completeDataPts[$dataSetCnt]}, @dataPts;
        my $completeDataPtRef = \@{ $completeDataPts[$dataSetCnt]};
        push @{ $completeDataPtRef}, @dataPts;
        $dataSetCnt++;
    }
}
$numDataSets = @dataSetSizes;

#MORE BAD CODE HERE!
#for(my $dataSetCnt = 0; $dataSetCnt < $numDataSets; $dataSetCnt++)
my $completeDataPtRef = \@{ $completeDataPts[$dataSetCnt]};
my $numCmpDataPts = @$completeDataPtRef;
print "CmpDataPts $numCmpDataPts \n";
print "@$completeDataPtRef \n";
#{
#   print " @{ $completeDataPts[$dataSetCnt]} \n";
#   my @oneLineDataPts = ();
#   push @oneLineDataPts,[@{ $completeDataPts[$dataSetCnt]}];
    #my $numCmpDataPts = @{ $completeDataPts[$dataSetCnt]};
    #printf("Num of data Sets $numCmpDataPts \n");
    #my @oneLineDataPts = { $completeDataPts[$dataSetCnt]};
    #my $numOneLinePts = @oneLineDataPts;
    #printf("Num of data Pts $numOneLinePts \n");
#   for(my $ptCnt = 0 ; $ptCnt < $numOneLinePts; $ptCnt++)
#   {
#       printf("DataValuesData Set Ct: $dataSetCnt, dataPt[ $ptCnt ] = $oneLineDataPts[$ptCnt] \n");
#   }
#}

Re: Appending arrays into the rows of a 2 dimension array
by tangent (Parson) on Jun 03, 2014 at 21:01 UTC
    Here is a way to store all the information in a hash and deal with those new lines:
    use strict;
    use warnings;
    use Data::Dumper;
    use List::Util qw(min max sum);

    my %hash;
    my @files = ('file1.txt','file2.txt');
    for my $file (@files) {
        open(my $fh,"<",$file) or die "$file: $!";
        while (my $line = <$fh>) {
            chomp $line;
            next unless $line;
            my @items = split(/\s+/,$line);
            my $key = shift @items;
            my $total = shift @items;
            $hash{$key}{'sum'} += $total;
            push(@{ $hash{$key}{'points'} }, @items );
        }
        close($fh);
    }
    for my $key (keys %hash) {
        my $points = $hash{$key}{'points'};
        $hash{$key}{'min'} = min(@$points);
        $hash{$key}{'max'} = max(@$points);
        $hash{$key}{'avg'} = $hash{$key}{'sum'} / @$points;
        # to print the points
        print join(' ',@$points) . "\n";
    }
    print Dumper(\%hash);
    Output for one of the keys:
    '8' => {
        'sum' => '29.489759',
        'min' => '2.940214',
        'max' => '3.083707',
        'avg' => '2.9489759',
        'points' => [
            '3.011583',
            '2.943297',
            '2.940214',
            '2.955738',
            '2.968009',
            '3.083707',
            '2.958022',
            '2.957737',
            '2.982671',
            '2.990788'
        ],
    },
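Since `keys %hash` returns the keys in no particular order, the rows may not come out as 8, 16, 32, 64. A minimal sketch of one way to get them in numeric order follows; the helper names `collect_points` and `rows_in_order` are made up for illustration and are not part of the code above, and the sample lines are shortened versions of the OP's data.

```perl
use strict;
use warnings;

# Hypothetical helper: merge lines of the form "size total pt pt ..."
# into a hash keyed by size, appending the points of every line.
sub collect_points {
    my @lines = @_;
    my %points_for;
    for my $line (@lines) {
        next if $line =~ /^#/;              # skip comment lines
        my ($size, $total, @points) = split ' ', $line;
        next unless defined $size;          # skip blank lines
        push @{ $points_for{$size} }, @points;
    }
    return \%points_for;
}

# Hypothetical helper: one output row per size, smallest size first.
sub rows_in_order {
    my ($points_for) = @_;
    my @sizes = sort { $a <=> $b } keys %$points_for;
    return map { join ' ', @{ $points_for->{$_} } } @sizes;
}

# Shortened sample data: two "files" with two sizes each.
my $points_for = collect_points(
    "8 14.701072 3.011583 2.943297\n",
    "16 30.561410 6.249463 6.118992\n",
    "8 14.788687 3.083707 2.958022\n",
    "16 30.368369 6.384345 6.079698\n",
);
print "$_\n" for rows_in_order($points_for);
```

The numeric sort (`<=>`) matters here: a plain string sort would put 16 before 8.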
Re: Appending arrays into the rows of a 2 dimension array
by poj (Abbot) on Jun 03, 2014 at 20:45 UTC
    #!perl
    use strict;
    my %data=();
    my @data1 = (
        '8 14.701072 3.011583 2.943297 2.940214 2.955738 2.968009',
        '16 30.561410 6.249463 6.118992 6.124449 6.112282 6.133605',
        '32 60.983738 12.254996 12.263543 12.277047 12.291713 12.196748',
        '64 116.472912 24.024475 23.347609 23.463421 23.294582 23.336542');
    my @data2 = (
        '8 14.788687 3.083707 2.958022 2.957737 2.982671 2.990788',
        '16 30.368369 6.384345 6.079698 6.073674 6.108037 6.134851',
        '32 60.373260 12.492582 12.137764 12.164201 12.074652 12.107912',
        '64 117.676048 24.010324 23.535210 23.562157 23.596123 23.675554');

    for (@data1,@data2){
        my ($sz,$sum,@f) = split /\s+/;
        push @{$data{$sz}},$_ for @f;
    }
    for my $sz (sort {$a <=> $b} keys %data){
        print join ' ',@{$data{$sz}},"\n";
    }
    poj
      Taking poj's nice code to use filehandle:
      use strict;
      use warnings;
      use autodie;

      my %data=();
      my $file_one = 'test.txt';
      my $file_two = 'test2.txt';
      open ( my $fh1, '<', $file_one);
      open ( my $fh2, '<', $file_two);
      my @data1 = <$fh1>;
      my @data2 = <$fh2>;

      for (@data1,@data2){
          my ($sz,$sum,@f) = split /\s+/;
          push @{$data{$sz}},$_ for @f;
      }
      for my $sz (sort {$a <=> $b} keys %data){
          print join ' ',@{$data{$sz}},"\n";
      }
      UPDATE: if the data files actually contain new lines, chomp($_) in the first for loop
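A side note on the newline question, as a small sketch rather than a correction: `split` drops trailing empty fields, so a trailing newline does not end up in the data either way, and `chomp` is harmless. What `/\s+/` does not handle is leading whitespace, which produces an empty first field; the special pattern `' '` skips it. The helper name `parse_line` is made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split one data line into (size, total, points...).
sub parse_line {
    my ($line) = @_;
    # split ' ' skips leading whitespace and splits on runs of whitespace,
    # so neither indentation nor the trailing newline yields a field.
    return split ' ', $line;
}

my @plain    = parse_line("8 14.701072 3.011583 2.943297\n");
my @indented = parse_line("   8 14.701072 3.011583 2.943297\n");
print scalar(@plain), " fields, ", scalar(@indented), " fields\n";

# By contrast, /\s+/ keeps an empty first field on an indented line.
my @with_regex = split /\s+/, "   8 14.701072 3.011583 2.943297\n";
print scalar(@with_regex), " fields with /\\s+/\n";
```

With the sample files shown in the question (no leading whitespace), both patterns give the same fields.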
        Hi PerlSufi, your code is very fine for the example, but it does not scale up very well for more than two files (and the OP mentioned that there are multiple data files).

        I would probably avoid storing each file into individual arrays, and try to process each file sequentially, and change the code to something like this (incomplete and obviously untested):

        use strict;
        use warnings;

        my %data=();
        for my $file (qw /file_1.txt file_2.txt file_3.txt ... file_n.txt/) {
            open my $FH, "<", $file or die "could not open $file $!";
            while (<$FH>) {
                chomp;
                my ($sz,$sum,@f) = split /\s+/;
                push @{$data{$sz}},$_ for @f;
            }
            close $FH;
        }
        # ...
        Or possibly, if the files are passed as arguments to the script:
        use strict;
        use warnings;

        my %data=();
        for my $file (@ARGV) {
            open my $FH, "<", $file or die "could not open $file $!";
            while (<$FH>) {
                # ...
            }
            # ...
        }
        # ...
        or even (still assuming the files are passed as arguments):
        use strict;
        use warnings;
        use autodie;

        my %data=();
        while (<>) {
            chomp;
            # ...
        }
        #...
        This last approach can be used even if the argument passed to the script is not a list of files but, say, the directory where they are stored:
        use strict;
        use warnings;
        use autodie;

        my $stat_dir = shift;
        my %data = ();
        {
            local @ARGV = glob ("$stat_dir/*.*");
            while (<>) {
                chomp;
                # ...
            }
            # ...
        }
        Admittedly, these last solutions look less robust, and one might want to avoid them in production code. But are they really less robust? Hmm, if glob returns a list of files, then you basically know the files are there; the only thing missing is a check for read privileges, no big deal.
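For what it is worth, here is a rough sketch of how the missing checks could be added around the `<>` loop. The helper names `readable_files` and `collect_dir` are invented for this example, the directory path is assumed to contain no whitespace or glob metacharacters, and the demo writes two tiny made-up files into a temporary directory rather than using the OP's data. One point worth guarding against: when `@ARGV` is empty, `<>` reads from STDIN, so an empty glob result would make the script wait for input instead of failing.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: plain, readable files in $dir, sorted by name.
# Assumes $dir contains no whitespace or glob metacharacters.
sub readable_files {
    my ($dir) = @_;
    my @files = grep { -f $_ && -r $_ } glob("$dir/*");
    return sort @files;
}

# Hypothetical helper: read every readable file in $dir through <> and
# append the data points of each line to the row for its size.
sub collect_dir {
    my ($dir) = @_;
    my @files = readable_files($dir);
    # With an empty @ARGV, <> falls back to STDIN, so stop early instead.
    die "no readable files in $dir\n" unless @files;

    my %data;
    local @ARGV = @files;
    while (my $line = <>) {
        next if $line =~ /^#/;                  # skip comment lines
        my ($sz, $sum, @f) = split ' ', $line;
        next unless defined $sz;                # skip blank lines
        push @{ $data{$sz} }, @f;
    }
    return \%data;
}

# Demo against a throwaway directory holding two tiny made-up files.
my $dir = tempdir(CLEANUP => 1);
for my $spec ([ 'a.data', "8 3 1 2\n" ], [ 'b.data', "8 7 3 4\n" ]) {
    my ($name, $text) = @$spec;
    open my $fh, '>', "$dir/$name" or die "$dir/$name: $!";
    print {$fh} $text;
    close $fh or die "$dir/$name: $!";
}
my $data = collect_dir($dir);
print join(' ', @{ $data->{8} }), "\n";
```

The `-r` test only reflects the permissions at the moment of the check, so a file could still become unreadable before `<>` opens it; in that case `<>` warns and moves on to the next file.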

Re: Appending arrays into the rows of a 2 dimension array
by hexcoder (Curate) on Jun 03, 2014 at 21:28 UTC
    Hi, I reduced the code to generate the expected output values, but tried to keep some of your code.
    #!/usr/bin/perl -w
    use strict;
    use warnings;
    use v5.10;

    my $completeDataPtRef;
    for my $slot (1 .. 2) {
        my $TestFileName = "Slot". $slot ."Simplex.cfg.data";
        open my $dataFile, '<', "$TestFileName" or die ("Unable to open file $TestFileName: $!\n");
        my $dataSetCnt = 0;
        while (my $dataLine = <$dataFile>) {
            $dataLine =~ /^#/ and next;
            my @dataPts = split(/\s+/, $dataLine);
            my $dataSetSize = shift(@dataPts);
            my $totalThroughput = shift(@dataPts);
            push @{$completeDataPtRef->[$dataSetCnt]}, @dataPts;
            ++$dataSetCnt;
        }
        close $dataFile;
    }
    for my $linevalues (@{$completeDataPtRef}) {
        print "@{$linevalues}\n";
    }
Re: Appending arrays into the rows of a 2 dimension array
by PerlSufi (Friar) on Jun 03, 2014 at 20:43 UTC
    Hello and welcome to the PerlMonks,
    One thing I would like to suggest: next time, please leave out the commented-out code that we do not need in order to try your script. Just a short, easy-to-use excerpt is enough :)
    What is the purpose of the whole numbers at the beginning of each line in the files? Are those the line numbers?