Gtforce has asked for the wisdom of the Perl Monks concerning the following question:

I receive an array @result, which is my starting point. I think it is an array, but when I run this snippet of code to test the data that is in the array, it would appear that there's a scalar sitting in it. My end-purpose is to calculate the average of the last column, but can't figure out how. Any help is appreciated, thanks.

my $counter=0;
foreach my $element (@result) {
    print "\nCHECKPOINT\n $result[$counter]\n";
    $counter ++;
}

Result:

CHECKPOINT

2017-08-01 20MICRONS 37744

2016-08-01 20MICRONS 25966

2016-04-20 20MICRONS 30807

2016-04-01 20MICRONS 32780

UPDATE: I don't mean to prolong this thread, but hopefully a quick question. My data set is about 100Mb with about 2k records and consequently ((2k-n)*n) loops, 'n' being my averaging period (loops because I'm doing a moving average to smooth the data series). I tested my data set using a "WHILE", and then using a "FOREACH" (the two suggestions that were provided in this thread). Looping via foreach takes less than a second, whilst looping via while takes several seconds (~5 seconds). I'm new to Perl and am wondering if that's how it is meant to be and, if so, what my use case for 'while' would be.

foreach (<myarray>) {
    my @dataset = split /blah/, $myarray[$outercounter];
    for ($innercounter = 0; $innercounter < n; $innercounter++) {
        my $counter = $innercounter + $outercounter;
        my @miniarray = split /blah/, $myarray[$counter];
        $sum += $miniarray[x];
    }
    $outercounter ++;
    my $mavg = $sum / n;
    push (@resultset, "the things I need");
    $sum = 0;
    $mavg = 0;
    last if $outercounter == scalar(@myarray);
}
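Whichever loop keyword is used, the pseudocode above splits every record n times. A minimal sketch of an alternative that splits each record once and keeps a running window sum follows. It assumes whitespace-separated records with the value in the last column (as in the sample data in this thread); the name moving_average and the window size of 2 are illustrative only and are not taken from the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: simple moving average over the last column.
# Assumes each record is a whitespace-separated line.
sub moving_average {
    my ($n, @records) = @_;

    # Split each record exactly once and keep only the last field.
    my @values = map { (split ' ', $_)[-1] } @records;

    my @averages;
    my $sum = 0;
    for my $i (0 .. $#values) {
        $sum += $values[$i];
        $sum -= $values[$i - $n] if $i >= $n;       # drop the value leaving the window
        push @averages, $sum / $n if $i >= $n - 1;  # emit once the window is full
    }
    return @averages;
}

my @records = (
    "2017-08-01 20MICRONS 37744",
    "2016-08-01 20MICRONS 25966",
    "2016-04-20 20MICRONS 30807",
    "2016-04-01 20MICRONS 32780",
);

print join(", ", moving_average(2, @records)), "\n";
# prints: 31855, 28386.5, 31793.5
```

With this shape the work is roughly one split per record plus one addition and one subtraction per step, rather than n splits per step.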

Re: Unscalar'ize a fake array
by Athanasius (Archbishop) on Sep 17, 2017 at 12:48 UTC

    Hello Gtforce,

    As 1nickt says, you need to determine the exact form of the data. Assuming it is a single string, with each “element” on a different line, you will need to split it on newlines to get a list of elements; then, for each element, you’ll need to split again, this time on whitespace, to break the line into its component fields. For example:

    use strict;
    use warnings;

    my $result = '2017-08-01 20MICRONS 37744
    2016-08-01 20MICRONS 25966
    2016-04-20 20MICRONS 30807
    2016-04-01 20MICRONS 32780';

    my @results = split /\n/, $result;
    my $total;

    print "CHECKPOINT\n";

    for my $element (@results)
    {
        print "$element\n";
        my @fields = split /\s+/, $element;
        $total += $fields[2];
    }

    printf "Average: %.1f\n", ($total / scalar @results);

    Output:

    22:47 >perl 1821_SoPW.pl
    CHECKPOINT
    2017-08-01 20MICRONS 37744
    2016-08-01 20MICRONS 25966
    2016-04-20 20MICRONS 30807
    2016-04-01 20MICRONS 32780
    Average: 31824.2

    22:47 >

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

      Athanasius, agreed. Doing a split /blah1|blah2/ worked for the array I had. Many thanks.

Re: Unscalar'ize a fake array
by 1nickt (Canon) on Sep 17, 2017 at 12:36 UTC

    Hi, add this to the code instead of the snippet shown:

    use Data::Dumper;
    print '@result holds: ' . Dumper \@result;

    See Data::Dumper. For other techniques to make sure your data is what you think it is, see The Basic Debugging Checklist.
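
    As a rough illustration of why a dump is more informative than a plain print, the sketch below dumps a one-element array next to a two-element array that would look identical when printed. The data is borrowed from the question; the variable names are illustrative, and setting $Data::Dumper::Useqq is an optional extra that makes embedded newlines show up as \n.

```perl
use strict;
use warnings;
use Data::Dumper;

# Useqq makes Dumper emit double-quoted strings, so newlines appear as \n.
$Data::Dumper::Useqq = 1;

# One element that merely looks like two rows when printed ...
my @one_element = ("2017-08-01 20MICRONS 37744\n2016-08-01 20MICRONS 25966\n");

# ... versus two genuine elements.
my @two_elements = ("2017-08-01 20MICRONS 37744\n", "2016-08-01 20MICRONS 25966\n");

print 'one element:  ', Dumper(\@one_element);
print 'two elements: ', Dumper(\@two_elements);
print scalar(@one_element), " vs ", scalar(@two_elements), " elements\n";
```

    In the first dump the \n sits inside a single quoted string; in the second, each row is its own string.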


    The way forward always starts with a minimal test.

      1nickt, am now a Data::Dumper fan, thanks.

Re: Unscalar'ize a fake array
by kcott (Archbishop) on Sep 18, 2017 at 05:26 UTC

    G'day Gtforce,

    The statement "I receive an array" is vague. What precisely does that mean? Is it passed as an argument to a subroutine? Is it returned from a subroutine? Is it populated in the middle of some code (e.g. '@result = ...', 'push @result, ...', etc.)?

    I'd suggest you take a step back and look at how @result is generated. There could be a bug upstream which, when fixed, resolves this, and any other, downstream problems.

    I can really only speculate; however, purely as an example of one possible scenario, a change to $/, which hadn't been suitably localised, could cause this sort of situation.

    Test data:

    $ cat pm_1199549_test_file.txt
    2017-08-01 20MICRONS 37744
    2016-08-01 20MICRONS 25966
    2016-04-20 20MICRONS 30807
    2016-04-01 20MICRONS 32780

    Non-localised change (with respect to array population):

    $ perl -e 'local $/; @x = <>; use Data::Dump; dd \@x' pm_1199549_test_file.txt
    [
      "2017-08-01 20MICRONS 37744\n2016-08-01 20MICRONS 25966\n2016-04-20 20MICRONS 30807\n2016-04-01 20MICRONS 32780\n",
    ]

    Localised change (with respect to array population):

    $ perl -e '{ local $/; } @x = <>; use Data::Dump; dd \@x' pm_1199549_test_file.txt
    [
      "2017-08-01 20MICRONS 37744\n",
      "2016-08-01 20MICRONS 25966\n",
      "2016-04-20 20MICRONS 30807\n",
      "2016-04-01 20MICRONS 32780\n",
    ]

    If you provide us with more information, we can probably offer better advice.

    — Ken

      Ken, thank you. The array was being populated somewhere at the beginning of the same code (no subroutines involved). I gave up struggling to understand and handle the delimiters and took a silly way out: I wrote the array to a file and read it back from the file into memory. That got rid of my problems and let me get on with things. I'm going to have to look carefully through what I've done and what you've said (writing to and reading back from a file is only a temporary measure that I'll eliminate shortly). Thank you once again.
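
      The round trip through a file presumably worked because reading a file back line by line yields one element per line. Assuming the problem really was a single element holding several newline-separated rows (the thread never confirms this), roughly the same effect can be sketched in memory, without the temporary file:

```perl
use strict;
use warnings;

# Assumed shape of the problem data: one element with several rows inside it.
my @result = ("2017-08-01 20MICRONS 37744\n2016-08-01 20MICRONS 25966\n");

# Split every element on newlines; map flattens the pieces into one list.
my @rows = map { split /\n/ } @result;

print scalar(@rows), " rows\n";
print "$_\n" for @rows;
```

      Elements that contain no newline pass through unchanged, so this is harmless if @result already holds one row per element.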

Re: Unscalar'ize a fake array
by karlgoethebier (Abbot) on Sep 17, 2017 at 13:55 UTC

    For the fearless:

    #!/usr/bin/env perl

    # $Id: 1199549.pl,v 1.2 2017/09/17 13:01:45 karl Exp karl $
    # http://perlmonks.org/?node_id=1199549

    use strict;
    use warnings;
    use Data::Dump;
    use List::Util qw(sum);
    use feature qw (say);

    my $result = '2017-08-01 20MICRONS 37744
    2016-08-01 20MICRONS 25966
    2016-04-20 20MICRONS 30807
    2016-04-01 20MICRONS 32780';

    my @numbers = grep { /\d{5}/ } split /[\n,\s]/, $result;

    say sum (@numbers) / scalar @numbers;

    __END__

    Minor update: Fixed dumb typo in regex..

    Update: And see also Re^2: Best way to sum an array?...

    Regards, Karl

    «The Crux of the Biscuit is the Apostrophe»

    perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'Help

Re: Unscalar'ize a fake array
by Marshall (Canon) on Sep 17, 2017 at 18:55 UTC
    Another slight variation. Use foreach my $line (@array){...} if you have an array like this.

    Update: I looked back at this thread and I admit to being flummoxed by what a "fake array" could be? Question: Do you have an array or not? That is a yes or no question. I'm not sure at all what a "fake" array could be? There could be a sequence of lines in a file. There could be multiple lines contained in a scalar text variable. I wouldn't characterize either of those situations as a "fake" array. I've worked with a lot of students in various programming languages, but I've never heard anybody refer to a "fake" array. You either have an array or you don't. If it is not an array, then there is a better more precise CS word for what this is.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $total = 0;
    my $num   = 0;
    while (my $line = <DATA>)
    {
        my @tokens = split ' ', $line;   # split removes line ending
        next unless @tokens == 3;        # ignore blanks and header line
        $total += $tokens[-1];           # add last column
        $num++;
    }
    print "total = $total Average=",$total/$num,"\n";
    # prints: total = 127297 Average=31824.25

    __DATA__
    CHECKPOINT
    2017-08-01 20MICRONS 37744
    2016-08-01 20MICRONS 25966
    2016-04-20 20MICRONS 30807
    2016-04-01 20MICRONS 32780

      Marshall, thanks for responding. I simply printed and eye-balled the array. Looking at it, I expected to be able to split by \t or whitespace, and it didn't split the array the way I hoped to - hence the frustration and reference.

        Well, if you just looked at a printout, then this could have been a printout of a single scalar text variable containing several lines. To split each line's values into an array, you first have to "extract the lines" from the single text variable.

        Another fine point once you have extracted the lines...
        The five common ASCII whitespace characters are space, \t, \f, \n and \r.

        There are two ways to split on runs of these characters.
        Perl has a special case, ' ', for split.
        It behaves like the regex /\s+/, which splits on any of these characters, except that ' ' skips leading whitespace, whereas /\s+/ produces an empty first field when the string starts with whitespace.

        Demo Code:

        #!/usr/bin/perl
        use strict;
        use warnings;

        my @lines = ("a b c\n", " a b c\n");

        foreach my $line (@lines)
        {
            my @array = split ' ',$line;
            print join ("|",@array), "\n";
        }

        foreach my $line (@lines)
        {
            my @array = split /\s+/,$line;
            print join ("|",@array), "\n";
        }
        __END__
        # using split on ' ' (the default)
        a|b|c
        a|b|c
        # using split on /\s+/
        a|b|c
        |a|b|c