in reply to Creating a Hash using only one column in an imported data file

Hi Koda1234, welcome to the monastery and to Perl, the One True Religion.

In Perl, to filter a list down to only the elements that match a condition, use grep.

use strict;
use warnings;
use feature 'say';

my @col9 = map { (split)[8] } <DATA>;

foreach my $test ( 1, 9, 42, 666 ) {
    my $count = scalar grep { $_ >= $test } @col9;
    say sprintf "%d values were >= %d", $count, $test;
}

__DATA__
1 2 3 4 5 6 7 8 9 10 11 12
1 2 3 4 5 6 7 8 9 10 11 12
1 2 3 4 5 6 7 8 9 10 11 12
1 2 3 4 5 6 7 8 42 10 11 12
1 2 3 4 5 6 7 8 42 10 11 12
1 2 3 4 5 6 7 8 42 10 11 12
1 2 3 4 5 6 7 8 1 10 11 12

Output:

$ perl 1181904.pl
7 values were >= 1
6 values were >= 9
3 values were >= 42
0 values were >= 666

Hope this helps!


The way forward always starts with a minimal test.

Re^2: Creating a Hash using only one column in an imported data file
by Koda1234 (Initiate) on Feb 13, 2017 at 20:53 UTC

    Thank you very much!! This was very helpful information. However, my data file is far too large to paste into the code itself. I tried running it in the form below, and I'm getting "0 values were >= x" for every value tested. Is there something wrong with the way I'm opening the file?

    open (IN, "<$ARGV[0]") || die ("Cannot open $ARGV[0]: $!");
    @MyData = <IN>;

    use strict;
    use warnings;
    use feature 'say';

    my @col9 = map {(split)[8]} <IN>;

    foreach my $test (2,3,4,5,6,7,8,9) {
        my $count =scalar grep {$_ >= $test} @col9;
        say sprintf "%d values were >= %d", $count, $test;
    }

      Hi Koda1234,

      The preferred way to open a file, because it is the safest, is the "three-argument form" (see open). A suggestion: as a beginner, read the docs for each function you use; don't just copy examples you see in the wild.

      Also:

      • Check that you got any input before using it.
      • $! reports why open failed, but you can also check the file yourself first so that you can give your own error message.
      • Use while to read from your filehandle one line at a time, so even if it's big it won't fill your memory.
      • Use chomp to trim the newline character off the end of the line. Doesn't matter in your example, but it will soon enough...
      • Declare your array outside the while loop and use push to add the values to it as you split the lines.
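
      use strict;
      use warnings;
      use feature 'say';

      # Require a filename argument, and make sure it names an existing plain file.
      my $filename = $ARGV[0] or die "You must supply a filename";
      -f $filename or die "You must supply the name of a file that exists!";

      # Three-argument open with a lexical filehandle.
      open my $IN, '<', $filename or die "Can't open < $filename: $!";

      # Read one line at a time and keep only the ninth column.
      my @col9;
      while ( my $line = <$IN> ) {
          chomp $line;
          # split ' ' splits on any run of whitespace, like the bare split used earlier.
          push @col9, (split ' ', $line)[8];
      }

      close $IN or die "Can't close $filename: $!";

      foreach my $test ( 1, 9, 42, 666 ) {
          my $count = scalar grep { $_ >= $test } @col9;
          say sprintf "%d values were >= %d", $count, $test;
      }

      __END__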
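
      On the question of why every count came out as 0 in the earlier attempt: a likely cause, assuming the script ran as posted, is that @MyData = <IN> reads the whole file in list context. That leaves the handle at end-of-file, so the later <IN> inside map has nothing left to return. A minimal sketch of that behaviour, using an in-memory filehandle over a made-up string ($data here is a hypothetical stand-in for the real file):

```perl
use strict;
use warnings;

# Hypothetical stand-in for a data file: an in-memory filehandle over a string.
my $data = "a b c\nd e f\n";
open my $fh, '<', \$data or die "Can't open in-memory file: $!";

my @first  = <$fh>;    # list context: reads every remaining line
my @second = <$fh>;    # handle is already at end-of-file, so this is empty

printf "first read: %d lines, second read: %d lines\n",
    scalar @first, scalar @second;

close $fh or die "Can't close in-memory file: $!";
```

      Reading the file only once, as the while loop above does, avoids the problem.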

      Hope this helps!



        Thank you again. I will have to get used to manipulating data sets. Also, when I try to run your code, I get a syntax error around say sprintf, although it looks identical to the one before.