kevbot has asked for the wisdom of the Perl Monks concerning the following question:
I am using PDL in some modules, and I want to import numerical data from a text file using the rcols function found in PDL::IO::Misc. The rcols function will import the text file into piddles that correspond to the columns of the text file.
I stumbled across an issue when I was trying to import data from a tab-delimited text file. Some positions in my input data file contain blank entries, and rcols seems to handle them inconsistently. If a blank entry is in the last column of the file, then $PDL::undefval is used in the piddle. If a blank appears anywhere else, a value of 0 appears to be used instead.
Here is an example.
The data.txt file is tab-delimited:

    1	6	11
    2	7	12
    3	8	13
    4	9	14
    5	10	15

The data_missing.txt file is tab-delimited but contains some blank entries.

    1	6	11
    2	7
    3	8	13
    4		14
    5	10	15

I use the following script to test the contents of the pdls created by rcols:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use PDL;
    use PDL::IO::Misc;

    my $file_name = shift;
    die 'No file given.' unless defined($file_name);
    open(my $fh, '<', $file_name) or die "Can not open file: $!";
    my @pdls = rcols $fh, { COLSEP => "\t" };
    foreach (@pdls) {
        print "$_\n";
    }
    exit;

The output for <data.txt> is:

    [1 2 3 4 5]
    [6 7 8 9 10]
    [11 12 13 14 15]

The output for <data_missing.txt> is:

    [1 2 3 4 5]
    [6 7 8 0 10]
    [11 0 13 14 15]

So far, so good. However, if I change the value for $PDL::undefval, I get a strange result. First, the default value of $PDL::undefval is zero.

    perl -MPDL -E 'say $PDL::undefval'
    0

Here is the code with $PDL::undefval set to -999.

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use PDL;
    use PDL::IO::Misc;

    my $file_name = shift;
    die 'No file given.' unless defined($file_name);
    open(my $fh, '<', $file_name) or die "Can not open file: $!";
    local $PDL::undefval = -999;
    my @pdls = rcols $fh, { COLSEP => "\t" };
    foreach (@pdls) {
        print "$_\n";
    }
    exit;

The output for <data.txt> is:

    [1 2 3 4 5]
    [6 7 8 9 10]
    [11 12 13 14 15]

The output for <data_missing.txt> is:

    [1 2 3 4 5]
    [6 7 8 0 10]
    [11 -999 13 14 15]

The value of $PDL::undefval is used in one case (where the 12 was deleted at the end of a row in the input file), but a zero is used in the other (where the 9 was deleted in the middle of a row in the input file).
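One possible explanation, offered as an assumption and not something confirmed from the rcols source: if each line is broken up with Perl's split using the default limit, a trailing empty field is dropped entirely (so that column would be undefined for the row), while an empty field in the middle survives as an empty string, which becomes 0 when treated as a number. The sketch below uses only core Perl to show that difference. The trailing tab in the first sample row is itself an assumption about how the 12 was deleted.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Two rows modelled on data_missing.txt.
my $trailing_blank = "2\t7\t";     # assumes the tab before the deleted 12 remains
my $middle_blank   = "4\t\t14";    # the 9 was deleted, both tabs remain

# With the default limit, split discards trailing empty fields.
my @short = split /\t/, $trailing_blank;

# An empty field in the middle is kept as an empty string.
my @full = split /\t/, $middle_blank;

# A negative limit keeps trailing empty fields as well.
my @kept = split /\t/, $trailing_blank, -1;

printf "trailing blank, default limit: %d fields\n", scalar @short;
printf "middle blank, default limit:   %d fields\n", scalar @full;
printf "trailing blank, limit -1:      %d fields\n", scalar @kept;
```

If rcols does work this way internally, the two kinds of blank would reach it as different things (a missing value versus an empty string), which would match the output above.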
This looks like a bug to me. Does anyone else have experience using this feature of PDL?
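In the meantime, one way to sidestep the question is to fill in the blanks before the data reaches PDL. The following is a minimal sketch in core Perl only. The helper name read_columns, its arguments, and the inline sample rows are all hypothetical and are not part of PDL.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of PDL): split tab-delimited lines into
# columns, substituting $missing for every blank or absent field.
sub read_columns {
    my ($lines, $ncols, $missing) = @_;
    my @columns = map { [] } 1 .. $ncols;
    for my $line (@$lines) {
        chomp(my $copy = $line);
        # A limit of -1 keeps trailing empty fields.
        my @fields = split /\t/, $copy, -1;
        for my $i (0 .. $ncols - 1) {
            my $value = $fields[$i];
            # Treat both a missing field and a whitespace-only field as blank.
            $value = $missing if !defined $value || $value !~ /\S/;
            push @{ $columns[$i] }, $value;
        }
    }
    return @columns;
}

# The rows of data_missing.txt, written inline for illustration.
my @lines = ("1\t6\t11\n", "2\t7\n", "3\t8\t13\n", "4\t\t14\n", "5\t10\t15\n");
my @columns = read_columns(\@lines, 3, -999);
print '[', join(' ', @$_), "]\n" for @columns;
```

Each array reference returned here could then be handed to the pdl constructor, for example pdl($columns[0]), so that every blank carries the same sentinel regardless of where it sits in the row.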
Re: Unexpected behavior when using PDL::IO::Misc::rcols with $PDL::undefval
by kevbot (Vicar) on Mar 24, 2013 at 17:35 UTC
by etj (Priest) on May 22, 2022 at 17:58 UTC
by syphilis (Archbishop) on Mar 25, 2013 at 01:55 UTC