country1 has asked for the wisdom of the Perl Monks concerning the following question:


I have a CSV file that might have missing values, such
as:


File A


CSFDATVM01,,2.55,3.77


I am using the following split call to read the
CSV file containing this line along with other lines
that do not have missing values.


my ($server, @data) = split /,/;


For this particular line in the CSV file, if I
print $data[0], perl will print a blank ' '.


How would I go about converting this blank to
the value '0'?

Replies are listed 'Best First'.
Re: CSV Files With Missing Values
by Tux (Canon) on Jul 23, 2007 at 12:21 UTC

    First of all, don't use split for parsing CSV files. It's bound to explode in your face. Search for CSV in the advanced search to see how many others already found out. Instead, use Text::CSV_XS or Text::CSV_PP for parsing the data.

    CSV data has by definition no type, but with Text::CSV_XS, you can force it to have a type.

    Your code should be something like:

    use strict;
    use warnings;

    use IO::Handle;
    use Text::CSV_XS;

    open my $fh, "<", "data.csv" or die "data.csv: $!";

    my $csv = Text::CSV_XS->new ({ binary => 1 });
    while (my $row = $csv->getline ($fh)) {
        my ($server, @data) = @$row;
        $data[0] ||= 0;
        # ... more processing
    }
    close $fh;

    Enjoy, Have FUN! H.Merijn
Re: CSV Files With Missing Values
by oxone (Friar) on Jul 23, 2007 at 12:25 UTC
    Hi. I second the previous suggestion that parsing a CSV row 'by hand' is not a good idea, unless you're *sure* there won't be any funny business in the source file (i.e. quoted values, or commas as part of values).
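
    To make the 'funny business' concrete, here is a minimal sketch of a plain split miscounting fields. The sample line is made up for illustration; it has three fields, one of them quoted and containing a comma:

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample line: the second field is quoted and contains a comma.
    my $line = 'CSFDATVM01,"2,55",3.77';

    # A naive split treats every comma as a separator, quoted or not.
    my @fields = split /,/, $line;

    # Prints one field more than the line really has.
    print scalar(@fields), " fields\n";
    print "[$_]\n" for @fields;
    ```

    The quoted value is cut in two and the quote characters stay attached to the pieces, which is the kind of breakage a real CSV parser avoids.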

    If you are sure, you can do it 'your' way by amending your split to include the following map, which converts any empty strings to zeroes on the way through.

    my ($server, @data) = map { $_ eq '' ? 0 : $_; } split /,/;
    The body of the map is a simple ternary which tests each value for an empty string, replaces it with zero if it matches, or else passes the value through unmolested.

    If your source data might contain trailing empty fields, and you don't want split to silently ignore them, you might also want to give split a limit of -1, like so:

    my ($server, @data) = map { $_ eq '' ? 0 : $_ } split /,/, $_, -1;
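
    A small sketch of the difference the limit makes, using a made-up line with one empty field in the middle and two at the end:

    ```perl
    use strict;
    use warnings;

    # Hypothetical input: one empty field in the middle, two at the end.
    my $line = 'CSFDATVM01,,2.55,,';

    # Without a limit, split drops the trailing empty fields.
    my @short = map { $_ eq '' ? 0 : $_ } split /,/, $line;

    # With a limit of -1, they are kept and then mapped to 0.
    my @full  = map { $_ eq '' ? 0 : $_ } split /,/, $line, -1;

    print join('|', @short), "\n";
    print join('|', @full),  "\n";
    ```

    The first list ends up two elements shorter than the second, so any code that relies on a fixed column count should use the -1 form.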
Re: CSV Files With Missing Values
by Tux (Canon) on Jul 23, 2007 at 12:30 UTC

    I couldn't resist implementing the `real-life-example':

    #!/pro/bin/perl

    use strict;
    use warnings;

    use IO::Handle;
    use Text::CSV_XS;

    open my $fh, "<", "data.csv" or die "data.csv: $!";

    my $csv = Text::CSV_XS->new ({ binary => 1, eol => "\r\n" });
    $csv->types ([
        Text::CSV_XS::PV (), # Server
        Text::CSV_XS::IV (), # Count
        ]);

    while (my $row = $csv->getline ($fh)) {
        my ($server, @data) = @$row;
        $csv->print (*STDOUT, $row);
    }
    close $fh;

    Note, however, that you still get the warning:

    # cat data.csv
    CSFDATVM01,,2.55,3.77
    # perl data.pl
    Argument "" isn't numeric in subroutine entry at xx.pl line 17, <$fh> line 1.
    CSFDATVM01,0,2.55,3.77
    #

    But at least you reached your goal in a reliable way.


    Enjoy, Have FUN! H.Merijn
      What's causing that warning?

        use warnings; does, but it has nothing to do with CSV:

        # perl -wle'print ""+0'
        Argument "" isn't numeric in addition (+) at -e line 1.
        0
        #

        On a more important note, the original data revealed an error in Text::CSV_XS, which I just fixed in 0.31, which has already been sent to CPAN.


        Enjoy, Have FUN! H.Merijn
        Hmm ok, but I'm still interested in how to solve that warning. It's a tough sell to recommend code that gives warnings especially when you can't understand why.
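
        For what it's worth, here is a minimal sketch of two ways to keep that warning away, assuming you convert the values yourself rather than through types. The sample data and the warning collector are only there for illustration:

        ```perl
        use strict;
        use warnings;

        # Collect warnings so we can see whether any are raised.
        my @warned;
        local $SIG{__WARN__} = sub { push @warned, @_ };

        # Hypothetical field values, the first one missing.
        my @data = ('', '2.55', '3.77');

        # Option 1: replace empty strings before doing arithmetic on them.
        my @clean = map { $_ eq '' ? 0 : $_ } @data;
        my $sum = 0;
        $sum += $_ for @clean;

        # Option 2: switch off only the "numeric" warning category in a small scope.
        my $zero;
        {
            no warnings 'numeric';
            $zero = $data[0] + 0;
        }

        print "sum=$sum zero=$zero warnings=", scalar(@warned), "\n";
        ```

        Option 1 is usually the safer choice, because it leaves the warning active for values that are genuinely malformed.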
Re: CSV Files With Missing Values
by Argel (Prior) on Jul 23, 2007 at 20:05 UTC
    There is an actual specification for CSV files known as RFC 4180. That's why people are recommending you use something like Text::CSV_XS.
Re: CSV Files With Missing Values
by mjscott2702 (Pilgrim) on Jul 23, 2007 at 19:31 UTC
    Without getting into all the details of parsing CSV files, which others have already mentioned here, you could use:
    print $data[0] || "0";
    or
    print $data[0] =~ m/^\s*$/ ? "0" : $data[0];

    The first will work if the value of $data[0] is missing (undefined or an empty string; note that it replaces any false value), the second if it is empty or just some whitespace.
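
    A short sketch comparing the two expressions on a few sample values. The helper names by_or and by_regex are made up for this example:

    ```perl
    use strict;
    use warnings;

    # First form: any false value (undef, '', '0') becomes "0".
    sub by_or    { my ($v) = @_; return $v || "0" }

    # Second form: empty or whitespace-only becomes "0".
    # It expects a defined value; undef would trigger a warning here.
    sub by_regex { my ($v) = @_; return $v =~ m/^\s*$/ ? "0" : $v }

    for my $v ('', '  ', '2.55') {
        printf "[%s] -> or: [%s], regex: [%s]\n", $v, by_or($v), by_regex($v);
    }
    ```

    Only the regex form turns a whitespace-only field into "0"; the || form passes it through because a string of spaces is true in Perl.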