in reply to Re: Text::CSV_XS and blank lines
in thread Text::CSV_XS and blank lines

Would it help if I would add a feature that makes the number of fields parsed in the last line available?

Yes, it would indeed. With a field_count property, one could easily skip truly blank lines in the CSV file.

    CSV_RECORD:
    while (my $value_of = $csv->getline_hr($csv_fh)) {
        next CSV_RECORD if $csv->{field_count} == 1;
        # ...
    }

This is more elegant than, in the case of parse(), having to evaluate @values in a scalar context to determine the number of fields.

UPDATE: Erased the confused bit about evaluating keys %value_of in a scalar context, which wouldn't help.
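
For comparison, the array-based check already works with the existing API. Here is a minimal sketch, assuming the file is read with getline rather than getline_hr and that a blank line parses as a single empty (or undef) field. The sample data and the manual header handling are purely illustrative and not from the thread.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical sample data with one truly blank line in the middle.
my $data = "name,city\nAlice,Paris\n\nBob,Oslo\n";

my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

# Read the header row ourselves instead of relying on getline_hr.
my @header = @{ $csv->getline ($fh) };

my @records;
while (my $row = $csv->getline ($fh)) {
    # Assumption: a blank line comes back as exactly one empty/undef field.
    next if @$row == 1 && (!defined $row->[0] || $row->[0] eq '');

    # Build the hash by hand, as getline_hr would have done.
    my %value_of;
    @value_of{@header} = @$row;
    push @records, \%value_of;
}
close $fh;

printf "%d records kept\n", scalar @records;
```

Note that this check cannot tell a blank line apart from a genuine one-column record with an empty value, so it is only safe when the file has more than one column.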

Re^3: Text::CSV_XS and blank lines
by Tux (Canon) on Feb 04, 2011 at 18:51 UTC

    As the _hr variants are the odd ones out in the code, counting fields inside the parser was actually kinda awkward. And the only way to do it reliably, as far as I could see in my first try, was definitely not doing any good to the parsing speed of "regular" parses. All the other parse methods return an array or an array reference, which means that you can very easily check the length of the array to see how many fields were parsed.

    What I did instead, was this:

    is_missing

        my $missing = $csv->is_missing ($column_idx);

    Where $column_idx is the (zero-based) index of the column in the last result of "getline_hr".

        while (my $hr = $csv->getline_hr ($fh)) {
            $csv->is_missing (0) and next; # This was an empty line
        }

    When using "getline_hr" for parsing, it is impossible to tell if the fields are "undef" because they were not filled in the CSV stream or because they were not read at all, as all the fields defined by "column_names" are set in the hash-ref. If you still need to know if all fields in each row are provided, you should enable "keep_meta_info" so you can check the flags.

    Your constructor would then look somewhat like

        my $csv = Text::CSV_XS->new ({
            auto_diag      => 1,
            binary         => 1,
            keep_meta_info => 1,
        });

    Tell me if that would work for you ... (BTW feel free to pull from here)


    Enjoy, Have FUN! H.Merijn
      Tell me if that would work for you ...

      Well, it would work for me, but I'm not the OP, constantreader. He or she wrote, "Ideally, it would be nice if it returned an empty hashref." But I think I understand why this wouldn't work as it requires distinguishing between a CSV record with a single empty field in it and an unwanted blank line, something that Text::CSV_XS can't possibly do — at least not without using the IO::Telepathy module.

      Perhaps a better, more generally useful feature would be the ability to assert a specific number of expected fields, either explicitly or implicitly via the column_names method, and then to have some elegant, built-in error checking of the parsed CSV record against the assertion.

      After all, aren't most CSV files in the world essentially flat-file database tables with a constant number of fields in each record — else something is wrong?
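
Until something like that is built in, the field-count assertion suggested above can be approximated by hand. The following is only a sketch under stated assumptions: the header row defines the expected number of fields, the sample data is made up, and the @good/@bad bookkeeping is a hypothetical convention rather than anything Text::CSV_XS provides.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical sample data: the record "2,Bob" is one field short.
my $data = "id,name,city\n1,Alice,Paris\n2,Bob\n3,Carol,Rome\n";

my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

# The header row serves as the implicit assertion of the field count.
$csv->column_names ($csv->getline ($fh));
my @names    = $csv->column_names;
my $expected = @names;

my (@good, @bad);
my $record_no = 1;    # the header was record 1
while (my $row = $csv->getline ($fh)) {
    $record_no++;

    # getline returns an array ref, so the field count is simply its length.
    if (@$row != $expected) {
        push @bad, sprintf "record %d: got %d fields, expected %d",
            $record_no, scalar @$row, $expected;
        next;
    }

    my %value_of;
    @value_of{@names} = @$row;
    push @good, \%value_of;
}
close $fh;

print "$_\n" for @bad;
printf "%d good records\n", scalar @good;
```

The sketch reads with getline rather than getline_hr because the array reference exposes the number of fields actually parsed. A truly blank line would be rejected by the same check, since it yields fewer fields than the header.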