All feedback weighed, I have now committed and pushed:


header

This method does NOT work in perl-5.6.x

Parse the CSV header and set sep_char and encoding.

 my @hdr = $csv->header ($fh)->column_names;
 $csv->header ($fh, [ ";", ",", "|", "\t" ]);
 $csv->header ($fh, { bom => 1, fold => "lc" });
 $csv->header ($fh, [ ",", ";" ], { bom => 1, fold => "lc" });

The first argument should be a file handle.

Assuming that the file opened for parsing has a header, and that the header does not contain problematic characters like embedded newlines, this method reads the first line from the open handle and detects which separator from the allowed list the header uses between column names. That list defaults to [ ";", "," ] and can be overridden with an optional argument: an anonymous list of allowed separator sequences. If exactly one of the allowed separators occurs in the header, sep_char is set to that sequence for the current CSV_XS instance and used to parse the first line. The resulting fields are folded to lowercase and used to set the instance's column_names, and the instance is returned:

 my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });
 open my $fh, "<:encoding(iso-8859-1)", "file.csv";
 $csv->header ($fh);
 while (my $row = $csv->getline_hr ($fh)) {
     ...
 }

If the header is empty, contains more than one unique separator out of the allowed set, contains empty fields, or contains identical fields (after folding), it will croak with error 1010, 1011, 1012, or 1013 respectively.
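To make the detection rule concrete, here is a minimal pure-Perl sketch of it. The helper name detect_sep is hypothetical and is not part of Text::CSV_XS. The sketch ignores quoting, and it returns undef when no allowed separator occurs, which is an assumption since the text above does not say what happens in that case. The error numbers are the ones listed above.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Text::CSV_XS: find the one separator
# from the allowed set that occurs in a header line.
sub detect_sep {
    my ($line, $allowed) = @_;
    $allowed ||= [ ";", "," ];        # default set as described above

    $line =~ s/\r?\n\z//;             # strip the line ending
    length $line or die "1010: header is empty\n";

    # Keep only the allowed separators that occur in the line.
    # Naive: a separator inside a quoted field is counted too.
    my @found = grep { index ($line, $_) >= 0 } @$allowed;
    @found > 1 and die "1011: more than one allowed separator in header\n";

    # Assumption: undef when none of the allowed separators occurs
    return $found[0];
}

my $sep = detect_sep ("id|name|price\n", [ ";", ",", "|", "\t" ]);
print "$sep\n";    # prints "|"
```

A header like "a;b,c" dies with the 1011 message here, because both default separators occur in it.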

On success, this method returns the instance. On failure, it returns undef if it did not croak.

Options

bom
 $csv->header ($fh, { bom => 1 });

The default behavior is to detect if the header line starts with a BOM. If the header has a BOM, use that to set the encoding of $fh. This default behavior can be disabled by passing a false value to the bom option.

Supported encodings from BOM are: UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE, UTF-1, UTF-EBCDIC, SCSU, BOCU-1, and GB-18030. UTF-7 is not supported.

This is a work in progress. Currently only UTF-8 works as expected.
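As an illustration of what BOM detection involves, the sketch below matches the leading bytes of a header against the well-known BOM byte sequences for the encodings listed above. The helper name strip_bom is hypothetical. This is not the module's implementation, and it only identifies the encoding; it does not change the layers on a file handle.

```perl
use strict;
use warnings;

# BOM byte sequences, longest first so that UTF-32LE (ff fe 00 00)
# is tried before UTF-16LE (ff fe).
my @bom = (
    [ "UTF-32BE"   => "\x00\x00\xfe\xff" ],
    [ "UTF-32LE"   => "\xff\xfe\x00\x00" ],
    [ "UTF-EBCDIC" => "\xdd\x73\x66\x73" ],
    [ "GB-18030"   => "\x84\x31\x95\x33" ],
    [ "UTF-8"      => "\xef\xbb\xbf"     ],
    [ "UTF-1"      => "\xf7\x64\x4c"     ],
    [ "SCSU"       => "\x0e\xfe\xff"     ],
    [ "BOCU-1"     => "\xfb\xee\x28"     ],
    [ "UTF-16BE"   => "\xfe\xff"         ],
    [ "UTF-16LE"   => "\xff\xfe"         ],
);

# Hypothetical helper: returns (encoding name, bytes with the BOM removed),
# or (undef, original bytes) when no BOM is found.
sub strip_bom {
    my ($bytes) = @_;
    for my $entry (@bom) {
        my ($enc, $mark) = @$entry;
        if (substr ($bytes, 0, length $mark) eq $mark) {
            return ($enc, substr ($bytes, length $mark));
        }
    }
    return (undef, $bytes);
}

my ($enc, $rest) = strip_bom ("\xef\xbb\xbfid;name\n");
print "$enc\n";    # prints "UTF-8"
```

The table order matters: a UTF-32LE BOM starts with the same two bytes as a UTF-16LE BOM, so the longer sequence has to be checked first.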

fold
 $csv->header ($fh, { fold => "lc" });

The default is to fold the header to lower case. You can also choose to fold the headers to upper case with { fold => "uc" } or to leave the fields as-is with { fold => "none" }.
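The effect of folding, and why identical fields are checked after folding, can be sketched in plain Perl. The helper fold_names below is hypothetical and only mirrors the behavior described here; the error numbers are the ones mentioned earlier for empty and identical fields.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Text::CSV_XS: fold column names and
# reject empty or duplicate names.
sub fold_names {
    my ($fold, @names) = @_;
    $fold = "lc" unless defined $fold;    # default: fold to lower case

    # "lc" and "uc" fold; anything else (e.g. "none") leaves names as-is
    @names = map {
        $fold eq "lc" ? lc ($_) : $fold eq "uc" ? uc ($_) : $_
    } @names;

    my %seen;
    for my $name (@names) {
        length $name    or  die "1012: header contains an empty field\n";
        $seen{$name}++  and die "1013: header contains identical fields\n";
    }
    return @names;
}

my @cols = fold_names ("lc", "ID", "Name", "Price");
print "@cols\n";    # prints "id name price"
```

With fold set to "lc", a header containing both "Name" and "name" fails with the 1013 message, while the same header passes with fold set to "none".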

columns
 $csv->header ($fh, { columns => 1 });

The default is to set the instance's column names using column_names if the method is successful, so subsequent calls to getline_hr can return a hash. Setting the column names can be disabled by passing a false value for this option, like { columns => 0 }.


Enjoy, Have FUN! H.Merijn

In reply to Re: CSV headers. Feedback wanted by Tux
in thread CSV headers. Feedback wanted by Tux
