in reply to Re: dbi::CSV works with text, but not numbers
in thread dbi::CSV works with text, but not numbers

col_names skip_first_row By default DBD::CSV assumes that column names are stored in + the first row of the CSV file and sanitizes them (see "raw_head +er" below). If this is not the case, you can supply an array re +f of table names with the col_names attribute. In that case the attribute skip_first_row will be set to FALSE. If you supply an empty array ref, the driver will read the +first row for you, count the number of columns and create column +names like "col0", "col1", ... Note that column names that match reserved SQL words will c +ause unwanted and sometimes confusing errors. If your CSV has he +aders that match reserved words, you will require these two attri +butes. ... raw_header Due to the SQL standard, field names cannot contain special characters like a dot (".") or a space (" ") unless the col +umn names are quoted. Following the approach of mdb_tools, all + these tokens are translated to an underscore ("_") when reading t +he first line of the CSV file, so all field names are 'sanitized'. I +f you do not want this to happen, set "raw_header" to a true value a +nd the entries in the first line of the CSV data will be used verb +atim for column headers and field names. DBD::CSV cannot guarantee +that any part in the toolchain will work if field names have those characters, and the chances are high that the SQL statement +s will fail.

I'll think of a more clear way to document what this sanatation actually means.


Enjoy, Have FUN! H.Merijn

Replies are listed 'Best First'.
Re^3: dbi::CSV works with text, but not numbers
by Sandy_Bio_Perl (Beadle) on May 20, 2016 at 10:11 UTC

    Thank you for your help. Very much appreciated

      Digging a bit deeper, just to satisfy my own curiosity (as I could not remember lower-casing the headers), here's the cause of your column name folding:

      From the SQL::Statement manual pages:

      · Wildcards are expanded to lower cased identifiers. This mig +ht confuse some people, but it was easier to implement. The warning in DBI to never trust the case of returned colu +mn names should be read more often. If you need to rely on identifie +rs, always use "sth->{NAME_lc}" or "sth->{NAME_uc}" - never rel +y on "sth->{NAME}": $dbh->{FetchHashKeyName} = "NAME_lc"; $sth = $dbh->prepare ("SELECT FOO, BAR, ID, NAME, BAZ FRO +M TABLE"); $sth->execute; $hash_ref = $sth->fetchall_hashref ("id"); print "Name for id 42 is $hash_ref->{42}->{name}\n"; See "FetchHashKeyName" in DBI for more information.

      As select * from selects on a wildcard, this situation comes into effect. I'll add an explicit note to the docs of DBD::CSV.

      Note in case of DBD::CSV that you can check what the driver's view on reality is by inspecting $dbh->{csv_tables}{tablename}{col_names}. My test showed me that this still holds the original folding.

      Personally, I always use NAME_lc in all my database interfaces, so I never hit this problem.


      Enjoy, Have FUN! H.Merijn