dd-b has asked for the wisdom of the Perl Monks concerning the following question:
When I try to read the header line of a CSV file that I opened with a UTF-8 encoding layer (the file really does contain some non-ASCII characters, though not, I think, in the header line), I get this error:
    Strings with code points over 0xFF may not be mapped into in-memory file handles
    readline() on closed filehandle $h at /usr/lib/perl5/site_perl/5.22/i686-cygwin-threads-64int/Text/CSV_XS.pm line 830.
*I'm* not doing file IO on any strings, and the code line given is in Text::CSV not my code.
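For reference, the first half of that message is what Perl itself emits when something tries to open an in-memory file handle on a string containing a character above 0xFF. The following is only a minimal sketch of that warning in isolation, using core Perl and assuming a reasonably recent perl (5.18 or later); the variable names and the sample string are made up for illustration, and this is not a claim about what Text::CSV does internally.

```perl
use strict;
use warnings;

# Hypothetical sample data: a decoded string holding one character
# above 0xFF (U+263A), followed by some ordinary CSV-looking text.
my $text = "\x{263A}name,value\n";

# Collect warnings in an array so they can be inspected afterwards.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# In-memory file handles operate on bytes. A string with code points
# over 0xFF cannot be downgraded to bytes, so this open is expected
# to fail and to raise a warning in the "utf8" category.
my $opened = open my $h, "<", \$text;

print "open ", ($opened ? "succeeded" : "failed"), "\n";
print "warning: $_" for @warnings;
```

Any later readline on a handle whose open failed this way would then complain about a closed filehandle, which matches the second half of the message above.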
There's a "Unicode" section in the docs for Text::CSV and I think I did what it said. I verified that turning *off* the Unicode encoding for that file eliminates this error message. (Since there are actual non-ASCII characters in the file that must be read and comprehended later, that's not a long-term solution.)
Any ideas? The symptoms look like Unicode just doesn't work, but the Unicode section in the docs seems pretty clearly to be based on the assumption that it does, and it must be pretty commonly used.
There's not much to my code so far, just the start of it. It's the $csv->header($ifh) call that throws this error.
    #! /usr/bin/env perl
    # Read the export from Thumbs Plus including keywords from filename given.

    use warnings;
    use strict;
    use utf8;                         # so literals and identifiers can be in UTF-8
    use v5.12;                        # or later to get "unicode_strings" feature
    use warnings qw(FATAL utf8);      # fatalize encoding glitches
    #use open qw(:std :utf8);         # undeclared streams in UTF-8
    #use charnames qw(:full :short);  # unneeded in v5.16

    use Text::CSV;
    use Data::Dumper;                 # debug

    my $csv = Text::CSV->new ( { binary => 1 } )
        or die "Cannot use CSV in: ".Text::CSV->error_diag();

    print $ARGV[0],"\n";
    open my $ifh, "<:encoding(UTF-8)", $ARGV[0] or die "$ARGV[0]: $!";

    print "Point a\n";
    # Returns "the instance" -- of what? Do I care?
    my $thingie = $csv->header ($ifh);
    print "Point b\n";
    print Dumper($csv), "\n";
The first three lines (long lines) of the input file are:
    $ head -3 /cygdrive/p/Photos/ThumbsPlus/Thumbs.txt
    "Volume.label","Volume.serialno","Volume.vtype","Volume.netname","Volume.filesystem","Path.name",,"Thumbnail.checksum","Thumbnail.width","Thumbnail.height","Thumbnail.horiz_res","Thumbnail.vert_res","Thumbnail.colortype","Thumbnail.colordepth","Thumbnail.gamma","Thumbnail.thumbnail_width","Thumbnail.thumbnail_height","Thumbnail.thumbnail_type","Thumbnail.thumbnail_size","Thumbnail.name","Thumbnail.metric1","Keywords.pkeywords",
    PCD0138,,4037894171,5,\\ddb\r$,CDFS,PHOTO_CD\IMAGES,1,0,0,"1996-09-30T21:38:57","2002-10-12T00:29:25",3368960,2147483648,512,768,0,0,0,24,0,68,100,518,336,IMG0002.PCD,m000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000,b00000000000000000000000000000000,,0,";",
    PCD0138,,4037894171,5,\\ddb\r$,CDFS,PHOTO_CD\IMAGES,1,0,0,"1996-09-30T21:38:57","2002-10-12T00:29:25",3354624,2147483648,512,768,0,0,0,24,0,68,100,518,336,IMG0003.PCD,m000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000,b00000000000000000000000000000000,,0,";",
Re: Text::CSV on Unicode file
by Tux (Canon) on Jun 07, 2017 at 07:10 UTC
by dd-b (Pilgrim) on Jun 07, 2017 at 20:42 UTC
by dd-b (Pilgrim) on Jun 07, 2017 at 23:29 UTC
by Tux (Canon) on Jun 09, 2017 at 06:29 UTC
by dd-b (Pilgrim) on Jun 07, 2017 at 23:46 UTC
by Tux (Canon) on Jun 09, 2017 at 06:20 UTC

Re: Text::CSV on Unicode file
by dd-b (Pilgrim) on Jun 08, 2017 at 04:17 UTC
by Tux (Canon) on Jun 09, 2017 at 10:07 UTC
by dd-b (Pilgrim) on Jun 09, 2017 at 21:27 UTC