JackVanRamars has asked for the wisdom of the Perl Monks concerning the following question:

Hey,

In this country's regional settings, a semicolon is used instead of a comma as the CSV separator, so I need to replace the default comma separator with a semicolon. I use the Spreadsheet::Read module.

#!/usr/bin/perl
use strict;
use Pod::Usage;
use Getopt::Std;
use Spreadsheet::Read;
use Spreadsheet::WriteExcel;

my $ref = ReadData ("test.csv");

While this works for any XLS file, I have problems with CSV files: the $ref variable remains empty. I even added use Text::CSV_XS (or use Text::CSV_PP), but $ref remained empty.

Then I modified the last line:

my $ref = ReadData ("test.csv", sep => ';');

or

my $ref = ReadData ("test.csv", sep => ';',quote => '"');

Before the last line even runs, the debugger reports the following error (for the line with either one or both extra arguments):

Too many arguments for Spreadsheet::Read::ReadData

The CSV parsing module should, in theory, detect the separator, but it doesn't. I'm out of ideas.

Thanks in advance for your help!

Replies are listed 'Best First'.
Re: Parsing CSV table
by Tux (Canon) on Sep 06, 2010 at 13:31 UTC

    That message indicates that you are using a very old Spreadsheet::Read, as ReadData () has accepted (multiple) options since at least 2007 (that is how far my git repo goes back).

    $ cat test.csv
    foo;bar;dog
    1;2;Bull
    $ perl -MData::Peek -MSpreadsheet::Read -wle'my $ref = ReadData ("test.csv", sep => ";",quote => q{"});DDumper $ref'
    [
        {
            parser  => 'Text::CSV_XS',
            quote   => '"',
            sepchar => ';',
            sheet   => {
                'test.csv' => 1
            },
            sheets  => 1,
            type    => 'csv',
            version => '0.73'
        },
        {
            A1     => 'foo',
            A2     => 1,
            B1     => 'bar',
            B2     => 2,
            C1     => 'dog',
            C2     => 'Bull',
            attr   => [],
            cell   => [
                [],
                [ undef, 'foo', 1 ],
                [ undef, 'bar', 2 ],
                [ undef, 'dog', 'Bull' ]
            ],
            label  => 'test.csv',
            maxcol => 3,
            maxrow => 2
        }
    ]
    $

    Enjoy, Have FUN! H.Merijn
Re: Parsing CSV table
by roboticus (Chancellor) on Sep 06, 2010 at 13:16 UTC

    JackVanRamars:

    I'm not familiar with the Spreadsheet::Read module, but it appears that you're using it correctly. Hopefully another monk will be able to help you with that if you need to use Spreadsheet::Read. When I saw your node title, though, I immediately thought of Text::CSV. Have you tried it? It may simplify things for you.

    ...roboticus

    Update: I just tried installing Spreadsheet::Read to take a look at it, and discovered that Text::CSV is one of the requirements, so you should already have it installed on your system. If I figure out the problem with Spreadsheet::Read setting up Text::CSV, I'll start a new reply node for you. In the meantime, you may want to post a couple of lines of test data for monks to test their suggestions against.
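
    A minimal sketch of the Text::CSV route, assuming Text::CSV is installed and using the two sample lines of the test.csv shown in this thread. It reads from an in-memory handle so that no file is needed; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Text::CSV;   # uses Text::CSV_XS if available, else Text::CSV_PP

# Sample data: the two lines of test.csv shown in this thread.
my $data = "foo;bar;dog\n1;2;Bull\n";

# sep_char sets the field separator; binary => 1 allows non-ASCII data.
my $csv = Text::CSV->new ({ sep_char => ';', binary => 1, auto_diag => 1 });

# Read from an in-memory handle so the sketch needs no file on disk.
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
my @rows;
while (my $row = $csv->getline ($fh)) {
    push @rows, $row;    # each row is an array reference of fields
}
close $fh;

print join (' | ', @$_), "\n" for @rows;
```

    For a real file, open it by name instead of opening a reference to $data.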

Re: Parsing CSV table
by Khen1950fx (Canon) on Sep 07, 2010 at 04:08 UTC
    I don't use Windows, but "I'm Feeling Lucky" should work for you.
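    #!/usr/bin/perl
    use strict;
    use warnings;
    use Pod::Usage;
    use Getopt::Std;
    use Spreadsheet::Read;
    use Spreadsheet::WriteExcel;
    use Text::CSV::Separator qw(get_separator);

    # Detect the separator first; path must point at the CSV file itself.
    my $separator = get_separator( path => 'test.csv', lucky => 1, exclude => [',', ':', '|'] );

    # Passing options needs a Spreadsheet::Read recent enough to accept them.
    my $ref = ReadData ("test.csv", sep => $separator);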
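
    If installing another module is not an option, separator detection can be approximated by hand. The sketch below is a naive heuristic and is not the algorithm Text::CSV::Separator uses; guess_separator is a hypothetical helper, and the default candidate list is an assumption.

```perl
use strict;
use warnings;

# Naive guess: count each candidate in one line, ignoring anything inside
# double quotes, and return the most frequent one (undef if none occurs).
sub guess_separator {
    my ($line, @candidates) = @_;
    @candidates = (';', ',', "\t", '|') unless @candidates;
    (my $bare = $line) =~ s/"[^"]*"//g;        # drop quoted fields
    my ($best, $best_count) = (undef, 0);
    for my $sep (@candidates) {
        my $count = () = $bare =~ /\Q$sep\E/g; # number of occurrences
        ($best, $best_count) = ($sep, $count) if $count > $best_count;
    }
    return $best;
}

print guess_separator ('foo;bar;dog'), "\n";
```

    Looking at only one line is fragile; checking that several lines agree on the same count would make the guess more robust.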
Re: Parsing CSV table
by JackVanRamars (Novice) on Sep 09, 2010 at 04:46 UTC

    >That message indicates that you are using a very old Spreadsheet::Read, as ReadData () accepts (multiple) options since at least 2007 (that is how far my git repo goes back).

    I started using the Spreadsheet modules 3 or 4 months ago, so they can't be old. I even ran an upgrade of Spreadsheet-Read within ppm, and it returned: Spreadsheet-Read 0.03: up to date.

    Since the module refuses to accept the sep parameter (semicolon as separator), I decided to write something myself. First I used Data::Dumper to check which characters I was actually reading: it reported that the characters mentioned below don't map to CP1250. I open the CSV file and use regexes to perform replacements: every non-CP1250 character is replaced with the corresponding Unicode character. When I write the data into an XLS file using Spreadsheet::WriteExcel, the national characters are written correctly.

    ...
    open($fh, '<', "Table.csv") or die "Cannot open Table.csv: $!";
    @Table = <$fh>;
    close($fh);
    foreach $Table (@Table) {
        $Table =~ s/\x{00c8}/\x{010c}/g;   # capital C with caron
        $Table =~ s/\x{008a}/\x{0160}/g;   # capital S with caron
        $Table =~ s/\x{008e}/\x{017d}/g;   # capital Z with caron
        $Table =~ s/\x{00e8}/\x{010d}/g;   # small c with caron
        $Table =~ s/\x{009a}/\x{0161}/g;   # small s with caron
        $Table =~ s/\x{009e}/\x{017e}/g;   # small z with caron
        # follows the code parsing semicolons
    }
    # writing data into XLS file
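
    The per-character substitutions can probably be replaced by decoding the input as CP1250 with the core Encode module. This is a minimal sketch, assuming the file really is CP1250-encoded; the sample bytes are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Bytes as they would arrive from a CP1250-encoded file: 0xC8 0x8A 0x8E
# 0xE8 0x9A 0x9E are the six caron letters in upper and lower case.
my $bytes = "\xC8\x8A\x8E\xE8\x9A\x9E;abc\n";

# decode turns the CP1250 octets into Perl's internal Unicode characters.
my $text = decode ('cp1250', $bytes);

# Show the resulting code points of the six letters.
printf "U+%04X\n", ord $_ for split //, substr ($text, 0, 6);
```

    When reading from a file, opening it with the layer '<:encoding(cp1250)' performs the same decoding on every line read.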

      Spreadsheet::Read version 0.03 is from 19 May 2005! The current version is 0.40. It's not my fault that your source for modules doesn't keep them up to date.

      Newer versions of ActivePerl now also support the use of cpan, so you could try to see if the most recent version will install.
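
      To check which version is actually being loaded, something like the following sketch may help. installed_version is a hypothetical helper, not part of Spreadsheet::Read; it relies only on the VERSION method that every Perl package inherits.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the installed version of a module, or undef
# if the module cannot be loaded (or declares no version).
sub installed_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Spreadsheet::Read -> Spreadsheet/Read.pm
    return eval { require $file; $module->VERSION };
}

my $version = installed_version ('Spreadsheet::Read');
print defined $version
    ? "Spreadsheet::Read $version is installed\n"
    : "Spreadsheet::Read could not be loaded\n";
```

      If this still reports 0.03 after an upgrade, the module source has not delivered a newer release.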


      Enjoy, Have FUN! H.Merijn