Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Opening bad Excel files

by Arik123 (Beadle)
on Feb 21, 2018 at 07:30 UTC ( [id://1209625]=perlquestion: print w/replies, xml ) Need Help??

Arik123 has asked for the wisdom of the Perl Monks concerning the following question:

I have an excel file which I need to extract data from. When I open it (with Spreadsheet::XLSX) it reports an empty file - no worksheets. This is not correct. Opening it with Excel itself shows that there is data.

The problem might be that the file wasn't created by Excel itself - it may have been created by another program, which pretends to create 100% compatible Excel files.

Is there any way to convert such files to valid Excel files, or is there another Perl module that can open and process them? I need a permanent solution, to process many such files automatically.

The file can be found here: https://ufile.io/f06gn

Thank you very much!

Replies are listed 'Best First'.
Re: Opening bad Excel files
by Tux (Canon) on Feb 21, 2018 at 10:51 UTC

    There is a lot of environmental settings missing in your question. As others already found: do not use Spreadsheet::XLSX, but Spreadsheet::ParseXLSX.

    When I fetch your file, I can verify the reason to change the parser

    use Spreadsheet::Read; my $ss = Spreadsheet::Read->new ("pm1209625.xlsx", debug => 3);

    $ perl pm1209625.pl Opening XLSX pm1209625.xlsx using Spreadsheet::ParseXLSX-0.27 4 sheets Sheet 2 'Cities' 215 x 1 Sheet 3 'Categories' 22 x 1 Sheet 4 'Items' 187 x 2 Sheet 5 'VehicleTypes' 14 x 1 $ env SPREADSHEET_READ_XLSX=Spreadsheet::XLSX perl pm1209625.pl Opening XLSX pm1209625.xlsx using Spreadsheet::XLSX-0.15 0 sheets

    A short digging revealed that the parser only accepts xml tags sheet, where your workbook has tags /x:workbook at that level (XLSX.pm line 53).

    It is IMHO a lost cause to file a ticket.


    Enjoy, Have FUN! H.Merijn
Re: Opening bad Excel files
by vr (Curate) on Feb 21, 2018 at 09:08 UTC
Re: Opening bad Excel files
by thanos1983 (Parson) on Feb 21, 2018 at 09:49 UTC

    Hello Arik123,

    I download your excel file 'aveidotvalues-3.xlsx' for testing purposes and this is what I found using Spreadsheet::XLSX:

    #!/usr/bin/perl use strict; use warnings; use Data::Dumper; use Text::Iconv; my $converter = Text::Iconv -> new ("utf-8", "windows-1251"); # Text::Iconv is not really required. # This can be any object with the convert method. Or nothing. use Spreadsheet::XLSX; my $excel = Spreadsheet::XLSX -> new ('aveidotvalues-3.xlsx', $convert +er); print Dumper $excel; __END__ $ perl test.pl $VAR1 = bless( { 'SheetCount' => 0, 'FmtClass' => bless( {}, 'Spreadsheet::XLSX::Fmt2007' + ), 'Flg1904' => 0, 'Worksheet' => [] }, 'Spreadsheet::XLSX' );

    I also test your excel file with Spreadsheet::Read and this is the output:

    #!/usr/bin/perl use strict; use warnings; use Data::Dumper; use Spreadsheet::Read qw(ReadData); my $book = ReadData ('aveidotvalues-3.xlsx'); print Dumper $book; __END__ $ perl test.pl $VAR1 = [ { 'parsers' => [ { 'version' => '0.15', 'type' => 'xlsx', 'parser' => 'Spreadsheet::XLSX' } ], 'error' => undef, 'sheet' => {}, 'version' => '0.15', 'sheets' => 0, 'type' => 'xlsx', 'parser' => 'Spreadsheet::XLSX' } ];

    I can not test any Windows spreadsheet modules because I am running a LinuxOS, but what I can see so far both modules they can not see your sheets. This is the reason that the data can not be populated. So what I would recommend, is either use a Perl module to produce your files or download Apache OpenOffice to create your spreadsheets. Remember openoffice is compatible with WindowsOS and also LinuxOS and so far all the files that I have populated it worked correctly with all the Perl modules.

    Hope this helps, BR.

    Seeking for Perl wisdom...on the process of learning...not there...yet!
Re: Opening bad Excel files
by Anonymous Monk on Feb 21, 2018 at 08:08 UTC

    Maybe you can use Win32::OLE module.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1209625]
Approved by Discipulus
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others avoiding work at the Monastery: (8)
As of 2024-04-18 09:39 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found