socrtwo has asked for the wisdom of the Perl Monks concerning the following question:

I am a newbie Perl user with only one successful useful Perl script so far. I am trying to create a GUI xlsx2csv converter. I am using the Spreadsheet::XLSX module and Ken Prows xls2csv scripts as one big script not as modules because I need to use a different unzipper. Frankly also I'm doing this because I don't know how to how to rewrite a module and install it and how once installed, to invoke the Spreadsheet::XLSX as a module.

Anyway I'm using CakeCMD as the unzipper because it tolerates partly corrupt xml files. I'm hoping that none of the modules in this attempt at a script below or any of the code, does not tolerate well formed XML. My real aim is to write a corrupt xlsx2txt converter, but if I can make one work for corrupt files it should work fine for noncorrupt ones right? And that might be useful for others...

Anyway I have simplified the script, commenting out modules stuff and am just trying to make it convert one xlsx to a csv at this point. However, I keep getting this message: "Tk::Error: Can't call method "contents" on an undefined value at line 226". This is the line where $member_workbook is called and it's contents are looked at to see if it follows the 1904 date calendar of Excel.

I got an answer from a Perl instructor who said I was defining the scalar within a local scope above and therefore when the bracket got closed the variable returned to nothing. Here's the code in question:

{ local $/=undef; open FILE, "xl/workbook.xml" or die "Couldn't open file: $!"; binmode FILE; my $member_workbook = <FILE>; close FILE; }
I really don't how to read a file into a variable, but in my previous script this code snippet that I use worked fine. Also above line 226 in the previous section, I do the same thing without an error, although I was told that what exists in the previous section is an if statement that might not reach my attempted to be defined variable

Can someone tell me what's going wrong and if there is a better way to read the physical xl/document.xml file into the $member_workbook scalar?

I also contacted the author of Dmitry Ovsyanko, the writer of Spreadsheet::XLSX, and he has been some help, but he may be too advanced for me, and English is his 3rd language so we are having some difficulty communicating. I may hire him to write the script, acting as a patron I guess.

Thanks in advance. Also if you can suggest a shorter way to code the script that would be great too. Unfortunately CakeCMD cannot write to STDIN so I have to create real temporary files consisting of the necessary extracted XML ones.

#!/usr/local/bin/perl # package Spreadsheet::XLSX; # use 5.006000; # use strict; # use warnings; # our @ISA = qw(); # our $VERSION = '0.1'; # use Archive::Zip; use Spreadsheet::XLSX::Fmt2007; use Data::Dumper; use Spreadsheet::ParseExcel; ###################################################################### +########## use Tk; # # Create the Main Window # my $mw = new MainWindow; # # Hides TK logo with my own logo # # my $icon = $mw->Photo(-file => 'xlsx.gif'); # $mw->iconimage($icon); # # Declare that there is a menu, create text # editor and create a vertical scroll bar # my $mbar = $mw -> Menu(); $mw -> configure(-menu => $mbar); # # Main Menu Choices Setup section # my $file = $mbar -> cascade(-label=>"File", -underline=>0, -tearoff => + 0); # my $help = $mbar -> cascade(-label =>"Help", -underline=>0, -tearoff + => 0); # # File Menu Choices section # $file -> checkbutton(-label =>"Open", -underline => 0, -command => [\&menuopenClicked, "Open"]); $file -> command(-label =>"Save", -underline => 0, -command => [\&menusavedClicked, "Save"]); $file -> separator(); $file -> command(-label =>"Exit", -underline => 1, -command => sub { exit } ); # # Help Menu Choices section # # $help -> command(-label =>"About", -command => sub { # $txt->delete('1.0','end'); # $txt->insert('end', # "About # ---------- # How to use this program: # 1. Click on the File Menu and choose Opem. # 2. Choose your xlsx Word 2007 file to which you # wish to convert to csv files. # # # This program is made by Paul D Pruitt (socrtwo) # It also uses Binny V A's Perl/Tk code for the GUI elements # from http://www.geocities.com/binnyva/code. # CakeCMD is by Leung Yat Chun Joseph. # http://www.quickzip.org/softwares-cakecmd # It requires Microsoft .NET Framework Version 2.0 # http://tinyurl.com/ms2-0-netframework # # My software website is # http://www.godskingsandheroes.info/software/. # Also visit my data recovery software list # http://www.s2services.com. # My E-Mail : socrtwo\@s2services"); }); # # Open Dialog Box File Extension Declaration section # # # # Main loop currently activated by selecting the file # MainLoop; sub menuopenClicked { my $typesopen = [ ['Excel 2007 files', '.xlsx'], ['All files', '*'],]; my $mainfilepath = $mw->getOpenFile(-filetypes => $typesop +en, -defaultextension => '.xlsx'); # return if undefined $mainfilepath; # # Old code from package Spreadsheet::XLSX; # sub new { # my ($class, $filename, $converter) = @_; # # my $self = {}; # # Delete old XML data found in the word folder section # use Win32::OLE; $dir = '<xl>'; if(-e $dir){ $Win32::OLE::Warn = 3; # ------ SCRIPT CONFIGURATION ------ $strFolderPath = '<xl>'; # e.g. "d:\temp" # ------ END CONFIGURATION --------- $objFSO = Win32::OLE->new('Scripting.FileSystemObject'); $objFSO->DeleteFolder($strFolderPath); } else { print "\n"; } # # xlsx file rename to zip section necessary for CakeCMD to unzip. # my $zipexcelfilepath = $mainfilepath . '.zip'; rename($mainfilepath,$zipexcelfilepath); # # unzipping here uses cakecmd: http://quickzipdev.googlegroups.com/web +/Cake_Command_Line_0_3.zip # Unzip docx/zip file section # Unzip xl/styles.xml # Unzip xl/workbook.xml # my $unzip = "cakecmd.exe"; open my $wfh, "| $unzip extract \"$zipexcelfilepath\" xl/sharedStrings +.xml \"\" " or die "Could not start $unzip: $!"; open $wfh, "| $unzip extract \"$zipexcelfilepath\" xl/styles.xml \"\" +" or die "Could not start $unzip: $!"; open $wfh, "| $unzip extract \"$zipexcelfilepath\" xl/workbook.xml \"\ +" " or die ("xl/workbook.xml not found in this zip\n"); # # Script sleep section to allow unzipping action to catch up with scri +pt # $num = 2; while($num--){ sleep(1); } close $zipexcelfilepath; # # Read in xl/sharedStrings.xml into $member_shared_strings # { local $/=undef; open FILE, "xl/sharedStrings.xml" or die "Couldn't open file: $!"; binmode FILE; my $member_shared_strings = <FILE>; close FILE; } # # Old code from package Spreadsheet::XLSX; # my $member_shared_strings = $self -> {zip} -> memberNamed ('xl/sh +aredStrings.xml'); # my @shared_strings = (); if ($member_shared_strings) { my $mstr = $member_shared_strings->contents; $mstr =~ s/<t\/>/<t><\/t>/gsm; # this handles an empty t tag +in the xml <t/> foreach my $si ($mstr =~ /<si.*?>(.*?)<\/si/gsm) { my $str; foreach my $t ($si =~ /<t.*?>(.*?)<\/t/gsm) { $t = $converter -> convert ($t) if $converter; $str .= $t; } push @shared_strings, $str; } } # # Read in xl/styles.xml into $member_styles # { local $/=undef; open FILE, "xl/styles.xml" or die "Couldn't open file: $!"; binmode FILE; my $member_styles = <FILE>; close FILE; } # my $member_styles = $self -> {zip} -> memberNamed ('xl/styles. +xml'); my @styles = (); my %style_info = (); if ($member_styles) { foreach my $t ($member_styles -> contents =~ /xf\ numF +mtId="([^"]*)"(?!.*\/cellStyleXfs)/gsm) { #" # $t = $converter -> convert ($t) if $converter +; push @styles, $t; } my $default = $1 || ''; foreach my $t1 (@styles){ $member_styles -> contents =~ /numFmtId="$t1" formatCode=" +([^"]*)/; my $formatCode = $1 || ''; if ($formatCode eq $default || not($formatCode)){ if ($t1 == 9 || $t1==10){ $formatCode="0.00000%";} elsif ($t1 == 14){ $formatCode="m-d-yy";} else { $formatCode=""; } } $style_info{$t1} = $formatCode; $default = $1 || ''; } } # # Old code from package Spreadsheet::XLSX; # my $member_workbook = $self -> {zip} -> memberNamed ('xl/workbook +.xml') or die ("xl/workbook.xml not found in this zip\n"); # # # Read in xl/styles.xml into $member_workbook # { local $/=undef; open FILE, "xl/workbook.xml" or die "Couldn't open file: $!"; binmode FILE; my $member_workbook = <FILE>; close FILE; } # my $oBook = Spreadsheet::ParseExcel::Workbook->new; $oBook->{SheetCount} = 0; $oBook->{FmtClass} = Spreadsheet::XLSX::Fmt2007->new; $oBook->{Flg1904}=0; if ($member_workbook -> contents =~ /date1904="1"/){ $oBook->{Flg1904}=1; } my @Worksheet = (); foreach ($member_workbook -> contents =~ /\<(.*?)\/?\>/g) { /^(\w+)\s+/; my ($tag, $other) = ($1, $'); my @pairs = split /\" /, $other; $tag eq 'sheet' or next; my $sheet = { MaxRow => 0, MaxCol => 0, MinRow => 1000000, MinCol => 1000000, }; foreach ($other =~ /(\S+=".*?")/gsm) { my ($k, $v) = split /=?"/; #" if ($k eq 'name') { $sheet -> {Name} = $v; $sheet -> {Name} = $converter -> convert ($sheet -> {N +ame}) if $converter; } elsif ($k eq 'r:id') { ($sheet -> {Id}) = $v =~ m{rId(\d+)}; }; } my $wsheet = Spreadsheet::ParseExcel::Worksheet->new(%$sheet); push @Worksheet, $wsheet; $oBook->{Worksheet}[$oBook->{SheetCount}] = $wsheet; $oBook->{SheetCount}+=1; } $self -> {Worksheet} = \@Worksheet; foreach my $sheet (@Worksheet) { my $member_name = "xl/worksheets/sheet$sheet->{Id}.xml"; # # Unzip each worksheets # open $wfh, "| $unzip extract \"$zipexcelfilepath\" $member_name \"\" " + or next; $num = 1; while($num--){ sleep(1); } close $zipexcelfilepath; # # Old code from package Spreadsheet::XLSX; # my $member_sheet = $self -> {zip} -> memberNamed ($member_nam +e) or next; # my ($row, $col); my $flag = 0; my $s = 0; my $s2 = 0; my $sty = 0; foreach ($member_sheet -> contents =~ /(\<.*?\/?\>|.*?(?=\<))/ +g) { if (/^\<c r=\"([A-Z])([A-Z]?)(\d+)\"/) { $col = ord ($1) - 65; if ($2) { $col++; $col *= 26; $col += (ord ($2) - 65); } $row = $3 - 1; $s = m/t=\"s\"/ ? 1 : 0; $s2 = m/t=\"str\"/ ? 1 : 0; $sty = m/s="([0-9]+)"/ ? $1 : 0; } elsif (/^<v/) { $flag = 1; } elsif (/^<\/v/) { $flag = 0; } elsif (length ($_) && $flag) { my $v = $s ? $shared_strings [$_] : $_; if ($v eq "</c>"){$v="";} my $type = "Text"; my $thisstyle = ""; if (not($s) && not($s2)){ $type="Numeric"; $thisstyle = $style_info{$styles[$sty]}; if ($thisstyle =~ /(?<!Re)d|m|y/){ $type="Date"; } } $sheet -> {MaxRow} = $row if $sheet -> {MaxRow} < $row +; $sheet -> {MaxCol} = $col if $sheet -> {MaxCol} < $col +; $sheet -> {MinRow} = $row if $sheet -> {MinRow} > $row +; $sheet -> {MinCol} = $col if $sheet -> {MinCol} > $col +; if ($v =~ /(.*)E\-(.*)/gsm && $type eq "Numeric"){ $v=$1/(10**$2); # this handles scientific notatio +n for very small numbers } my $cell =Spreadsheet::ParseExcel::Cell->new( Val => $v, Format => $thisstyle, Type => $type ); $cell->{_Value} = $oBook->{FmtClass}->ValFmt($cell, $o +Book); if ($type eq "Date" && $v<1){ #then this is Excel tim +e field $cell->{Type}="Text"; $cell->{Val}=$cell->{_Value}; } $sheet -> {Cells} [$row] [$col] = $cell; } } $sheet -> {MinRow} = 0 if $sheet -> {MinRow} > $sheet -> {MaxR +ow}; $sheet -> {MinCol} = 0 if $sheet -> {MinCol} > $sheet -> {MaxC +ol}; } foreach my $stys (keys %style_info){ } bless ($self, $class); return $oBook; } # # I don't know what this line does # 1; # # Start of Ken Prows' xls2csv script # # use strict; # use 5.006; # use Getopt::Std; use Locale::Recode; use Spreadsheet::ParseExcel; use Spreadsheet::ParseExcel::FmtUnicode; use Text::CSV_XS; # our $VERSION = '1.06'; # =head1 NAME # xls2csv - A script that recodes a spreadsheet's charset and saves as + CSV. # =head1 DESCRIPTION # This script will recode a spreadsheet into a different character set # and output the recoded data as a csv file. # The script came about after many headaches from dealing with Excel s +preadsheets # from clients that were being received in various character sets. # =head1 OPTIONS # -x : filename of the source spreadsheet # -b : the character set the source spreadsheet is in (before) # -c : the filename to save the generated csv file as # -a : the character set the csv file should be converted to ( +after) # -q : quiet mode # -s : print a list of supported character sets # -h : print help message # -v : get version information # -W : list worksheets in the spreadsheet specified by -x # -w : specify the worksheet name to convert (defaults to the +first worksheet) # =head1 EXAMPLE USAGE # The following example will convert a spreadsheet that is in the WIND +OWS-1252 character set (WinLatin1) # and save it as a csv file in the UTF-8 character set. # xls2csv -x "1252spreadsheet.xls" -b WINDOWS-1252 -c "ut8csvfile. +csv" -a UTF-8 # This example with convert the worksheet named "Users" in the given s +preadsheet. # xls2csv -x "multi_worksheet_spreadsheet.xls" -w "Users" -c "user +s.csv" # =head1 NOTES # The spreadsheet's charset (-b) will default to UTF-8 if not set. # If the csv's charset (-a) is not set, the CSV file will be created u +sing the same charset as the spreadsheet. # =head1 REQUIRED MODULES # This script requires the following modules: # Locale::Recode # Unicode::Map # Spreadsheet::ParseExcel # Spreadsheet::ParseExcel::FmtUnicode (should be included with Sp +readsheet::ParseExcel) # Text::CSV_XS # =head1 CAVEATS # It probably will not work work with spreadsheets that use formulas. # A line in the spreadsheet is assumed to be blank if there is nothing + in the first column. # Some users have reported problems trying to convert a spreadsheet wh +ile it was opened in a different application. # You should probably make sure that no other programs are working wit +h the spreadsheet while you are converting it. # =cut # $Getopt::Std::STANDARD_HELP_VERSION = 1; # my %O; # getopts('x:b:c:a:qshvWw:', \%O); # HELP_MESSAGE() if !%O or $O{'h'}; # VERSION_MESSAGE() if $O{'v'}; # if ($O{'s'}) # { # print "\nThe following character sets are supported:\n\n"; # my $Supported = Locale::Recode->getSupported; # foreach my $CharSet (sort @$Supported) # { # print "$CharSet\n"; # } # print "\n"; # exit; # @ } # my $SourceFilename = $oBook; # $O{'x'} || die "The filename of the spr +eadsheet (-x) is required."; # my $SourceCharset = $O{'b'}; $SourceCharset = 'UTF-8'; # unless $SourceCharset; # #unless ($O{'q'}) #{ print "Now reading \"$SourceFilename\" as $SourceCharset.\n"; #} my $XLS = new IO::File; $XLS->open("< $SourceFilename") || die "Cannot open spreadsheet: $!"; my $Formatter = Spreadsheet::ParseExcel::FmtUnicode->new(Unicode_Map = +> $SourceCharset); my $Book = Spreadsheet::ParseExcel::Workbook->Parse($XLS, $Formatter) +|| die "Can't read spreadsheet!"; # if ($O{'W'}) #{ # print "\nThe following " . ($Book->{SheetCount}>1 ? "$Book->{Shee +tCount} worksheets are" : "worksheet is") . " defined in the spreads +heet:\n\n"; #foreach my $Sheet (@{$Book->{Worksheet}}) # { # print "$Sheet->{Name}\n"; # } # print "\n"; # exit; # } my $DestFilename = $SourceFilename . 'csv'; # $O{'c'} || die "The file +name to save the csv file as (-c) is required."; my $DestCharset = 'UTF-8'; # $O{'a'}; # $DestCharset = $SourceCharset unless $DestCharset; # my $Sheet; # if ($O{'w'}) # { # $Sheet = $Book->Worksheet($O{'w'}); # die "Invalid worksheet" if !defined $Sheet; # unless ($O{'q'}) # { # print qq|Converting the "$Sheet->{Name}" worksheet.\n|; # } #} #else # my $Sheet = @{$Book->{Worksheet}}; # if #(!$O{'q'} && # ($Book->{SheetCount}>1) # { # print qq|Multiple worksheets found. Will convert the "$Sheet- +>{Name}" worksheet.\n|; # } open (CSV, "> $DestFilename") || die "Cannot create csv file: $!" ; binmode CSV; my $Csv = Text::CSV_XS->new({ 'quote_char' => '"', 'escape_char' => '"', 'sep_char' => ',', 'binary' => 1, }); my $Recoder; # if ($O{'a'}) { $Recoder = Locale::Recode->new(from=>$SourceCharset, to=>$DestChar +set); } my $Sheet; for ( my $Row = $Sheet->{MinRow} ; defined $Sheet->{MaxRow} && $Row <= + $Sheet->{MaxRow} ; $Row++ ) { my @Row; for ( my $Col = $Sheet->{MinCol} ; defined $Sheet->{MaxCol} && $Co +l <= $Sheet->{MaxCol} ; $Col++ ) { my $Cell = $Sheet->{Cells}[$Row][$Col]; my $Value = ""; if ($Cell) { $Value = $Cell->Value; if ($Value eq 'GENERAL') { # Sometimes numbers are read incorrectly as "GENERAL". # In this case, the correct value should be in ->{Val} +. $Value = $Cell->{Val}; } # if ($O{'a'}) #{ # $Recoder->recode($Value); #} } # We assume the line is blank if there is nothing in the first + column. last if $Col == $Sheet->{MinCol} and !$Value; push(@Row, $Value); } next unless @Row; my $Status = $Csv->combine(@Row); if (#!$O{'q'} and !defined $Status) { my $Error = $Csv->error_input(); warn "ERROR FOUND!: $Error"; } if (defined $Status) { my $Line = $Csv->string(); print CSV "$Line\n"; } } close CSV; $XLS->close; # unless ($O{'q'}) print "The spreadsheet has been converted to $DestCharset and saved as + \"$DestFilename\".\n"; # sub VERSION_MESSAGE # { # print << "EOF"; # This is xls2csv version $VERSION # Copyright (C) 2005 Ken Prows. All rights reserved. # This script is free software; you can redistribute it and\\or modify + it under the same terms as Perl itself. # For help, use "xls2csv -h" # EOF # exit; # } # #sub HELP_MESSAGE #{ # print << "EOF"; # xls2csv - Recode a spreadsheet's charset and save as CSV. # usage: xls2csv -x spreadsheet.xls [-w worksheet] [-b charset] [-c cs +vfile.csv] [-a charset] [-qshvW] # -x : filename of the source spreadsheet # -b : the character set the source spreadsheet is in (before) # -c : the filename to save the generated csv file as # -a : the character set the csv file should be converted to (after) # -q : quiet mode # -s : print a list of supported character sets # -h : this help message # -v : get version information # -W : list worksheets in the spreadsheet specified by -x # -w : specify the worksheet name to convert (defaults to the first w +orksheet) # example: xls2csv -x "spreadsheet.xls" -b WINDOWS-1252 -c "csvfile.cs +v" -a UTF-8 # More detailed help is in "perldoc xls2csv" # EOF # exit; #} # # =head1 AUTHOR # Ken Prows (perl@xev.net) # =head1 COPYRIGHT # Copyright (C) 2005 Ken Prows. All rights reserved. # This script is free software; you can redistribute it and/or modify +it under the same terms as Perl itself.

Replies are listed 'Best First'.
Re: Tk::Error: Can't Call Method "contents" on an Undefined Value at...
by GrandFather (Saint) on Apr 09, 2009 at 04:50 UTC

    At the point where you use $member_workbook it isn't declared. You declared a lexical $member_workbook in the block above, but it is local to that block. If you hadn't of turned off strictures you'd have picked the error up immediately. Never turn off strictures!

    You'd have also found the error if you'd tried to cut the code down to a small runable sample instead of posting a huge unrunable monstrosity with multiple OS specific dependencies.

    Update: $converter, $class and $self will give you trouble for similar reasons, but the fix is less clear. $member_styles suffers from the same issue as $member_workbook. There are a few more of the same sort.


    True laziness is hard work