I've gotten the ball to the 4 yard line on a problem I've been working on wrt to the Spreadsheet::XLSX module. Looking for help banging this problem into the end zone. I don't know enough about XLSX parsing to properly fix this.

The problem:

I'm parsing a .xlsx file generated by quickbooks. The spreadsheet has some custom format codes in it. Here's the appropriate xml from the styles.xml file embedded in the xlsx archive:

<numFmt numFmtId="165" formatCode="#,##0.00\ _€"/><numFmt numFmtId="16 +6" formatCode="&quot;$&quot;* #,##0.00\ _€"/>

When extracting data from the cell with this custom format code, the value looked like this:

'20.00  €'

Note the euro symbol tacked on to the end. This symbol does not show up when viewing the spreadsheet.

The hack:

To get this working the way I wanted, I hacked the Spreadsheet::XLSX module. I made two changes. One to the __load_styles function:

sub __load_styles { my ($zip) = @_; my $member_styles = $zip->memberNamed('xl/styles.xml'); my @styles = (); my %style_info = (); if ($member_styles) { my $formatter = Spreadsheet::XLSX::Fmt2007->new(); foreach my $t ($member_styles->contents =~ /xf\ numFmtId="([^" +]*)"(?!.*\/cellStyleXfs)/gsm) { #" push @styles, $t; } my $default = $1 || ''; foreach my $t1 (@styles) { $member_styles->contents =~ /numFmtId="$t1" formatCode="([ +^"]*)/; my $formatCode = $1 || ''; if ($formatCode eq $default || not($formatCode)) { if ($t1 == 9 || $t1 == 10) { $formatCode = '0.00000%'; } elsif ($t1 == 14) { $formatCode = 'yyyy-mm-dd'; } else { $formatCode = ''; } } $formatCode = $formatter->FmtStringDef($t1); $style_info{$t1} = $formatCode; $default = $1 || ''; } } return (\@styles, \%style_info); }

The other change I made was to the Spreadsheet::XLSX::Fmt2007.pm module where I added in the hex values for "165" and "166" along with the format I wanted without the euro sign:

0xA5 => '#,##0.00', 0xA6 => '&quot;$&quot;* #,##0.00',

Obviously this is a bad hack. What is the proper way to fix these modules so I can have custom fomatters that work as intended when parsing an xlsx file?

$PM = "Perl Monk's";
$MCF = "Most Clueless Friar Abbot Bishop Pontiff Deacon Curate Priest Vicar";
$nysus = $PM . ' ' . $MCF;
Click here if you love Perl Monks


In reply to Looking for assistance for proper fix for Spreadsheet::XLSX bug(?) by nysus

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.