in reply to Re^2: Spreadsheet::WriteExcel large files (text versus binary format)
in thread Spreadsheet::WriteExcel large files

What is confusing for me is that both the 6 MB file and the 100 MB file are Microsoft Office documents that are not ASCII encoded... So why is one so much larger than the other?

My guess is that the smaller contains just the results, whereas the larger contains the formulae used to derive those results. But that is only a guess.

Also, they have the same number of lines

Using wc -l on binary files is not useful. It only tells you how many bytes with the value 10 decimal (LF, the newline character) the file contains. Those bytes are probably not line endings at all, but rather just bytes within packed binary values that happen to have the same value as a newline.
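To make that concrete, here is a minimal sketch in Python (chosen only for illustration; the same thing can be done with Perl's pack). It packs four integers into 16 bytes of binary data and counts the bytes equal to 0x0A, which is what wc -l counts. The sample values and the helper name count_newline_bytes are made up for this example.

```python
import struct


def count_newline_bytes(data: bytes) -> int:
    """Count bytes equal to 0x0A (LF); this is the count `wc -l` reports."""
    return data.count(b"\n")


# Four 32-bit little-endian integers (illustrative values, not from any real file).
# 10   -> 0A 00 00 00  (one LF byte)
# 266  -> 0A 01 00 00  (one LF byte)
# 2570 -> 0A 0A 00 00  (two LF bytes)
# 1    -> 01 00 00 00  (no LF byte)
values = [10, 266, 2570, 1]
packed = struct.pack("<4I", *values)

newline_count = count_newline_bytes(packed)
print(f"{len(packed)} bytes of binary data, {newline_count} 'lines'")
```

The data contains no text at all, yet a line count would report four "lines". Two binary files having the same count therefore says little about whether they hold the same content.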

I would have thought your simplest option would be to open each of the files using Excel (or other program that can read .xls files) and inspect what they each contain.
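If Excel is not to hand, a cheap first check is whether each file really is a binary .xls at all. Binary .xls files are stored in an OLE2 compound document container, which begins with a fixed 8-byte signature. The sketch below (Python, for illustration only; the function names are hypothetical) tests for that signature. It says nothing about what is inside the container, only whether the file is the binary format or something else, such as text, saved with an .xls extension.

```python
# Signature found at the start of every OLE2 compound document.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole2(header: bytes) -> bool:
    """Return True if the given leading bytes carry the OLE2 signature."""
    return header.startswith(OLE2_MAGIC)


def file_looks_like_ole2(path: str) -> bool:
    """Read the first 8 bytes of the file at `path` and test them."""
    with open(path, "rb") as fh:
        return looks_like_ole2(fh.read(len(OLE2_MAGIC)))
```

Calling file_looks_like_ole2 on each of the two files would show whether both are genuinely in the binary format before comparing them any further.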


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Re^4: Spreadsheet::WriteExcel large files (text versus binary format)
by mrguy123 (Hermit) on Jan 02, 2012 at 13:08 UTC
    They both contain the same data. The smaller file was created by opening the large file and saving it again (hence my confusion).
    It makes sense that the formulas might have been lost in the process, but it is still surprising that the size difference is so huge.

      .xls files can contain all sorts of stuff. In addition to the formulae and values, they can also contain whole libraries of macro code, lookup tables, formatting instructions, etc. I think they can also contain embedded images and graphs, though I'm not sure about that. They are also known to contain all sorts of other crap, some of which can have security implications.

      As you are creating the smaller file by only copying over the values of a range of cells, all that other stuff will not exist in the file created.



        Thanks for your answers!