Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I compress an XML file using some compression technique (e.g. bzip2) and check the size of the compressed data in memory. Then I write the compressed data to a file, and more data ends up in the file than I intended: the file is larger than the compressed data was in memory.

For example, the string in memory is 13k long. I write it to a file, and the file's size is reported as 20k. When I read the contents of that file back into memory and check the size, it shows 13k!

This doesn't happen with the original data that has not been compressed. Also, if I encode the compressed data, the problem doesn't show up; note that once the compressed data is encoded, it is no longer binary data. My guess is that this has something to do with binary data, since compression produces binary data.

I want to keep the compressed data as it is: when I write it to the file, it should be exactly the same as it is in memory. Maybe I am missing something very fundamental. Any pointers or solutions will really help. Thanks.

Replies are listed 'Best First'.
forget binmode?
by PodMaster (Abbot) on Jun 10, 2004 at 09:45 UTC
    Did you forget to binmode your filehandle? Or did you binmode it to the wrong thing? There are no mind readers here.

    update: How are you verifying the file size?
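A minimal sketch of what this reply is asking about: write the bytes through a filehandle that has had `binmode` applied, then compare `length` in memory with `-s` on disk. The helper name `write_raw` and the sample data are made up for illustration; the sample string merely stands in for real compressed output.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write a byte string to $path with no I/O layer
# translating it, then return the on-disk size in bytes.
sub write_raw {
    my ($path, $bytes) = @_;
    open(my $out, '>', $path) or die "Cannot open $path: $!";
    binmode($out);                 # raw bytes: no CRLF or encoding layer
    print {$out} $bytes or die "Cannot write $path: $!";
    close($out) or die "Cannot close $path: $!";
    return -s $path;               # logical file size (st_size)
}

# Stand-in for compressed output: every possible byte value, repeated.
my $data = join('', map { chr } 0 .. 255) x 50;

my (undef, $path) = tempfile(UNLINK => 1);
my $on_disk = write_raw($path, $data);

printf "in memory: %d bytes, on disk: %d bytes\n", length($data), $on_disk;
```

If the two numbers differ even with `binmode` in place, the difference is more likely to come from how the size is being measured than from how the data is being written.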


Re: binary data metamorphosis?
by ambrus (Abbot) on Jun 10, 2004 at 11:06 UTC
    ... its size is reported as 20k in the file. read the contents of this file into memory, and check the size, it shows 13k!

    Are you sure that 20k is really the file size, and not the space the file occupies on disk (st_blocks)? Note that du and ls -s report the st_blocks size. You want the real size of the file (st_size), as reported by ls -l or wc -c.
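The same distinction can be checked from Perl with `stat`. The sketch below is illustrative only: `file_sizes` is a made-up helper, the sample file is generated on the spot, and the 512-byte block unit is an assumption that holds on Linux but is not guaranteed everywhere.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return (logical size, bytes allocated on disk).
# Element 7 of stat() is st_size; element 12 is st_blocks, assumed here
# to be in 512-byte units. It is empty where the platform lacks it.
sub file_sizes {
    my ($path) = @_;
    my @st = stat($path) or die "Cannot stat $path: $!";
    my ($size, $blocks) = @st[7, 12];
    my $allocated = (defined $blocks && $blocks ne '') ? $blocks * 512 : undef;
    return ($size, $allocated);
}

# Sample file of 13,000 bytes, standing in for the compressed output.
my ($fh, $path) = tempfile(UNLINK => 1);
binmode($fh);
print {$fh} 'x' x 13_000;
close($fh) or die "Cannot close $path: $!";

my ($size, $allocated) = file_sizes($path);
printf "st_size:   %d bytes\n", $size;
printf "allocated: %d bytes\n", $allocated if defined $allocated;
```

The allocated figure depends on the filesystem's block size, so it is normally rounded up from the logical size and should not be compared against `length` in memory.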

binmode on linux
by Anonymous Monk on Jun 10, 2004 at 10:00 UTC
    I had thought about binmode, but when I read up on it I found that binmode is not required on Unix-type machines. Do you mean to say that I still have to use binmode even though I am running Perl on Linux? Maybe I should have tried binmode first anyway, despite what the books said. Well, I'll just check it now. Thanks.
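On Linux, binmode has no line-ending translation to switch off, but it does also clear any encoding layer on the handle. The sketch below assumes, without confirmation from the thread, that a `:utf8` layer is active on the output handle (set explicitly, or picked up from the environment by some Perl 5.8 setups). It applies the layer by hand to show the effect; `size_written_with` is a made-up helper.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $bytes to a temp file through the given
# I/O layer and return the resulting on-disk size in bytes.
sub size_written_with {
    my ($layer, $bytes) = @_;
    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode($fh, $layer);
    print {$fh} $bytes;
    close($fh) or die "Cannot close $path: $!";
    return -s $path;
}

# 12800 bytes; half of the byte values are 0x80 or above.
my $data = join('', map { chr } 0 .. 255) x 50;

my $raw  = size_written_with(':raw',  $data);   # bytes written unchanged
my $utf8 = size_written_with(':utf8', $data);   # bytes >= 0x80 become two bytes each

printf "in memory: %d, :raw on disk: %d, :utf8 on disk: %d\n",
    length($data), $raw, $utf8;
```

With this sample the `:utf8` file comes out at 19200 bytes against 12800 in memory. Reading it back through the same layer reverses the expansion, which would match a file that looks bigger on disk yet reads back at the original length. Plain text that is mostly ASCII would barely change, while compressed data is full of high bytes.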
Re: binary data metamorphosis?
by iburrell (Chaplain) on Jun 10, 2004 at 16:21 UTC
    Show some code. How are you writing the compressed data to the file? How are you determining the size of the data in memory and in the file?