Re: File splitting script
by sgifford (Prior) on Jan 25, 2007 at 17:33 UTC
|
It's possible for read to return fewer bytes than you asked for, especially when reading from a pipe, and almost certainly at the end of the file. That is one possible cause; try checking how many bytes were actually read, and subtracting that number from $fsize instead. You should also check for errors in your read, print, and close calls; it's good practice, and it's possible that one of them is failing and causing the problem.
It should be straightforward to troubleshoot this by printing out the values of the various counters, and seeing where things go wrong.
Also, what OS are you on, and where did your Perl come from?
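A minimal sketch of the checks suggested above, assuming the caller has already opened both filehandles. The helper name copy_bytes is hypothetical. read returns undef on error, 0 at end of file, and otherwise the number of bytes it actually read:

```perl
use strict;
use warnings;

# Hypothetical helper: copy up to $want bytes from $in to $out and
# return how many bytes were actually copied.
sub copy_bytes {
    my ($in, $out, $want) = @_;
    my $got = read($in, my $buf, $want);
    die "read failed: $!" unless defined $got;   # undef signals an error
    return 0 if $got == 0;                       # 0 signals end of file
    print {$out} $buf or die "print failed: $!";
    return $got;                                 # may be less than $want
}
```

The caller would then subtract the return value rather than the requested size, for example $fsize -= copy_bytes($in, $out, $size), and check close($out) for errors as well.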
Re: File splitting script
by ambrus (Abbot) on Jan 26, 2007 at 10:56 UTC
|
It's probably unrelated to the error, but you should binmode the output file as well if you binmode the input file.
Also, for simply splitting a file into chunks, it might be easier to use the split program from coreutils.
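A small sketch of applying binmode to both sides, assuming plain file paths. The helper name open_binary_pair is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: open a source and a destination file and put
# both handles into binary mode, so neither side translates line endings.
sub open_binary_pair {
    my ($src, $dst) = @_;
    open(my $in,  '<', $src) or die "Cannot open $src: $!";
    open(my $out, '>', $dst) or die "Cannot open $dst: $!";
    binmode($in)  or die "binmode on $src failed: $!";
    binmode($out) or die "binmode on $dst failed: $!";
    return ($in, $out);
}
```

Opening with the '<:raw' and '>:raw' layers, as in BrowserUk's example further down, has the same effect in one step.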
Re: File splitting script
by BrowserUk (Patriarch) on Jan 26, 2007 at 11:20 UTC
|
I think you found a bug in perl. The length parameter is being treated as an unsigned integer (UV), but is being stored as a signed integer (IV). (Or is it the other way around?). In any case, any attempt to specify a length of greater than 2**31-1 results in the "Negative length" error. (Tested on AS811 and AS817 under XP):
[0] Perl> open I, '<:raw', '32GB.dat.bz2';;
[0] Perl> read( I, $c, 2147483648 );;
[Negative length at (eval 4) line 1, <STDIN> line 2.
] Perl> read( I, $c, 2147483647 );;
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re: File splitting script
by ferreira (Chaplain) on Jan 25, 2007 at 17:31 UTC
|
I haven't looked deeper, but some comments may help you find the offending code:
- It looks like you're reading an entire chunk of size $fsize into memory before writing it to disk. Consider reading and writing through a buffer instead: for example, turn a 1GB file into 100MB files while reading and writing in 10MB chunks. That matters most when your files are larger than your RAM.
- Since you're using read, the write function would be more appropriate as a counterpart than print.
- Since you're using read, you can use its return value, the number of bytes read, to count how much has actually been read so far.
- With such a scheme, reading blocks until you fill a part and carrying on until the original file is exhausted, you don't even need to query the file size.
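The buffered scheme in the list above could be sketched roughly as follows. This is not the original poster's script: the helper name split_file and the part naming ("$src.000", "$src.001", ...) are assumptions made for illustration, and $part_size is assumed to be greater than zero.

```perl
use strict;
use warnings;

# Hypothetical helper: split $src into parts of at most $part_size bytes,
# moving the data through a buffer of at most $buf_size bytes.
sub split_file {
    my ($src, $part_size, $buf_size) = @_;
    open(my $in, '<:raw', $src) or die "Cannot open $src: $!";
    my @parts;
    my ($out, $room) = (undef, 0);       # $room: bytes left in the current part
    while (1) {
        $room = $part_size unless $out;  # a fresh part has all its room left
        # Never request more than the buffer size or the room left in the part.
        my $want = $room < $buf_size ? $room : $buf_size;
        my $got  = read($in, my $buf, $want);
        die "read failed: $!" unless defined $got;
        last if $got == 0;               # end of input, no file size needed
        unless ($out) {                  # open the next part only when there is data
            my $name = sprintf('%s.%03d', $src, scalar @parts);
            open($out, '>:raw', $name) or die "Cannot open $name: $!";
            push @parts, $name;
        }
        print {$out} $buf or die "print failed: $!";
        $room -= $got;                   # count what was actually read
        if ($room == 0) {                # part is full: close it
            close($out) or die "close failed: $!";
            undef $out;
        }
    }
    if ($out) { close($out) or die "close failed: $!" }
    close($in) or die "close failed: $!";
    return @parts;
}
```

Something like split_file('big.dat', 100 * 1024 * 1024, 10 * 1024 * 1024) would then produce 100MB parts while holding at most 10MB in memory at a time; the file name is a placeholder.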
And the error you're getting should be related to this statement:
$fsize-=$size;
that will produce a negative size eventually if $fsize is not a multiple of $size.
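To make the arithmetic concrete, assume the loop requests $size bytes each time and subtracts $size afterwards. With the illustrative values $fsize = 10 and $size = 4, the counter goes 6, 2, -2. One possible way around it is to clamp each request to what remains; the helper name next_length is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: how many bytes to request next, never more than remain.
sub next_length {
    my ($remaining, $size) = @_;
    return $remaining < $size ? $remaining : $size;
}

my ($fsize, $size) = (10, 4);    # illustrative values only
my @requests;
while ($fsize > 0) {
    my $len = next_length($fsize, $size);
    push @requests, $len;
    $fsize -= $len;              # reaches exactly 0, never goes negative
}
print "@requests\n";             # prints "4 4 2"
```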
Update: as pointed out by johngg, I mixed things up, thinking of a read/write pair when the only such pair in Perl is sysread/syswrite. print is just fine as a counterpart of read.
|
|
Since you're using read, the write function would be more appropriate as a counterpart than print.
I don't think that's right. The write function is for writing formatted records. From the documentation:
Writes a formatted record (possibly multi-line) to the specified FILEHANDLE, using the format associated with that file.
There is no counterpart to read per se; just use print. Perhaps you were confusing write with syswrite, which is the counterpart of sysread. Cheers, JohnGG
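A short sketch of the two pairings side by side, assuming filehandles opened on real files. The helper names copy_buffered and copy_unbuffered are hypothetical:

```perl
use strict;
use warnings;

# Buffered pairing: read goes with print.
sub copy_buffered {
    my ($in, $out, $len) = @_;
    my $got = read($in, my $buf, $len);
    die "read failed: $!" unless defined $got;
    if ($got) {
        print {$out} $buf or die "print failed: $!";
    }
    return $got;
}

# Unbuffered pairing: sysread goes with syswrite. syswrite may write only
# part of the buffer, so loop until every byte has been written.
sub copy_unbuffered {
    my ($in, $out, $len) = @_;
    my $got = sysread($in, my $buf, $len);
    die "sysread failed: $!" unless defined $got;
    my $off = 0;
    while ($off < $got) {
        my $put = syswrite($out, $buf, $got - $off, $off);
        die "syswrite failed: $!" unless defined $put;
        $off += $put;
    }
    return $got;
}
```

The two styles should not be mixed on the same filehandle, since sysread and syswrite bypass the buffering that read and print rely on.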
Re: File splitting script
by kyle (Abbot) on Jan 25, 2007 at 17:54 UTC
|
I ran it with a file 1142547634 bytes in size, asked for 1024000 byte pieces, and it worked.
|
|
I tried to split a 6GB file into 2 files of 3GB each, and it dies with that error. The script works fine for smaller sizes.
|
|
Note that although this (probably) is a bug in perl, unless you've got a 64-bit OS you can't address more than 4GB anyway, so your approach is limited. Also, if the bug were fixed and you had 4GB of memory, you'd push everything else into swap space, slowing your machine down a lot for no reason at all. You really should read() (edit: and write) in multiple, much smaller chunks.
Re: File splitting script
by ambrus (Abbot) on Jul 24, 2008 at 00:50 UTC
|
Re: File splitting script
by sgt (Deacon) on Jan 26, 2007 at 15:19 UTC
|
Not an answer to the question (BrowserUk has probably found the problem), but I wonder why you use system("mkdir...") when perl has mkdir...
cheers
--stephan
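For reference, a minimal sketch of the built-in mkdir with error checking. The helper name ensure_dir is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: create $dir with Perl's built-in mkdir unless it
# already exists, and report the reason if creation fails.
sub ensure_dir {
    my ($dir) = @_;
    return 1 if -d $dir;
    mkdir($dir) or die "Cannot create $dir: $!";
    return 1;
}
```

Unlike system("mkdir..."), this starts no external process and puts the failure reason in $!.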
|
|
Should we submit this to perlbugs?