Re: Combining Data Files
by ikegami (Patriarch) on Mar 23, 2005 at 17:11 UTC
|
The files don't just appear. They grow as the FTP upload progresses. How do you expect to know when the files have finished uploading? Testing for their presence is not sufficient, as shown by my proof below. If you don't take this problem into account, you may end up with only a fragment of the first or second file in the combined file. The two solutions provided to date do not take this problem into account.
Proof:
| [reply] [d/l] |
|
|
This is true. There are two problems. The two files may not arrive in the directory on the same day, and the process needs to determine that both files are in the directory and are intact. Only then can I cat one file onto the other. If either file is not present, I need to suspend the process (scheduler).
| [reply] |
Re: Combining Data Files
by RazorbladeBidet (Friar) on Mar 23, 2005 at 16:29 UTC
|
| [reply] |
|
|
if [ -s "$FILE1" ] && [ -s "$FILE2" ]
then
    cat "$FILE1" "$FILE2" >"$NEWFILE" && rm "$FILE1" "$FILE2"
fi
would seem to suffice by the description.
/J\ | [reply] [d/l] |
|
|
test -f file1 && test -f file2 && cat file1 file2
If the files are huge, pipe it through gzip or tar or something.
--
"This alcoholism thing, I think it's just clever propaganda produced by people who want you to buy more bottled water." -- pedestrianwolf
| [reply] [d/l] |
Re: Combining Data Files
by brian_d_foy (Abbot) on Mar 23, 2005 at 17:18 UTC
|
What does "combine" mean in this case? Is it merging, appending, or something else?
How do they arrive? Does the remote side push them at you, or are you pulling them in? Before you do anything else, you need to know when the files have finished transferring.
I'm curious how you want to decide if one file is not there, too. I'm guessing it won't be there until the other one finishes transferring, so for a short time it looks like it might not exist. Why might a file be missing? Failed transfer? Database crashed and didn't create it? Something else?
Given more information, I'm sure somebody can come up with a way that takes those things into account.
Good luck. :)
--
brian d foy <bdfoy@cpan.org>
| [reply] |
Re: Combining Data Files
by bass_warrior (Beadle) on Mar 23, 2005 at 17:46 UTC
|
I had a similar situation, but had to move large files from one server to another once the file was complete.
What I did was use File::stat to check the mtime. If it hadn't changed in the last 10 minutes, I assumed the upload was complete.
| [reply] |
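The mtime check described above can also be sketched in shell rather than with File::stat. This is only an illustration: the function name is_quiet is made up for the example, and it relies on find's -mmin test, which GNU and BSD find support but POSIX does not require.

```shell
# is_quiet FILE MINUTES (hypothetical helper, not from the original post):
# succeeds only if FILE exists and was last modified more than MINUTES
# minutes ago. Assumes a find that supports -mmin (GNU or BSD find).
is_quiet() {
    [ -f "$1" ] && [ -n "$(find "$1" -mmin "+$2" 2>/dev/null)" ]
}

# Possible use, with the 10 minute threshold from the post:
#   if is_quiet "$FILE1" 10 && is_quiet "$FILE2" 10; then
#       cat "$FILE1" "$FILE2" > "$NEWFILE"
#   fi
```

A quiet period only suggests the upload has stopped; it cannot tell a finished transfer from one that died halfway.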
|
|
I can't say enough about using end of file markers to ensure that the file is intact. The 10 minute wait works great when you're on a stable network, and the two machines are nearby, but when you start getting to larger files, and systems further apart, you run the risk of the transfer dying in the middle. This is especially bad if you can't back out of whatever you're about to do if you start processing.
I've done a number of different tricks over the years. For those times when you control the format (and you can't change to SOAP, or something that's intended for this sort of thing), I make sure that all files have a last line of END (or something else that will never show up in the file).
If you can't modify the format, and can't easily test if the file is complete, I'll send two files: if the important file (the one you actually care about) is filename, the second file is filename.done, with no real data in it.
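A minimal shell sketch of the END-marker check, assuming the sender writes a final line containing exactly END. The helper names has_end_marker and strip_end_marker are invented for this example.

```shell
# has_end_marker FILE (hypothetical helper): succeeds only if FILE exists
# and its last line is exactly "END", i.e. the sender finished writing it.
has_end_marker() {
    [ -f "$1" ] && [ "$(tail -n 1 "$1")" = "END" ]
}

# strip_end_marker FILE (hypothetical helper): prints FILE without its
# final line, so the marker does not end up in the combined output.
strip_end_marker() {
    sed '$d' "$1"
}

# Possible use:
#   if has_end_marker file1 && has_end_marker file2; then
#       { strip_end_marker file1; strip_end_marker file2; } > combined
#   fi
```

A transfer that dies partway leaves a file whose last line is not END, so the check fails and nothing is processed.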
| [reply] [d/l] [select] |
Use Files As Markers - Re: Combining Data Files
by metadoktor (Hermit) on Mar 24, 2005 at 07:56 UTC
|
Whenever you transfer a 'file1.txt' to the machine you should transfer another 'file1.sent' of size 1 byte right after 'file1.txt' has been transferred. You can then have a shell script which checks for these markers and if they show up then you can concatenate the files (if that's what you want).
if [[ -e file1.sent && -e file2.sent ]]; then
cat file1.txt file2.txt > files.txt
fi
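As a hedged variation on the snippet above, the marker check can be wrapped in a function that reports "not ready yet" through its exit status, so a scheduler can simply retry later. The function name combine_if_marked and the temporary file name files.txt.tmp are assumptions made for this sketch.

```shell
# combine_if_marked DIR (hypothetical helper): concatenates file1.txt and
# file2.txt in DIR into files.txt once both .sent markers exist.
# Returns non-zero if either marker is missing, so the caller can retry.
combine_if_marked() {
    [ -e "$1/file1.sent" ] && [ -e "$1/file2.sent" ] || return 1
    # Write under a temporary name first, then rename, so that nothing
    # downstream ever sees a half-written files.txt.
    cat "$1/file1.txt" "$1/file2.txt" > "$1/files.txt.tmp" &&
        mv "$1/files.txt.tmp" "$1/files.txt"
}
```

Calling combine_if_marked /path/to/incoming from the scheduled job leaves the directory untouched until both markers have arrived.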
metadoktor
"The doktor is in."
| [reply] [d/l] |