Re: efficient ftp of numerous files
by TomDLux (Vicar) on Mar 09, 2004 at 03:55 UTC
You want to fetch recently changed files, to update a local repository.
I believe rsync is suitable for the task.
--
TTTATCGGTCGTTATATAGATGTTTGCA
[reply]
Agreed. While you could write something in Perl to solve this problem, there's no point in re-inventing the wheel. If you don't have rsync on your system, it's available from rsync.samba.org.
On the other hand, it sounds like you may not be creating a mirror of the entire repository of pictures, just making a local mirror of a picture set whose members change nightly. If the latter is the case, there are a few fun ways to approach this:
- If you have shell access to your remote site, perhaps you could write a script that creates a directory of symbolic links to the files you want to download. You could then run the script nightly and use rsync to create a local mirror of the remote directory (a rough sketch of such a script follows this list).
- If you don't have shell access to the remote site, then you could build the hash as you describe above.
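Untested sketch of the server-side symlink script; all the paths and the wanted-list file are invented for the example, so adjust to taste:

#!/usr/bin/perl
# Build a directory of symlinks to the pictures that should be mirrored
# tonight. The client then rsyncs just this staging directory.
use strict;
use warnings;

my $repo    = '/data/pictures';        # full picture repository
my $staging = '/data/nightly-subset';  # directory the client will rsync
my $wanted  = '/data/wanted.txt';      # one picture filename per line

mkdir $staging unless -d $staging;
unlink glob("$staging/*");             # start from a clean slate each night

open my $fh, '<', $wanted or die "Can't read $wanted: $!";
while (my $file = <$fh>) {
    chomp $file;
    next unless -e "$repo/$file";      # skip names that aren't on disk
    symlink "$repo/$file", "$staging/$file"
        or warn "Couldn't link $file: $!";
}
close $fh;

On the client side you would then run rsync with -a and --copy-links against the staging directory, so the files behind the links get transferred rather than the links themselves.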
Good luck with it. :)
[reply]
Re: efficient ftp of numerous files
by pzbagel (Chaplain) on Mar 09, 2004 at 02:49 UTC
How about populating your hash with the contents of your local directories, keyed off the 3-digit directory? Then, as you get the listing from FTP, you can iterate over it and check what is in your hash (aka what you have locally). Perhaps File::Find might be of service.
347
 |-> 12345347.jpg
 |-> 80780347.jpg
879
 |-> 34392879.jpg
 |-> 37329879.jpg
and so on
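For what it's worth, a rough File::Find sketch of that hash (the /pics root is just an assumed example):

#!/usr/bin/perl
# Walk the local picture tree and key every filename by its 3-digit
# parent directory.
use strict;
use warnings;
use File::Find;
use File::Basename;

my $local_root = '/pics';
my %have;                  # $have{'347'}{'12345347.jpg'} = 1

find(sub {
    return unless -f $_;
    my $dir = basename($File::Find::dir);   # e.g. '347'
    $have{$dir}{$_} = 1;
}, $local_root);

# Later, while walking the FTP listing, skip anything already present:
# next if $have{$remote_dir}{$remote_file};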
[reply]
Re: efficient ftp of numerous files
by Hena (Friar) on Mar 09, 2004 at 09:20 UTC
Also, I would check out Mirror, a mirroring program written in Perl. But rsync, as suggested by TomDLux, should also do the trick. Last but not least, there is wget (a mirroring tool for HTTP and FTP from the FSF).
[reply]
Re: efficient ftp of numerous files
by iguanodon (Priest) on Mar 09, 2004 at 03:13 UTC
The easiest way would be to just add a column to your table to indicate that a file has been downloaded. If you can't modify that table, you could create another one, or even use a flat file to keep track of your downloads.
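The flat file is probably the least work. Something along these lines, with an arbitrary file name:

# Track downloads in a flat file that survives the nightly table rebuild.
use strict;
use warnings;

my $log = 'downloaded.txt';

my %done;
if (open my $in, '<', $log) {
    chomp(my @files = <$in>);
    @done{@files} = ();
    close $in;
}

# while processing the FTP listing:
# next if exists $done{$file};      # already fetched on a previous run

# after a successful download:
sub mark_done {
    my ($file) = @_;
    open my $out, '>>', $log or die "Can't append to $log: $!";
    print {$out} "$file\n";
    close $out;
}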
[reply]
The contents of the table are going to be recreated in whole each day as well, meaning I am going to truncate and repopulate the tables prior to the FTP, so managing this information within the table probably wouldn't work. However, maybe a secondary table would do the trick.
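The secondary table would only need to hold the filenames and never get truncated. Maybe something like this (DBI, with made-up driver, credentials, table and column names):

use strict;
use warnings;
use DBI;

# Small table that is never truncated; it just records which pictures
# have already been fetched.
my $dbh = DBI->connect('dbi:mysql:pictures', 'user', 'password',
                       { RaiseError => 1, AutoCommit => 1 });

my $check  = $dbh->prepare('SELECT 1 FROM fetched_pics WHERE filename = ?');
my $record = $dbh->prepare('INSERT INTO fetched_pics (filename) VALUES (?)');

sub already_fetched {
    my ($file) = @_;
    $check->execute($file);
    return defined $check->fetchrow_array;
}

sub mark_fetched {
    my ($file) = @_;
    $record->execute($file);
}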
[reply]
Re: efficient ftp of numerous files
by Anonymous Monk on Mar 09, 2004 at 17:44 UTC
I am limited to FTP only. The remote system has some 1.4 million pictures in it and I am only going to be grabbing a subset of those, maybe a few thousand. I will check out some of the options mentioned above. I agree, the last thing I want to do is create more work for myself by recoding something that already exists.
[reply]
Re: efficient ftp of numerous files
by iburrell (Chaplain) on Mar 09, 2004 at 19:35 UTC
Do you know which files have changed from the database tables? For example, does something in the database indicate that an item is new or modified? Then it is easy to populate a list of files to check. Are the files ever modified, or are they only added (and deleted)? If the latter, you don't need to check modified times, just whether the files exist or not.
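If existence is all you need to check, plain Net::FTP will do. A rough sketch, with placeholder host, login, and paths:

#!/usr/bin/perl
# Fetch anything from the wanted list that isn't already on disk.
use strict;
use warnings;
use Net::FTP;

my $ftp = Net::FTP->new('ftp.example.com', Passive => 1)
    or die "Can't connect: $@";
$ftp->login('user', 'password') or die 'Login failed: ' . $ftp->message;
$ftp->binary;

# @wanted would come from the freshly loaded database table
my @wanted = ('347/12345347.jpg', '879/34392879.jpg');

for my $remote (@wanted) {
    my $local = "pics/$remote";          # assumes the 3-digit dirs exist
    next if -e $local;                   # already have it, nothing to do
    $ftp->get($remote, $local)
        or warn "get $remote failed: " . $ftp->message;
}
$ftp->quit;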
[reply]
The contents of the table may change, but the pictures will not. So on day one there may be 1000 database records, the next day there may be only 940, and the following day there could be 1120, with the difference all being new records.
So each day I have to delete and repopulate the table and then check to see if I have the pictures or not.
[reply]