markuswinter has asked for the wisdom of the Perl Monks concerning the following question:
Hi all,
sorry, total newbie question:
We have a script which gets an array of URLs (@taxonomy_file_url) and downloads the new files:
foreach (@taxonomy_file_url) { &getFile($_, "taxonomy", $local_taxonomy_directory); }
However there is a problem: the script does not delete the old file first, so the newly downloaded file becomes taxonomyFile1, then taxonomyFile2, then taxonomyFile3 etc etc
As the main app ONLY looks for taxonomyFile it will never see the new file.
So I want to delete the old file (if it exists):
foreach (@taxonomy_file_url) { if (-e $_) { # does the file exist unlink($_) #delete it } &getFile($_, "taxonomy", $local_taxonomy_directory); }
Is this the correct way of doing it?
Thanks for any advice you have!
Markus
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: Correct code for deleting old file?
by flexvault (Monsignor) on Jun 15, 2012 at 10:00 UTC | |
Welcome markuswinter, I have modified your code to my style, but the one difference is that I have replaced the '$_' with '$file'. Using '$_' in a one-liner is fine, but when you use it multiple times, you may be changing the value of '$_'. I did not test your code, so it may work fine. Since you are new to Perl, I'm just suggesting an alternate way to do this. I have changed this style several times in the past 15 years and all because when you have a multi-thousand line script and the compiler tells you that your 'Missing a right curly...' you start looking for ways to improve your style. YMMV.
You don't show 'getFile', but it must be testing for the existence of the file, so you might just remove that code since an 'open' with '>' will over-write the original file. Good Luck "Well done is better than well said." - Benjamin Franklin | [reply] [d/l] |
by markuswinter (Novice) on Jun 15, 2012 at 10:49 UTC | |
| [reply] [d/l] |
by afoken (Chancellor) on Jun 15, 2012 at 20:18 UTC | |
Your code contains lot of bad practices: Also, you seem to go through great pains to handle URLs and to issue HTTP and FTP requests using wget. Perl has LWP, so you do not have to mess with the shell and external programs at all. Look first at LWP::Simple, that should nearly be sufficient to replace wget. If you need more control, look at LWP::UserAgent. As you seem to write a spider, also have a look at LWP::RobotUA. Perl also can decompress gzip files and unpack tar files, if you like, even both in one package, Archive::Tar, and of course without having to mess with the shell. For URL handling, look at URI and its helpers. And for my personal taste, the getFile() function is about 200 lines longer than it should be. Too much code for one function, too deeply nested, too many variables. As a rule of thumb, a function should not exceed one screen, because otherwise, you will get lost in the code. At my university, a screen was defined as a VT420, i.e. 25 lines of no more than 80 characters, but in the real world, my editor shows about 40 lines of about 110 characters. Consider splitting the function into smaller parts. Alexander
-- Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-) | [reply] [d/l] [select] |
by Anonymous Monk on Jun 19, 2012 at 08:18 UTC | |
by flexvault (Monsignor) on Jun 15, 2012 at 13:52 UTC | |
I took a quick look at the code, and I think your original solution to delete the file may be the better choice. There is still a lot of code that isn't defined and probably compiled in by a 'use' statement. Can you test the script in a non-destructive way? Usually on PM we try to give hints rather than working code. In your case, it sounds like production code that you want to modify. I strongly suggest you set-up a 'sand box' (testing environment) before you modify production code. If I'm misunderstanding your problem, please correct me. Thank you "Well done is better than well said." - Benjamin Franklin | [reply] |
by GrandFather (Saint) on Jun 18, 2012 at 00:46 UTC | |
Anything that relies on Mk I eyeball to ensure braces match is doomed to fail, or at least consume to a great deal of time. It is much better to use a syntax highlighting editor with smart indentation. However any such assertion will raise someone's hackles. At the end of the day it generally comes down to what works for you. Note though that often "what works for you" is really "what works of the team" or "what works for the community". In the case of Perl "what works for the community" is K&R indentation with a four space indent. Especially for Perl (due to the excellent Perl::Tidy) a good answer can be to use a pretty printer to normalise code for "public" consumption, and even to renormalise code for personal use.
True laziness is hard work
| [reply] |
|
Re: Correct code for deleting old file?
by kcott (Archbishop) on Jun 15, 2012 at 10:07 UTC | |
That looks like you're trying to unlink $some_url, rather than unlink $some_file. I suspect unlink should probably be in getFile(). Please show the code for that subroutine. -- Ken | [reply] [d/l] [select] |
by markuswinter (Novice) on Jun 15, 2012 at 10:53 UTC | |
<Homer Simpson moment>Doh!</Homer Simpson moment> You are right of course. So much for my perl skills :-P The code of the subroutine is posted above in answer to another reply. Thanks! | [reply] |
|
Re: Correct code for deleting old file?
by rovf (Priest) on Jun 15, 2012 at 10:57 UTC | |
If you don't check the result of unlink (i.e. whether it works or not), you can equally well omit the test for existence. Hence, I would write it either as or as
-- Ronald Fischer <ynnor@mm.st> | [reply] [d/l] [select] |
by thirdm (Sexton) on Jun 15, 2012 at 20:04 UTC | |
| [reply] |
by cavac (Prior) on Jun 17, 2012 at 21:35 UTC | |
I'm not sure about this. My information may be a bit out of date. Calling unlink on some non-file types (directory or similar) still has "unspecified" consequences on some operating systems, hasn't it? And -e only checks for existence, whereas -f would check specifically for a file, making the call to unlink safer, right?
"You have reached the Monastery. All our helpdesk monks are busy at the moment. Please press "1" to instantly donate 10 currency units for a good cause or press "2" to hang up. Or you can dial "12" to get connected directly to second level support."
| [reply] [d/l] [select] |
by rovf (Priest) on Jun 18, 2012 at 08:16 UTC | |
Basically i agree with you. I had assumed that you from your knowledge of your application, you know for sure that the names which you had collected in your file list, certainly contain plain files (and not directories, symbolic links or what else). In this case (only) it makes sense to omit the file testing. If this is not the case, I personally would be interested *which* non-file entries made it into the file list, because this (possibly) signals an error. In this case, I would do a -f test for each file, but I would also issue an error message for those entries which are not plain files.
-- Ronald Fischer <ynnor@mm.st> | [reply] [d/l] [select] |
by cavac (Prior) on Jun 18, 2012 at 17:29 UTC | |
|
Re: Correct code for deleting old file?
by morgon (Priest) on Jun 15, 2012 at 16:07 UTC | |
I would not delete the old file but rename it (or move it somewhere else) so that you can fall back on it. You can rename or more a file with the move-function from File::Copy. | [reply] |
|
Re: Correct code for deleting old file?
by CountZero (Bishop) on Jun 16, 2012 at 06:36 UTC | |
As someone already said: much better to check for the existence of the file inside the getFile subroutine and handle that situation then and there. CountZero A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James My blog: Imperial Deltronics | [reply] [d/l] [select] |