IO::AtomicFile may be what you want, but it has nothing to do with file locking. If you look at the source, you'll see that a temporary file is written and then renamed on close. The atomic nature of rename in the OS is what accomplishes that. A weakness of this module is that the temporary file name is not safe if two instances open the same file: both temporaries will be given the same name, with bad results.
Voluntary (advisory) or mandatory flock is another animal. It can handle the retry strategy you mention, or else block, waiting for the lock to become available. I'm not sure whether Apache honors voluntary locks when reading HTML files; perhaps another monk knows.
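To make the two strategies concrete, here's a minimal sketch of advisory locking with flock. It locks a scratch file created on the spot (a stand-in for your real HTML file); the 5 tries and the 0.1s pause are arbitrary choices for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Create a scratch file to lock (stand-in for the real file).
my ($fh, $file) = tempfile();

# Strategy 1: non-blocking attempt with a bounded retry loop.
my $locked = 0;
for (1 .. 5) {
    if (flock $fh, LOCK_SH | LOCK_NB) { $locked = 1; last }
    select undef, undef, undef, 0.1;    # pause 100ms between tries
}
print $locked ? "got lock\n" : "gave up\n";

# Strategy 2: simply block until the lock becomes available.
# flock $fh, LOCK_SH or die "flock: $!";
```

Note that this only protects against other processes that also call flock; a reader that never asks for the lock (like Apache serving the file) won't be stopped by it.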
Some file systems and OSes are weak at handling concurrency, but most Unices are fine.
If the process that writes a given HTML file takes some noticeable amount of time between start and finish, and you want to make sure that web visitors will only see the complete form of the file, something like the following ought to be all you need:
- open a new file for output with a name that won't be visible to web visitors, or create it with the intended public name but in a different path (not publicly exposed) on the same disk;
- once the output to the file is complete and the file is closed (and you're sure there weren't any errors), rename the file to the intended path/name in the public web directory -- it now becomes instantly visible to the next person who (re)loads the url for the file, and from the visitor's perspective, it is never partial or incomplete.
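The two steps above can be sketched as follows. The file name and content are made up for the example; the key point is that the temp file is created in the same directory as its destination, so the final rename stays on one filesystem and is atomic.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

my $public = "./page.html";    # hypothetical published name

# 1. Write to a temp file in the SAME directory (same filesystem),
#    so the final rename() can be atomic.
my ($tmp_fh, $tmp_name) = tempfile(DIR => dirname($public));
print {$tmp_fh} "<html><body>complete page</body></html>\n";
close $tmp_fh or die "close: $!";

# 2. Atomically replace the public file; a reader sees either the old
#    complete file or the new complete file, never a partial one.
rename $tmp_name, $public or die "rename: $!";
```

This is essentially what IO::AtomicFile does for you, minus the hand-rolled temp name.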
If the web server is a unix box, keep in mind that rename() only works within a single filesystem; "moving" a file from one disk to another (e.g. from /tmp to /public_html) really means copying it and deleting the original, which is neither instantaneous nor atomic -- maybe not as slow as the process that writes the file in the first place, but still not as fast as renaming a file so that it stays on the same volume, which just involves moving an inode entry from one directory to another. So put the temp file on the same volume as its final destination. (This is likely to be true on any OS, even those that don't have things called "inodes".)
As for reading config files (it took me a while to get the connection between the first and second paragraph)... If you're worried that a process reading a config file might get an incomplete or "transient" version of the data -- and if this is a persistent, pernicious concern -- you might consider making up a little table (database or flat file) that stores file names with data checksums. Read the file once, compute its checksum, and if that doesn't match the checksum in the table, treat it as an error condition. (You could try reading it again after a delay, to see if the problem persists, but if it fails twice, you might as well quit.)
This would require a little more infrastructure for managing your config files, to make sure that the checksum table is updated every time a file is intentionally added, deleted or altered.
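A hedged sketch of the checksum idea, using Digest::MD5 -- here the "table" is just an in-memory hash standing in for the flat file or database, and the config file is a scratch file created for the demo:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use File::Temp qw(tempfile);

# Make a scratch config file and record its checksum, as if the
# table had been updated when the file was intentionally written.
my ($fh, $cfg) = tempfile();
print {$fh} "color=blue\n";
close $fh;
my %table = ($cfg => md5_hex("color=blue\n"));

# Read a config file and verify it against the stored checksum;
# return undef if the data looks transient or incomplete.
sub read_config {
    my ($path, $checksums) = @_;
    open my $in, '<', $path or die "open $path: $!";
    my $data = do { local $/; <$in> };
    close $in;
    return md5_hex($data) eq $checksums->{$path} ? $data : undef;
}

my $data = read_config($cfg, \%table);
print defined $data ? "config ok\n" : "checksum mismatch\n";
```

A caller that gets undef back could sleep briefly and retry once before giving up, per the suggestion above.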
Thanks!!!
I think that confirms that my choice for IO::AtomicFile is a good one ;-)
The second part, which I can see was a little confusing, is as follows: (I am on win32, but want to be portable)
I have a server process and it often needs to read config files for instructions. I expect the file to always be readable, but one time it wasn't (I think I was in the debugger and maybe viewing the file also, no locking that I am aware of...). Instead of just throwing out the job, I figured I could retry opening the file a few times, maybe sleeping .1 sec in between? Is this just overkill?
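For what it's worth, the retry-with-a-short-sleep idea is cheap to write and portable. A minimal sketch (the 5 tries and the 0.1 sec delay are arbitrary, and the config file here is a scratch file created for the demo):

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep);    # fractional-second sleep, core since 5.8
use File::Temp qw(tempfile);

# Stand-in for the real config file.
my ($tmp_fh, $config) = tempfile();
print {$tmp_fh} "setting=1\n";
close $tmp_fh;

# Try to open the file a few times before giving up on the job.
my $fh;
for my $try (1 .. 5) {
    last if open $fh, '<', $config;
    sleep 0.1;                # brief pause before the next attempt
}
die "could not open $config after 5 tries: $!" unless $fh;

my @lines = <$fh>;
close $fh;
print "read ", scalar @lines, " line(s)\n";
```

It's not overkill if a transient failure would otherwise throw out a whole job; just make sure the total retry time is bounded so the server can't hang on a file that's genuinely gone.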
Curious, if you are manually editing a crontab file in vi and cron runs, what happens? (I don't have unix or cron...just use windows and we don't expect users to edit config files behind our backs ;-)
thanks for any defensive programming ideas !