A couple of things here make me nervous. The first is the ability to tell when a file is available. Now, perhaps your files will be incredibly small, so the act of copying them across the network and updating the directory structure will be basically atomic. I doubt it. That means a simple "-s" test (file exists and has non-zero size) is not sufficient to tell when a file has finished uploading: only 1MB of 700MB may have arrived so far, and then things can really go wonky. You'll probably want to adjust your protocol so that the uploader performs some atomic operation to tell the daemon that the file is ready. There are two simple choices, and one more complicated choice, that I can think of here.

The first option is to have the application that puts up the job file also create a file with the same name plus ".done" on the end. This file would have no contents, but because it is created only after the main job file has been completely written, its existence tells us the main file is done. Something like { open my $fh, '>', "$jobfile.done" or die $!; } should do it (create the file, filehandle goes out of scope, it's closed). However, this still leaves a bit of a hole - what if the server deletes the file between the open and close? There's not much time there, but I don't know what would happen. It can probably be handled if you think about it.
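To make the ".done" idea concrete, here is a minimal sketch of both sides. The names mark_done and ready_jobs are made up for illustration, and I'm assuming the markers live in the same directory as the job files. The daemon ignores any job file that has no marker yet.

```perl
use strict;
use warnings;

# Uploader side (hypothetical helper): call this only after the job file
# has been completely written and closed.
sub mark_done {
    my ($jobfile) = @_;
    open my $fh, '>', "$jobfile.done"
        or die "Cannot create $jobfile.done: $!";
    close $fh
        or die "Cannot close $jobfile.done: $!";
    return "$jobfile.done";
}

# Daemon side (hypothetical helper): return the job files in $dir that
# have a matching ".done" marker. Files without a marker are skipped.
sub ready_jobs {
    my ($dir) = @_;
    opendir my $dh, $dir or die "Cannot read $dir: $!";
    my @markers = grep { /\.done\z/ } readdir $dh;
    closedir $dh;

    my @ready;
    for my $marker (sort @markers) {
        (my $job = $marker) =~ s/\.done\z//;
        push @ready, "$dir/$job" if -f "$dir/$job";
    }
    return @ready;
}
```

The daemon would process each entry from ready_jobs($dir) and remove the marker along with the job file once it is finished with both.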

The second option is to upload the file to a different directory than the one the server reads jobs from. Once the copy is finished, a simple rename into the correct directory should be atomic, provided both directories are on the same filesystem (or share). I can't think of a race condition here.
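A rough sketch of the upload-then-rename approach follows. The function name publish_job and the two directory arguments are placeholders of mine, not anything from your code, and it assumes the staging directory and the jobs directory sit on the same filesystem so that rename does not have to copy anything.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Basename qw(basename);

# Hypothetical helper: copy $source into $staging_dir, then rename it into
# $jobs_dir. The daemon watches only $jobs_dir, so it never sees a partial file.
sub publish_job {
    my ($source, $staging_dir, $jobs_dir) = @_;
    my $name   = basename($source);
    my $staged = "$staging_dir/$name";
    my $final  = "$jobs_dir/$name";

    # The slow part: this may take a long time for a 700MB file.
    copy($source, $staged)
        or die "Copy to $staged failed: $!";

    # The fast part: one directory operation on the same filesystem.
    rename($staged, $final)
        or die "Rename to $final failed: $!";

    return $final;
}
```

If the rename fails because the two directories turn out to be on different filesystems, that is a sign the layout needs to change rather than something to work around with another copy.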

The third option is to move the job status into a relational database. I'm not sure if your files can go there or not, but, if not, the metadata (i.e., "job 123 ready for pickup") can be inserted by the uploader, and the server stops monitoring the filesystem, instead monitoring the database (with a less-efficient regular poll). The downsides here are many, including setting up a db if you don't already have one. But the upsides include that the db should have transactions (thus the insert is always atomic if done properly), and locking. With the locking, you can lock tables/rows from the server side such that a second server could also poll the db, should you ever find the need for jobs to scale such that they are being run on multiple machines.
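For the database route, something along these lines might do. Everything here is an assumption on my part: the jobs table, its columns, and the function names are invented, the table definition is written with SQLite in mind, and "LIMIT 1" is not portable to every database. Instead of explicit table or row locks, this sketch claims a job with a conditional UPDATE, which is atomic on its own, so two servers polling at once cannot both get the same job. $dbh is a connected DBI handle with RaiseError turned on.

```perl
use strict;
use warnings;

# Hypothetical schema: one row per job, status is 'ready' or 'running'.
sub create_jobs_table {
    my ($dbh) = @_;
    $dbh->do(q{
        CREATE TABLE jobs (
            id     INTEGER PRIMARY KEY,
            path   TEXT NOT NULL,
            status TEXT NOT NULL,
            worker TEXT
        )
    });
    return;
}

# Uploader side: run this once the file itself is completely uploaded.
sub announce_job {
    my ($dbh, $path) = @_;
    $dbh->do(q{INSERT INTO jobs (path, status) VALUES (?, 'ready')},
             undef, $path);
    return;
}

# Server side: try to claim the oldest ready job.
# Returns (id, path) on success, or an empty list if there is nothing to
# do or another server claimed the job first.
sub claim_job {
    my ($dbh, $worker) = @_;
    my ($id, $path) = $dbh->selectrow_array(
        q{SELECT id, path FROM jobs WHERE status = 'ready' ORDER BY id LIMIT 1});
    return unless defined $id;

    # Only succeeds if the row is still 'ready'; do() returns the row count.
    my $rows = $dbh->do(
        q{UPDATE jobs SET status = 'running', worker = ?
          WHERE id = ? AND status = 'ready'},
        undef, $worker, $id);
    return $rows == 1 ? ($id, $path) : ();
}
```

The server would call claim_job from its polling loop and sleep for a bit whenever it gets an empty list back.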

The second thing I'm nervous about is simply whether locking works reliably over Samba. This may work. It just makes me nervous. That's a lot of stuff that needs to go right - I wouldn't actually want to lock on NFS, either, so maybe I'm just paranoid. Locking in a db seems safer to me :-) (and option 2 above - renaming the file after it's finished - avoids this as well.)

I'm not sure why you copy the jobstack to a temp, manipulate the temp, and then copy it back. If you can pull the item you want from the original shared @jobstack in a small lock, that'll be way better.

    my $job = do {
        lock(@jobstack);
        # this will find the next one to do, remove it from the list, and return it
        extract_job(\@jobstack);
    };
This saves a bunch of copying, and keeps the lock to a minimum. Locks are heavy-handed things. You want to avoid them whenever possible, and where not possible, you want to reduce their scope to a bare, bare minimum. Otherwise your other thread will block when file changes come in. Of course, in your sample code, you're not doing anything, so it's not yet a big deal, but I assume there is or will be more code in your main thread in your real code that does significantly more work, otherwise you wouldn't bother with all this :-)
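Since extract_job is just a name I made up, here is one way it could look, assuming the policy is simply "oldest job first". The helpers add_job and next_job are also hypothetical. Note that splice is not supported on shared arrays, so the sketch sticks to shift and push.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my @jobstack :shared;

# Hypothetical policy: the next job to do is the oldest one in the list.
# The caller must already hold the lock on the array.
sub extract_job {
    my ($stack) = @_;
    return shift @$stack;
}

# Worker side: hold the lock only long enough to pull one job off.
sub next_job {
    lock(@jobstack);
    return extract_job(\@jobstack);
}

# Watcher side: hold the lock only long enough to add one job.
sub add_job {
    my ($job) = @_;
    lock(@jobstack);
    push @jobstack, $job;
    return scalar @jobstack;
}
```

In both helpers the lock is released as soon as the sub returns, so the actual work on the job happens with the list unlocked.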


In reply to Re: strange behaviour, would appreciate any comment / alternative method by Tanktalus
in thread strange behaviour, would appreciate any comment / alternative method by djamu
