OK, so you've got a system which is doing message passing between servers based on IP, where both latency and reliability are important.

Implementing reliability reliably (hah!) is pretty hard. How well are you covered against power failure? (You're fsync'ing file descriptors, right? Are you flushing stdio buffers beforehand?) You're triggering off the existence of files: do you have race conditions where a file is created (empty) and might be processed before it's filled in? You're processing collections of related files. Are they created in a specified order?
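One common way to close both the power-failure gap and the empty-file race is to write to a temporary name, flush and fsync, and only then rename into place. A minimal sketch, assuming the writer and the watcher share one POSIX filesystem; the helper name write_atomically and the dot-prefixed temp name are made up for illustration:

```perl
use strict;
use warnings;
use IO::Handle;                          # provides flush() and sync()
use File::Basename qw(dirname basename);

# Write $data to $path so that a watcher never sees a partial file.
# Returns true on success, dies on any I/O error.
sub write_atomically {
    my ($path, $data) = @_;

    # Hypothetical naming scheme: a hidden temp file in the same directory,
    # so the final rename() never crosses a filesystem boundary.
    my $tmp = dirname($path) . '/.' . basename($path) . ".tmp.$$";

    open(my $fh, '>', $tmp) or die "open $tmp: $!";
    binmode($fh);
    print {$fh} $data or die "write $tmp: $!";
    $fh->flush        or die "flush $tmp: $!";   # stdio buffer -> kernel
    $fh->sync         or die "fsync $tmp: $!";   # kernel -> disk
    close($fh)        or die "close $tmp: $!";

    # rename() within one filesystem is atomic on POSIX systems:
    # a reader sees either no file or the complete file.
    rename($tmp, $path) or die "rename $tmp -> $path: $!";
    return 1;
}
```

The watcher then only has to ignore the dot-prefixed temp names. For a collection of related files, the same trick gives you an ordering: write all the data files first and rename a marker file into place last, and trigger on the marker. If you need the rename itself to survive a power cut, you would also have to fsync the containing directory, which this sketch doesn't do.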

I'd seriously consider using existing software for this. In particular, either find a message passing module/library or even consider good old store-and-forward SMTP. Any grown-up mail system will have the reliability thing sorted, and one failing delivery won't stall the entire queue. You might need to tune connection timeouts and retry timeouts though.

If you don't like this idea and want/need to write it yourself, you're going to have to go multi-threaded, multi-process or asynchronous. All of these will add a lot of complexity to your setup, and your best chance is to pick the one you understand best.

If it were me and I had to implement this sort of thing then I might go for an async, event-based system and a simple UDP protocol. The events are then as simple as:

  1. New file to process (add new memory record, send UDP msg)
  2. UDP ACK received (clear memory record, log delivery)
  3. Timed event: UDP response not received (inc retry counter in mem record, retry UDP send or discard+log)
  4. Timed event: poll for new files (OR use the operating system's filesystem notification facility to generate events)

It shouldn't get much more complex than that, but you will be doing async programming, so you can get bitten by some operations occasionally taking much longer than you expect (e.g. DNS queries).
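To make the four events concrete, here is a rough sketch using only core modules (IO::Socket::INET and IO::Select). Everything specific in it is an assumption for illustration: the "ID PAYLOAD" / "ACK ID" wire format, the *.msg spool naming, the timeout values, and the sub names. The handlers take the send routine as a parameter so they can be exercised without a network.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;
use File::Basename qw(basename);
use Time::HiRes qw(time);

my $RETRY_AFTER = 2;   # seconds to wait for an ACK (assumed value)
my $MAX_RETRIES = 5;   # resends before giving up (assumed value)
my $POLL_EVERY  = 1;   # seconds between spool scans (assumed value)

my %pending;           # message id => { payload, retries, deadline }

# Event 1: new file to process -> add memory record, send UDP msg
sub on_new_file {
    my ($send, $id, $payload) = @_;
    $pending{$id} = { payload  => $payload, retries => 0,
                      deadline => time() + $RETRY_AFTER };
    $send->("$id $payload");
}

# Event 2: UDP ACK received -> clear memory record, log delivery
sub on_ack {
    my ($id) = @_;
    print "delivered $id\n" if delete $pending{$id};
}

# Event 3: timed event -> retry anything whose ACK is overdue, or discard+log
sub on_timer {
    my ($send, $now) = @_;
    for my $id (sort keys %pending) {
        my $rec = $pending{$id};
        next if $rec->{deadline} > $now;
        if (++$rec->{retries} > $MAX_RETRIES) {
            print "giving up on $id\n";
            delete $pending{$id};
            next;
        }
        $rec->{deadline} = $now + $RETRY_AFTER;
        $send->("$id $rec->{payload}");
    }
}

# The loop itself. $peer is "host:port"; any DNS lookup happens once,
# here at startup, not inside the loop.
sub run {
    my ($peer, $spool) = @_;
    my $sock = IO::Socket::INET->new(Proto => 'udp', PeerAddr => $peer)
        or die "socket: $@";
    my $sel  = IO::Select->new($sock);
    my $send = sub { $sock->send($_[0]) };
    my (%seen, $next_poll);
    $next_poll = 0;

    while (1) {
        for my $fh ($sel->can_read(0.25)) {          # event 2
            my $buf;
            next unless defined $fh->recv($buf, 1500);
            on_ack($1) if $buf =~ /^ACK (\S+)/;
        }
        my $now = time();
        on_timer($send, $now);                       # event 3
        next if $now < $next_poll;
        $next_poll = $now + $POLL_EVERY;             # event 4
        for my $file (glob("$spool/*.msg")) {
            next if $seen{$file}++;
            open(my $in, '<', $file) or next;
            my $data = do { local $/; <$in> };
            on_new_file($send, basename($file), $data);   # event 1
        }
    }
}
```

You would start it with something like run('peer.example.com:9999', '/var/spool/myqueue'), where both the address and the path are placeholders. Note that %pending lives only in memory here, so a crash loses the retry counters, and the ids assume file names without whitespace.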

You'll also have to think about how much state you need to save on shutdown/restore, and whether you need to sync it to disk to protect against uncontrolled shutdown (power loss).

If you want to do multiprocess, CPAN throws up "Parallel::ForkManager" as a possibility. That might help. What makes multiprocess painful is sharing information after process creation time, beyond the child exit code. In your case, you might get away with a simple succeed/fail exit code for each delivery, which would keep things quite simple.
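To show what "a simple succeed/fail exit code for each delivery" looks like, here is a sketch with plain fork and waitpid, so it runs without any CPAN modules. The deliver sub is a hypothetical stand-in for the real delivery code, and deliver_all is a made-up name:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical stand-in for the real delivery code: true on success.
sub deliver {
    my ($msg) = @_;
    return $msg !~ /bad/;
}

# Fork one child per message, at most $max_children at a time.
# Returns a hash of message => 1 (delivered) or 0 (failed).
sub deliver_all {
    my ($max_children, @messages) = @_;
    my (%child, %result);

    # Wait for any one child and record its result from the exit status.
    my $reap = sub {
        my $pid = waitpid(-1, 0);
        die "waitpid: $!" if $pid <= 0;
        my $msg = delete $child{$pid};
        $result{$msg} = ($? == 0) ? 1 : 0 if defined $msg;
    };

    for my $msg (@messages) {
        $reap->() while keys(%child) >= $max_children;
        my $pid = fork();
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: the exit code is the only thing reported back.
            # _exit skips END blocks and buffers inherited from the parent.
            POSIX::_exit(deliver($msg) ? 0 : 1);
        }
        $child{$pid} = $msg;             # parent: remember who is doing what
    }
    $reap->() while %child;
    return %result;
}
```

Parallel::ForkManager wraps this same bookkeeping: you call start and finish around the child's work, and a run_on_finish callback receives the child's exit code in the parent.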


In reply to Re^3: Fastest way to determine whether a specific port is open via a Socket by jbert
in thread Fastest way to determine whether a specific port is open via a Socket by avo
