Win32 - flock() not FIFO?

hennesse has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I recently ported an application from Unix to Win32, and I've got one problem left to solve - my diagnostic logfile is now unsorted.

Five different threads all share a common "textbook" logging subroutine...

use Fcntl qw(:flock SEEK_END);

# ---------------------------------------
# sub logmsg
# ---------------------------------------
sub logmsg {
  my $now = time;
  my $msec = int( ( $now - int($now) ) * 1000 );
  my ( $sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst ) =
+ localtime(time);

  sub lock {
    flock( LOG, LOCK_EX );
    # and, in case someone appended
    # while we were waiting...
    seek( LOG, 0, SEEK_END );
  }

  sub unlock {
    flock( LOG, LOCK_UN );
  }

  open LOG, ">> $logFilename";
  lock();
  printf LOG "%02d:%02d:%02d:%03d %s\r\n", $hour, $min, $sec, $msec, @
+_;
  unlock();

  close(LOG);
}
[download]

This worked fine on freeBSD. Even though there are multiple log messages in a single millisecond, the messages were always in perfect order. It made debugging a snap (the 5 processes do a lot of IPC). Now the log looks like this (check the timestamps):

16:57:23:238    progman: sent supervisor - progman,ctrlman,set,wort_pu
+mp,off

16:57:23:240 supervisor: rcvd pidman     - pidman,webman,status,hlt_du
+ty,0

16:57:23:239    progman: sent supervisor - progman,ctrlman,set,boil_he
+ater,off

16:57:23:240    progman: sent supervisor - progman,ctrlman,set,hlt_hea
+ter,off

16:57:23:241    progman: sent supervisor - progman,ctrlman,set,hlt_ssr
+,off

16:57:23:241 supervisor: sent ctrlman    - pidman,webman,status,hlt_du
+ty,0

16:57:23:234    ctrlman: rcvd supervisor - pidman,webman,status,hlt_du
+ty,0

16:57:23:244 supervisor: sent progman    - pidman,webman,status,hlt_du
+ty,0

16:57:23:245 supervisor: sent webman     - pidman,webman,status,hlt_du
+ty,0

16:57:23:250     webman: rcvd supervisor - pidman,webman,status,hlt_du
+ty,0
16:57:23:241    progman: sent supervisor - progman,ctrlman,set,boil_ss
+r,off

16:57:23:246    progman: sent supervisor - progman,webman,status,hlt_d
+uty,0

16:57:23:246    progman: sent supervisor - progman,webman,status,hlt_s
+etpoint,0


16:57:23:246 supervisor: rcvd progman    - progman,webman,status,step,
+startup

16:57:23:249 supervisor: sent webman     - progman,webman,status,step,
+startup

16:57:23:253     webman: rcvd supervisor - progman,webman,status,step,
+startup

16:57:23:250 supervisor: sent ctrlman    - progman,webman,status,step,
+startup

16:57:23:241    ctrlman: rcvd supervisor - progman,webman,status,step,
+startup

16:57:23:247    progman: sent supervisor - progman,webman,status,hlt_m
+ode,pid

16:57:23:251    progman: sent supervisor - progman,webman,status,hlt_p
+ower,100

16:57:23:251    progman: sent supervisor - progman,webman,status,hlt_v
+ary,1
[download]

Nothing is garbled, just out of order. The timestamp is created at the entry to the logging subroutine - before the open() and flock(). It appears that locking is working, BUT - when the lock is released, the thread (or process) that gets the lock next is not FIFO - it's random (phase of the moon?).

Unsorted log file is one thing, but Aye, here's the rub - each thread must wait a random amount of time to acquire the lock. Since this application runs hardware devices like pumps and heating elements, the concept of "random" is not in our vocabulary.

I suspect this is a problem with ActiveState 5.10.1 no-ssl binary and/or Windows, but to confuse matters, about the same time I moved from Unix to Win32, I also changed from fork-exec to threads, but I did not closely examine the logfile at the time.

Any ideas? - Thanks, Dave

Comment on Win32 - flock() not FIFO? Select or Download Code

Replies are listed 'Best First'.
Re: Win32 - flock() not FIFO? by BrowserUk (Patriarch) on Oct 29, 2011 at 02:13 UTC
Using file system based locking and re-opening the file for every message is ...urm... lets just say, non-optimal. The simplest, most portable way if you are using threads is to use a Thread::Queue. `use threads; use Thread::Queue; ... my $Q = new Thread::Queue; sub logmsg { // added thread id to messages. $Q->enqueue( threads->tid . ": @_"; ); } async { open my $log, '>', 'logfile' or die $!; while( $Q->dequeue ) { my $now = time; my $msec = int( ( $now - int($now) ) * 1000 ); my ( $sec, $min, $hour ) = localtime(time); sprintf $log "%02d:%02d:%02d:%03d %s\r\n", $hour, $min, $sec, +$msec, $_; } close $log; } ... logmsg( ... ); ... $Q->enqueue( undef ); // to terminate logging thread.` [download] There is no need for file locking as all the writing is performed by a single thread. The queue takes care of itself. Loggin causes minimal delays in the critical code path. Queuing a message is very fast as all the formatting and IO is done in another thread. The ordering of messages is preserved. Queues are ordered. Simple, effective and very efficient. With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l]
Re^2: Win32 - flock() not FIFO? by hennesse (Beadle) on Oct 29, 2011 at 19:18 UTC
BrowserUK, Thread Queues is a perfect solution to the logfile problem. When you suggested using Thread Queues for my IPC before, I looked into them, but mistakenly thought they were like pipes – one thread to one other thread. Now that I see the error of my ways, I’m considering using them for my IPC (or is it ITC?), but there’s one problem. Three of my threads just sit around waiting for a datagram to arrive from another thread using select(,,,undef). The other two have to break out of the select every x seconds to do some housekeeping, so they use select(,,,timeout). Unfortunately, dequeue() doesn’t have a timeout feature. I don’t see any way to use dequeue_nb since during periods of inactivity, it’d have to spin in a tight loop. Any ideas? Thanks - Dave	[reply]
Re^3: Win32 - flock() not FIFO? by BrowserUk (Patriarch) on Oct 30, 2011 at 02:02 UTC
Three of my threads just sit around waiting for a datagram to arrive from another thread using select(,,,undef). The other two have to break out of the select every x seconds to do some housekeeping, so they use select(,,,timeout). Unfortunately, dequeue() doesn’t have a timeout feature. I don’t see any way to use dequeue_nb since during periods of inactivity, it’d have to spin in a tight loop.... Any ideas? In general (and idealistically):Don't poll! The really nice thing about threading is that you don't have to try and make one piece of code do more than one thing at a time. If you have two things to do -- eg. 1) wait for a message from somewhere; 2) periodically do some housekeeping; -- use two threads. `async { while( $Q->dequeue() ) { ## process message. } }; async { until( $dieNow ) { sleep $someTime; ## do housekeeping. } };` [download] In this way, you separate the concerns of the two. Each is free to get on with exactly what it needs to do at its earliest opportunity -- ie. its next time-slice --, and continue that single task to completion without ever concerning itself that it might be taking too long or the need to check what else needs to be done. That there in a nutshell is what makes kernel threading superior to event-driven or cooperative time-sharing mechanisms. Anyone who has ever done a job that required concentration, and that they answer a phone in a timely fashion, will get this point completely. It can be done, but each task complicates, interferes with and detracts from the performance of the other. `</end-idealism>` If you were designing your application from scratch, I would definitely suggest that you remove the periodic housekeeping duties from the two 'wait-for-datagram' threads. And move them, if possible combining them, into their own thread who purpose is purely periodic housekeeping. Given that you are currently restructuring your application to use threading, I also strongly suggest that you at least consider this possibility at this point. But, without having a good oversight of what your application is doing, or how it is currently structured and therefore what impact this type of re-structuring is likely to have, I have to realise that it may not be optimal of your time at this stage to do that. So, you currently have something like: `while( my( $nfound, $timeleft) = select( ..., $timeout ) ) { if( $timeleft ) { ## Nothing doing, do some housekeeping } else { ## deal with the event } }` [download] Substitute: `use Time::HiRes qw[ sleep ]; ... until( $dieNow ) { sleep( $timeout / 10 ); if( my $dgram = $Q->dequeue_nb() ) { ## Act on $dgram } else { ## Do housekeeping (if required). } }` [download] If your original timeout is 1 second, waking up every 1/10th of a second will have negligible affect on your CPU usage whilst ensuring that you process datagrams and housekeeping in a timely fashion. Indeed, you'd have to leave this: `async{ 1 while Win32::Sleep( 100 ); }->join;;` [download] Running for a whole day to really notice that it had consumed any CPU at all. With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l] [select]
Re^4: Win32 - flock() not FIFO? by hennesse (Beadle) on Oct 30, 2011 at 15:45 UTC
Re^5: Win32 - flock() not FIFO? by BrowserUk (Patriarch) on Oct 30, 2011 at 17:05 UTC
Some notes below your chosen depth have not been shown here
Re: Win32 - flock() not FIFO? by Anonymous Monk on Oct 29, 2011 at 02:21 UTC
that gets the lock next is not FIFO Why do you think it should be FIFO? I've never heard of a flock-ing order Any ideas? write a log viewer, one that doesn't care about the physical order (or disorder) of the entries (log process-ids and thread-ids not just timestamps) switch to a logging framework that takes care of all this for you	[reply]