footpad has asked for the wisdom of the Perl Monks concerning the following question:

Here's a stripped down CGI script I'm working on. It works fine on a BSD box and will be deployed to a Solaris box (running Perl 5.6). Since I've never worked with Solaris, are there things I should be wary of?
#!/usr/bin/perl -wT

use CGI qw( :all );
use strict;

$|++;

print header(), start_html( "Counter Test" );

# *snip*

my $datapath = "/home/username/data/project";   # outside httpd directory
my $datafile = "$datapath/counter.dat";         # Permissions: 666

my $subID = sprintf( "%06d", getNewCounter( $datafile ) );

# more snipped, including code verification of directory, file, and attributes.

print "Get Counter Results: $subID", end_html;

sub getNewCounter
# ---------------------------------------------------------------
# Reads and increments the counter value.
# Credit: Implementation based on sample found in the perl CGI
#         FAQ at www.perl.com.
# ---------------------------------------------------------------
{
   my( $countfile ) = @_;

   open( FILE, "$countfile" )    or die "Cannot open counter for reading.\n";
   flock( FILE, 2 )              or die "Cannot lock counter for reading.\n";
   my( $countval ) = <FILE>;
   chomp( $countval );
   flock( FILE, 8 )              or die "Cannot unlock counter after reading.\n";
   close( FILE )                 or die "Cannot close counter after reading.\n";

   open( FILE, ">$countfile" )   or die "Cannot open counter for writing.\n";
   flock( FILE, 2 )              or die "Cannot lock counter for writing.\n";
   print FILE ++$countval;
   flock( FILE, 8 )              or die "Cannot unlock counter after writing.\n";
   close( FILE )                 or die "Cannot close counter after writing.\n";

   return( $countval );
}

I'm primarily worried about the getNewCounter() subroutine and, as always, any security issues. In case you didn't catch the earlier comments, data/ is outside of httpd/ and cgi-bin/, as outlined in Organization Redux.

Also, any other concerns or suggestions?

--f

Replies are listed 'Best First'.
Re (tilly) 1: Flock Feedback
by tilly (Archbishop) on Feb 25, 2001 at 03:05 UTC
    First of all, I would suggest moving the information about where those directories are into a configuration file. You will (presumably) have a number of CGI scripts, and you don't want the same information duplicated in every one of them. (Alternatively, you could have a makefile for your scripts that inserts that information into each of them.)

    Secondly, your flock makes all of the usual mistakes. Once you close the file, you have no guarantee that you will be the next one to get the lock when you turn around and reopen it. Sooner or later this will happen: you read; someone else is waiting to read; you close; it gets the lock, reads, and closes; you write; then it writes, and a count is lost. What is worse is if you manage to open for write (wiping out the file) before the other process reads.

    If you must flock the file you are writing, the right way to do it is a non-destructive open (for append or read/write), flock for writing, seek to the beginning, read, seek back, truncate, write, and close (without unlocking). The simpler (but still safe) way is to have a sentinel file (eg "counter.lock") that you open and flock, access the other file, then close. (Don't ever delete the sentinel!)
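    For concreteness, here is a minimal, untested sketch of that first sequence, assuming the same counter file as in the original script (the filehandle and variable names here are illustrative, not tilly's code):

        use Fcntl qw(:flock);

        # Non-destructive read-modify-write: open read/write, lock,
        # read the old value, rewind, truncate, write the new value.
        open( FILE, "+< $countfile" )  or die "Cannot open counter: $!";
        flock( FILE, LOCK_EX )         or die "Cannot lock counter: $!";
        my $countval = <FILE>;
        chomp( $countval );
        seek( FILE, 0, 0 )             or die "Cannot rewind counter: $!";
        truncate( FILE, 0 )            or die "Cannot truncate counter: $!";
        print FILE ++$countval;
        close( FILE )                  or die "Cannot close counter: $!";  # releases the lock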

    For a rather heavy version of the sentinel file approach see Simple Locking.

      First off, the config stuff is already externalized. I chose not to include that code because I didn't want to confuse the actual question. (Doing my homework and providing the simplest case. :} )

      I think I see your point, but am not sure I followed the solution. I think you said, instead of opening/flocking/closing the data file twice, do something like this:

      • Try to flock the sentinel.
      • If it succeeds, open, read, edit, and close the data file.
      • Release the sentinel.
      • If the original flock failed, sleep for a period of time and then retry (up to a preset number of times, failing if the retries run out).

      Is that right? (I'm just making sure I understand the basic logic before I dive into the code.)

      -f

      Update: Okay; I think I get it. Two follow ups:

      • In your example, where is FH pointing? The data file or the sentinel?
      • Kanji offered File::CounterFile. Any thoughts/feedback?
        Close. A traditional flock will be either blocking or non-blocking. If you set only the LOCK_EX flag, it blocks by default, so when you try to get the lock it will wait indefinitely.

        If this is not what you want, then you should set the LOCK_NB flag as well, and poll at a specified interval a specified number of times before giving up. (You can set an alarm, but Perl's signal handling is not very good so I would not recommend doing that.)

        The portable way to get those constants is to

        use Fcntl qw(:flock);

        # time passes

        foreach (1..10) {
            $is_locked = flock(FH, LOCK_EX | LOCK_NB); ### Or gets both flags
            last if $is_locked;
            sleep 1;
        }
        # Test $is_locked here
        See the longer example for the full semantics of how to do the open, etc. (There I am locking a sentinel...)

        UPDATE
        In the above FH is a filehandle that has already been opened. As for the other question, yeah that module will solve the problem. But everything that I said about locking will also hold when you want to start playing games with reading and writing other kinds of data files as well...

        UPDATE 2
        OK, if you want to do this explicitly for a sentinel, here is some untested code:

        use Carp;
        use Fcntl qw(:flock);

        sub open_sentinel {
            my $lock_file = shift;
            my $fh = do {local *FOO};   # Old trick
            open ($fh, "> $lock_file")
                or die "Cannot write $lock_file: $!";
            my $is_locked;
            foreach (1..10) {
                $is_locked = flock($fh, LOCK_EX | LOCK_NB);
                last if $is_locked;
                select (undef, undef, undef, 0.01);
            }
            return $is_locked ? $fh : undef;
        }
        This will wipe out the contents of the file (hence it is only useful on a sentinel file). To get the lock, just try to grab it: the returned filehandle will only be defined if you got the lock, and the lock is freed when that variable is destroyed. (So you can enter a routine, ask for the lock, proceed if you get it, and exit immediately if you don't.)

        The above example tries 10 times, one hundredth of a second apart.
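        As a rough, untested sketch of how that might be used (getNewCounter(), $datapath, and $datafile are assumptions taken from the original post, not part of tilly's code):

            # Grab the sentinel lock; give up politely if we can't.
            my $lock = open_sentinel( "$datapath/counter.lock" )
                or die "Could not get the lock after 10 tries.\n";

            # Safe to touch the data file while $lock is alive.
            my $subID = sprintf( "%06d", getNewCounter( $datafile ) );

            close( $lock );   # or just let $lock fall out of scope

        Either closing the handle or letting it go out of scope releases the flock, which is why the sentinel handle has to stay in scope for as long as the data file is being touched.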

Re: Flock Feedback
by footpad (Abbot) on Feb 26, 2001 at 09:37 UTC
    At the risk of belaboring the points, here's v0.02:
    sub getNewCounter
    # ---------------------------------------------------------------
    # Reads and increments the counter value. Implementation based
    # on wisdom gained from perlmonks.org and the Perl Cookbook.
    # ---------------------------------------------------------------
    {
       my( $datafile ) = @_;
       my( $lockfile );

       # Determine the name of the lock file
       $_ = $datafile;
       s/\.dat/\.lck/;
       $lockfile = $_;

       # The idea here is to wait up to ten seconds to gain access to
       # the lock file. If that fails, well, try again later.
       my( $retries ) = 10;

       open( LOCK, "$lockfile" )  or die "Cannot open lock file for reading. $!";

       while ( $retries )
       {
          if ( flock( LOCK, LOCK_EX | LOCK_NB ) )  # if Fcntl isn't available (it should be), try 2 | 4.
          {
             last;
          }
          else
          {
             $retries--;
             sleep( 1 );
          }
       }

       # How did we get out of that loop? If we failed to flock, then flee
       unless ( $retries )
       {
          die "Cannot lock counter for reading. $!";
       }

       # Okay, we're assuming here...
       open( FILE, "+< $datafile" )  or die "Can't open data file for reading. $!";
       my( $countval ) = <FILE>;
       seek( FILE, 0, 0 )            or die "Can't reset data file for writing. $!";
       truncate( FILE, 0 )           or die "Can't clear data file for writing. $!";
       print FILE ++$countval;
       close( FILE )                 or die "Cannot close counter after writing. $!";
       close( LOCK )                 or die "Cannot close lock file. $!";

       return( $countval );
    }

    Yes, it assumes that the sentinel file already exists.
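    If you would rather not rely on that, a small untested variation (assuming the same $lockfile variable) is to open the sentinel for append, which creates the file when it is missing and leaves it alone when it exists:

        # Append mode creates the lock file if absent and never truncates it.
        # The sentinel's contents don't matter; only the lock on it does.
        open( LOCK, ">> $lockfile" )
           or die "Cannot open lock file. $!";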

    Better? (I ask because I may not be able to get the SecuroTroopers to install File::CounterFile.)

    --f

    P.S. Thanks to jptxs for picking up on the other question. Also, tilly's newest sub looks interesting and will go into v0.03; should anything else?

    Update: Thanks, tilly for your extreme patience in this. Point by Point:

    1. .dat is fine for this script, but good advice for future ones.
    2. Added LOCK_NB, per advice; see above.
    3. I was under the impression that one second was the finest granularity available for sleep()?
    4. You mean, like the 0.01 did?
    5. *blush*
    6. /me hies himself to a Monastery library...

    Update #2: Quickly fixed the || typo and the magic numbers before running off to write v0.03.

          if ( flock( LOCK, 2 || 4 ) )

      Down with magic numbers! This code is unnecessarily platform dependent. Use the Fcntl module to get the appropriate constants (as shown in the documentation for flock). (BTW, I'm assuming that the logical instead of bitwise or is just a typo.)
      use Fcntl ':flock';
      # ...
      if ( flock( LOCK, LOCK_EX | LOCK_NB ) )
      # ...
      P.S. Regarding sleeping... How can I sleep() or alarm() for under a second?
      A series of comments:
      1. I would not assume that the data file had a name ending in .dat. That is the kind of assumption that can change, and then people will be unhappy because their data file got wiped out.
      2. You need to set the LOCK_NB flag as well as LOCK_EX or else your first try will hang until you get the lock.
      3. In a CGI environment, waiting 1 second per retry is probably a bit much...
      4. While you have control of the file, you don't need to be cautious about truncates and seeks. Just open for read, read, close, open for write, and write as usual.
      5. Explicitly closing the data file is a good idea.
      6. However, I like calling local on *FH if I will use FH as a filehandle. Just a good practice... (a short sketch covering points 4 and 6 follows the update below).

      UPDATE
      Others have already said everything that I would have to say, except to note that my fifth point was that you actually did close it explicitly. I offered you code earlier where the lock goes away just by going out of scope; if you use that, you still want to be explicit about closing the data file. I was just saying that it is a subtle but important point.
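      To make points 4 and 6 concrete, here is a rough, untested sketch; the routine name update_counter and the error messages are illustrative, and it is only safe when called while the sentinel lock is held:

        sub update_counter {
            my $datafile = shift;
            local *FILE;   # keep the FILE glob from leaking out of this sub

            # With the sentinel lock held, a plain read...
            open( FILE, "< $datafile" )  or die "Cannot read $datafile: $!";
            my $countval = <FILE> || 0;  # default to 0 on an empty file
            chomp( $countval );
            close( FILE );

            # ...followed by a plain rewrite is all that's needed.
            open( FILE, "> $datafile" )  or die "Cannot write $datafile: $!";
            print FILE ++$countval;
            close( FILE )                or die "Cannot close $datafile: $!";

            return $countval;
        }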

          flock( LOCK, 2 || 4 )

      Aside from the problem of using magic numbers, this really isn't going to do what you want it to.

      Meditate for a moment on the difference between "|" and "||".
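      As a minimal illustration, using the conventional flock values LOCK_EX = 2 and LOCK_NB = 4:

        # "||" is logical or: 2 || 4 evaluates to 2, so this asks for a
        # plain blocking exclusive lock -- not what was intended.
        flock( LOCK, 2 || 4 );   # same as flock( LOCK, 2 )

        # "|" is bitwise or: 2 | 4 evaluates to 6 (LOCK_EX | LOCK_NB),
        # an exclusive, non-blocking lock request.
        flock( LOCK, 2 | 4 );    # same as flock( LOCK, LOCK_EX | LOCK_NB )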

      (Update: the code this note was based on has been fixed above.)

      If you just want a simple incrementing counter, one way to do this is to use the size of the file as the counter, rather than its contents. You don't need locking to do this:

      my $filename = '/path/to/file';

      # Increment the counter (atomically, so no locking required)
      open(FILE, ">> $filename") or warn "Missed a count";
      print FILE '.';            # Output one (arbitrary) byte
      close FILE;

      # What is the counter set to?
      my $countval = -s $filename || 0;

      The disadvantage of this technique is that it uses one byte of disk space for each count.

        Not only does this use one byte of disk space per count, which is ridiculously wasteful, but you still have a race condition if two processes try to write to the file and then check its size at the same time. Even with this approach you still would need locking.
(jptxs)Re: Flock Feedback
by jptxs (Curate) on Feb 25, 2001 at 07:03 UTC
    I read this as "let me know any special issues with writing on BSD and deploying on Sun." I do this all the time, many different versions of both OS's and I have yet to have a problem. I have had plenty of the problems tilly points out, but that's another story =)
    "A man's maturity -- consists in having found again the seriousness one had as a child, at play." --Nietzsche