in reply to Re: Continuously polling multiple directories for file transfer?
in thread Continuously polling multiple directories for file transfer?

We found that for XML files you can check whether the file parses OK: if it's mid-download it will have unclosed tags and parsing will fail. This assumes a single root node per file.
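A minimal sketch of that check, assuming the non-core XML::LibXML module is installed (any strict XML parser would do). The helper name `xml_is_complete` is made up for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;    # assumption: installed from CPAN, not part of core Perl

# Hypothetical helper: true if $path parses as well-formed XML.
# A partially transferred file normally has unclosed tags (or is empty),
# so the parser dies and the eval yields undef.
sub xml_is_complete {
    my ($path) = @_;
    my $doc = eval { XML::LibXML->load_xml( location => $path ) };
    return defined $doc ? 1 : 0;
}
```

The watcher would then skip any file for which `xml_is_complete` returns false and try it again on the next poll. This only tells you the root element has been closed, so it relies on the single-root assumption above.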

Re^3: Continuously polling multiple directories for file transfer?
by foobie (Initiate) on Feb 11, 2009 at 11:20 UTC
    Or you can upload a second file once the first one has completed, e.g. filename.complete. When the second one appears you can assume the first one is complete. Dunno how truly atomic it is, but we had no problems on a live system handling thousands of uploads/day over several years.
      Thank you for the reply, foobie. Are you suggesting something like the following, where I list the directory, copy each file to a new name with .complete appended, process each copy, and then move it to the archive?

      #!/usr/bin/perl -w
      use strict;
      use warnings;
      use diagnostics;
      use File::Copy;

      $path = "/mnt/ldmdata/";
      @site_array = ("karx", "kdlh", "kfsd", "kmpx", "kmvx", "kwbc");
      $poll_time = 20; # sec between polls of all specified directories

      for (;;) {
          foreach $site (@site_array) {
              $file_dir = $path . $site;
              $archive_dir = $file_dir . "/archive";
              mkdir "$archive_dir", 0755 unless -d "$archive_dir";
              opendir(FILE, $file_dir) || die "Cannot open $file_dir";
              @files = readdir(FILE);
              closedir(FILE);
              if(@files) {
                  foreach $file (@files) {
                      copy($file_dir . $file, $file_dir . $file . ".complete");
                      pqinsert $file_dir . $file . ".complete";
                      move($file_dir . $file . ".complete", $archive_dir . $file . ".complete");
                      unlink($file_dir . $file);
                  }
              }
          }
          sleep $poll_time;
      }
        Not quite - I meant that the watcher should just check whether the .complete file is there, so you'd upload file-2009-02-12-1234.dat and then upload file-2009-02-12-1234.dat.complete once the first upload was done. You may even want to put a small amount of metadata in the .complete file, an md5 of the real file, for example. So the upload is something like
        my $id         = '2009-02-12-1234';
        my $filename   = "file-$id.dat";
        my $local_file = "$src_dir/$filename";

        # upload 'real' file (Net::FTP reports errors via ->message, not $!)
        $ftp->put( $local_file, $filename )
            or die "can't upload: " . $ftp->message;

        # it completed ok - upload the completion marker
        $ftp->put( $dummy_file, "$filename.complete" )
            or die "can't upload marker: " . $ftp->message;
        And the watcher:
        foreach my $file (@files) {
            # Only look for the 'upload complete' marker files
            next unless $file =~ /^(.*)\.complete$/;

            # extract the name of the 'real' file
            my $base_name = $1;

            # insert into db
            pqinsert( "$file_dir/$base_name" );

            # move real file into archive dir
            move( "$file_dir/$base_name", "$archive_dir/$base_name" )
                || die "can't move $base_name to $archive_dir - $!";

            # remove the 'complete' marker file (readdir returns bare names,
            # so the directory has to be put back on the front)
            unlink( "$file_dir/$file" ) || die "can't unlink $file - $!";
        }
        HTH!
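If you go the metadata route, here is a rough sketch of putting an md5 of the real file into the .complete marker and checking it on the watcher side. It uses only the core Digest::MD5 module. The helper names (`file_md5`, `write_marker`, `marker_matches`) and the one-line marker format are assumptions made for illustration, not something the thread or any library prescribes.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hex MD5 of a file's contents, read in binary mode.
sub file_md5 {
    my ($path) = @_;
    open my $fh, '<', $path or die "can't open $path: $!";
    binmode $fh;
    my $md5 = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $md5;
}

# Uploader side (hypothetical): create "<file>.complete" holding the checksum.
# This is the file you would upload as the completion marker.
sub write_marker {
    my ($path) = @_;
    my $marker = "$path.complete";
    open my $out, '>', $marker or die "can't write $marker: $!";
    print {$out} file_md5($path), "\n";
    close $out or die "can't close $marker: $!";
    return $marker;
}

# Watcher side (hypothetical): true only if both files exist and the
# checksum stored in the marker matches the real file.
sub marker_matches {
    my ($path) = @_;
    return 0 unless -f $path;
    open my $in, '<', "$path.complete" or return 0;
    my $expected = <$in>;
    close $in;
    return 0 unless defined $expected;
    chomp $expected;
    return $expected eq file_md5($path) ? 1 : 0;
}
```

In the watcher loop you could call `marker_matches("$file_dir/$base_name")` before handing the file to pqinsert, and leave the pair alone until the next poll if it returns false.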