A while back I wrote an FTP engine in Perl for my company's e-commerce group. The main goals were good session logging, a history of file names and download times, separation of the code from logon credentials and filenames, and the ability to run multiple sessions simultaneously.

The modules I had available in addition to core modules were Net::FTP, Parallel::ForkManager (thank god), and IO::Scalar. There was no procedure for adding new modules to the production system, so I threw together a script using Net::FTP for the comm logic, Parallel::ForkManager for the multiple simultaneous sessions, and IO::Scalar to capture STDOUT and STDERR from each fork to create my session logs.
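The fork-and-capture skeleton looks roughly like this. This is a minimal sketch, not the production code: run_ftp_session and write_session_log are stand-ins for the real routines, and the session list is invented.

use strict;
use warnings;
use Parallel::ForkManager;
use IO::Scalar;

my @sessions = ('Session 1', 'Session 2');    # placeholder profiles
my $pm = Parallel::ForkManager->new(scalar @sessions);

for my $session (@sessions) {
    $pm->start and next;    # parent: spawn child, move on

    # Child: redirect this fork's output into an in-memory scalar
    # so it can be written out later as the session log.
    my $log = '';
    tie *STDOUT, 'IO::Scalar', \$log;
    tie *STDERR, 'IO::Scalar', \$log;

    run_ftp_session($session);    # stand-in for the Net::FTP logic

    untie *STDOUT;
    untie *STDERR;
    write_session_log($session, $log);    # stand-in for the log writer

    $pm->finish;
}
$pm->wait_all_children;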

My biggest problem turned out to be the session history. I had a hash of hashes, where each top-level key was the name of a session profile and its value was a hash describing that session's history. I ended up using Data::Dumper to dump the hash to a file, a sample of which looks like this:

$VAR1 = {
  'Session 1' => {
    'last' => 1080139082,
    'files' => [
      'File1.txt 1079948073 @1079966101',
      'File2.txt 1080035083 @1080053101',
      'File3.txt 1080121051 @1080139081'
    ],
    'lastfailmsg' => '+',
    'lastfailtime' => 0
  },
  'Session 2' => {
    'last' => 1080129127,
    'files' => [
      'File1.txt @1079956803',
      'File2.txt @1080043204',
      'File3.txt @1080129100'
    ],
    'lastfailmsg' => '+',
    'lastfailtime' => 0
  }
};

With a little fiddling, I was able to use do(file) to read the hash back into memory. I was unhappy with this setup but managed to make it work. Since each session ran in its own fork, updating the history file became a logistics problem: after a session finished, I copied the hash tree for just that session to a temporary hash, reloaded the history file into memory, merged the changes, and dumped the new hash back to the file. I used some simple file locking to battle race conditions. It works well, despite my apprehension.
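The read-back relies on do() returning the value of the last expression in the file, which for Data::Dumper output is the hashref assigned to $VAR1. A minimal sketch of that idiom, assuming $hist_file was written as in the sample above:

my $ref = do($hist_file)
    or die "Can't read history file $hist_file: ", $@ || $!;
my %sess_hist = %{$ref};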

Thankfully, a procedure is now in place to install CPAN modules in the production environment. So now I want to replace the following code:

sub merge_hist_changes {
    # Merges current session's history changes into %sess_hist.

    ## Step 1 - Create copy of $session's hash tree
    my %temp_hash = %{$sess_hist{$session}};

    ## Step 2 - Filter out files downloaded more than $hist_days days ago
    my $old = time - ( 86400 * $hist_days );    # 86400 seconds in a day
    @{$temp_hash{files}}   = grep { /@(\d+)$/ && $1 > $old } @{$temp_hash{files}};
    @{$temp_hash{uploads}} = grep { /@(\d+)$/ && $1 > $old } @{$temp_hash{uploads}};

    ## Step 3 - Get an flock on $hist_file.l. This is the critical step that prevents
    ## other forks from updating until the current $session's info gets updated.
    open HFL, '>', "$hist_file.l";
    unless ( flock HFL, 2 ) {
        my $failstr = "Can't get lock on $hist_file.l, changes to DB unsaved\n";
        $failstr .= "History tree for $session :\n" . Dumper \%temp_hash;
        &pager_alert($failstr);
        exit;
    }

    ## Step 4 - Get new %sess_hist from disk (like &parse_hist)
    local $/ = undef;
    if ( -s $hist_file ) {
        unless ( %sess_hist = %{ do($hist_file) } ) {
            my $failstr = "Can't parse history file, changes to DB unsaved\n";
            $failstr .= "History tree for $session :\n" . Dumper \%temp_hash;
            &pager_alert($failstr);
            exit;
        }
    }

    ## Step 5 - Change $session's hash pointer to refer to %temp_hash
    $sess_hist{$session} = \%temp_hash;

    ## Step 6 - Dump %sess_hist.
    local $Data::Dumper::Indent = 1;
    open HF, '>', $hist_file;
    print HF Dumper \%sess_hist;
    close HF;
    close HFL;    # Releases flock and lets next child process update $hist_file
}

...with a system that is more reliable. Currently, if a single write error occurs, I could potentially lose a day's worth of history information.
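Whatever format I end up with, one incremental hardening would be to never rewrite the live file in place: dump to a temporary file and rename() it over $hist_file, so a failed write leaves the previous history intact. A sketch of Step 6 with that change (same variables as above):

use Data::Dumper;

local $Data::Dumper::Indent = 1;
my $tmp = "$hist_file.tmp.$$";    # per-process temp name
open my $hf, '>', $tmp           or die "Can't write $tmp: $!";
print {$hf} Dumper(\%sess_hist)  or die "Write to $tmp failed: $!";
close $hf                        or die "Close of $tmp failed: $!";
# rename() is atomic on the same filesystem: either the new history
# replaces the old one completely, or the old file is untouched.
rename $tmp, $hist_file          or die "Can't rename $tmp: $!";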

Right now I'm just looking at possible solutions. Storable and Tie::TwoLevelHash are options, as is restructuring the hash into something that could fit better into database tables and using DBI or something similar. What approach would you guys take?
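If I go the Storable route, the core of the merge would look something like this (a sketch only, same names as the code above). lock_store and lock_retrieve do their own flock, but the whole read-merge-write sequence would still need the external "$hist_file.l" lock so two forks can't interleave between the retrieve and the store:

use Storable qw(lock_store lock_retrieve);

# Sketch: replaces Steps 4-6 of merge_hist_changes above.
my $hist = -s $hist_file ? lock_retrieve($hist_file) : {};
$hist->{$session} = \%temp_hash;    # merge this fork's changes
lock_store($hist, $hist_file);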


In reply to Replacing Data::Dumper / do(file) on multi-fork process by delirium
