Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

Copying Reports from Multiple Sites

by PrimeLord (Pilgrim)
on Sep 11, 2003 at 15:33 UTC ( [id://290720]=perlquestion: print w/replies, xml ) Need Help??

PrimeLord has asked for the wisdom of the Perl Monks concerning the following question:

Monks I stand before you once again seeking your wisdom. I have recently written a script that will be uised to archive several reports from multiple sites to a central location. The script seems to function well enough, but I am having a problem with part of the reporting. The script generates a report that lets me know if a report wasn't copied over and which sites it couldn't copy the reports from. The report displays the first missed report properly, but then keeps tacking on the sites from the previous reports after that. For example.

The following files were not copied over properly: report1 sitea siteb report2 sitea siteb sitea siteb report3 sitea siteb sitea siteb sitea siteb

Now I expected the report to look like this:

The following files were not copied over properly: report1 sitea siteb report2 sitea siteb report3 sitea siteb

I am having problems spotting where the problem is. I am sure I am overlookign something obvious. Also it seems like there should be an easier way to handle copying over the reports so any suggestions on just overall improving the code would be very appreciated. Here is the code.

#!/usr/bin/perl -w use strict; use Time::Local; sub _init { my $config_dir = '/config/dir'; my $config_file = 'report_copy.cfg'; my $site_file = 'sites.cfg'; my $report_dir = '/report/dir'; my ($day, $month, $year) = (localtime)[3,4,5]; my $log_date = sprintf("%04d%02d%02d", $year + 1900, $month + 1 +, $day); return $config_dir, $config_file, $site_file, $report_dir, $day +, $log_date; } sub _read_cfg { my ($config_dir, $config_file, $site_file) = @_; my (%reports, %sites); open IN, "$config_dir/$config_file" or die "Error: Could not open $config_dir/$config_file f +or reading: $!\n"; while (<IN>) { chomp; my ($name, $type, $day, $site_locations) = split; $reports{$name} = "$type:$day:$site_locations"; } close IN or warn "Error: Could not close $config_dir/$config_file +: $!\n"; open IN, "$config_dir/$site_file" or die "Error: Could not open $config_dir/$site_file for + reading: $!\n"; while (<IN>) { chomp; my ($site, $folder) = split /=/; $sites{$site} = $folder; } close IN or warn "Error: Could not close $config_dir/$site_file: +$!\n"; return \(%reports, %sites); } sub _get_reports { my ($report_dir, $day, $log_date, $reports, $sites) = @_; my (@site_array, %missed_reports); for (sort keys %$reports) { my ($type, $report_day, $site_locations) = split /:/, $r +eports->{$_}; if ($site_locations eq 'All') { for my $site (sort keys %$sites) { push @site_array, $site; } } else { @site_array = split /,/, $site_locations; } if ($type eq 'daily') { _copy_report($_, $report_dir, $log_date, $sites, +\@site_array); } elsif ($type eq 'monthly' && $report_day eq $day) { _copy_report($_, $report_dir, $log_date, $sites, +\@site_array); } else { delete $reports->{$_}; } my $file_check; my $problem = ""; for my $site_folder (@site_array) { $file_check = 0; my $folder = $sites->{$site_folder}; for my $filename (</disk2/$folder/reports/*>) { $file_check = 1 if $filename =~ /$_.*\.$lo +g_date/; } if ($file_check == 1) { next; } else { $problem .= "$site_folder "; $missed_reports{$_} = $problem; } } delete $reports->{$_}; } return \%missed_reports; } sub _copy_report { my ($report_name, $report_dir, $log_date, $sites, $site_array) += @_; for my $site (@$site_array) { my $result = `scp host.$site:$report_dir/${report_name}* +.$log_date /disk2/$sites->{$site}/reports`; } } sub _print_report { my ($missed_reports, $log_date) = @_; open OUT, ">> /disk2/reports/report_copy.$log_date" or die "Error: Could not open /disk2/reports/report_copy +.$log_date: $!\n"; print OUT "The following files were not copied over properly:\n +\n"; for (sort keys %$missed_reports) { print OUT "$_\t$missed_reports->{$_}\n"; } close OUT or warn "Error: Could not close /disk2/reports/report_co +py.$log_date: $!\n"; } my ($config_dir, $config_file, $site_file, $report_dir, $day, $log_dat +e) = _init; my ($reports, $sites) = _read_cfg($config_dir, $config_file, $site_fil +e); my $missed_reports = _get_reports($report_dir, $day, $log_date, $repor +ts, $sites); _print_report($missed_reports, $log_date);

Again any suggestions you might have would be greatly appreciated. Sorry the code is so long. Hopefully it is readable. Thanks.

-Prime

janitored by ybiC: Balanced <readmore> tags around long code block

Replies are listed 'Best First'.
Re: Copying Reports from Multiple Sites
by Abigail-II (Bishop) on Sep 11, 2003 at 15:55 UTC
    My guess is that somewhere you are pushing on an array while looping over the reports, without clearing the array when dealing with another report.

    But I ain't going to debug two pages of uncommented code, so I'm not sure.

    Abigail

Re: Copying Reports from Multiple Sites
by knexus (Hermit) on Sep 11, 2003 at 18:57 UTC
    In glancing through the code, if appears that something in or around the line in sub _get_reports()
    $problem .= "$site_folder ";

    would have to be involved. Expanding the search out from that point I see that the array  @site_array is being looped over and that $problem is being reset just before it. Hmmm, that seems Ok.

    Branching out to the next loop I see that @site_array is initialized before the outer loop (which walks thru the reports) and not always within the loop.

    I think that is the problem. The @site_array array is initialized differently depending on  $site_locations. In the 'All' case new sites are pushed onto the array leading to an accumulation of the sites and the problem you are seeing. So, I think you need to clear the array in that case.

    if ($site_locations eq 'All') { @site_array = (); #clear the array for my $site (sort keys %$sites) { push @site_array, $site; } ...

    I have not tested this, it's just my first guess. I hope it helps.

    FWIW: My litte Rant; I used to produce "Write-Only" code until I had to debug some crap I "Wrote" in the past and did not have the faintest clue what I had done or why. I comment code now! :-))

Re: Copying Reports from Multiple Sites
by TomDLux (Vicar) on Sep 12, 2003 at 00:13 UTC

    I came to the same opinion as knexus, that the problem is around

    <coded>$problem .= "$site_folder ";</code>

    But tracing how I got there shows a couple other poor coding areas, improving which might improve debugging ability.

    I began by searching the program for The following files were not copied over properly:. It turns out to be in _print_report(), followed by printing the keys and values of %$missed_reports. remeber for later, the error is associated with the values in the hash..

    In the last two lines of code, $missed_reports is passed to _print_report(), obtained from _get_reports().

    Near the bottom of _get_reports(), we find:

    $problem .= "$site_folder "; $missed_reports{$_} = $problem;

    So the question is, what are $site_folder and $_?

    The inner loop sets $site_folder as the elements of $site_array, which is populated at the top of the outer loop. And $_ is set by the outer loop.

    Pheeeww, huff, puff; that was a long trail ....

    Using $_ as the default loop control variable is occassionally convenient because it avoids one variable.... and especially when there are some operators or functions which accept the default variable.

    In this case, the default status is not used: $_ is used as an ordinary variable; that outer loop over which $_ appears is quite large---30 lines. So there is no advantage to using $_; instead, it serves as a counter-documentation, minimizing comprehensability of this variable.

    Of course, code grows. You start off with something that was going to be short and to the point, and several specification changes later it has become a behemoth. That's why I generally avoid using default variables except in things like one-line map or grep constrructs, or in tiny programs. In any program ( as opposed to short script ), I consider whether anything is gained by anonymous brevity that is a significant improvement over a self-documenting variable name.

    --
    TTTATCGGTCGTTATATAGATGTTTGCA

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://290720]
Approved by ybiC
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others contemplating the Monastery: (6)
As of 2024-03-29 05:39 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found