ovedpo15 has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks!,
I have the following subroutine:
my $DEBUG_FILE = "log"; # Global sub debug { my ($t) = @_; return 0 unless(defined($DEBUG_FILE)); open my $fh, ">>",$DEBUG_FILE or return 0; my $prefix = "[".get_date()."][".basename($0)."](".$$."):"; print $fh $prefix.$t."\n"; close ($fh); return 1; }
This subroutine gets a string and logs it into a debug file. This file is for the user to see the progress/errors of using my tool.
I have a few steps in the tool and each one (**could**) take some time. For example, one of the steps, takes an array of paths @paths and do some operation on each path. The operation takes some time so the whole step is long. I want to notify the user of the progress of this step.
In my head I have the following idea: Update the latest line of the debug file with the progress bar. The progress bar would look something like:
[07/10/21 17:56:57][run.pl](211653):<some info example1> [07/10/21 17:57:57][run.pl](211653):<some info example2> [07/10/21 17:58:57][run.pl](211653):<some info example3> [07/10/21 17:59:57][run.pl](211653):Running operation <operation>. Pro +gress: 57%.
Then it will update it to be:
[07/10/21 17:56:57][run.pl](211653):<some info example1> [07/10/21 17:57:57][run.pl](211653):<some info example2> [07/10/21 17:58:57][run.pl](211653):<some info example3> [07/10/21 18:05:57][run.pl](211653):Running operation <operation>. Pro +gress: 58%.
Note that it won't add a new line, it will update the old line.
Also the progress percentage is calculated with 100*$handled_paths/$paths_amount where $paths_amount=scalar(@paths) and $handled_paths is the current handled path.
I think that I could just go to the debug file, and just update the latest line each time - but is it the recommended way to do this?

Replies are listed 'Best First'.
Re: Recommended way to display progress
by kcott (Archbishop) on Jul 11, 2021 at 01:19 UTC

    G'day ovedpo15,

    Three reasons why I wouldn't recommend this approach:

    • You are effectively deleting information from a log. Except for appending information to a log, I would consider it to be very bad practice to modify a log's contents in any other manner.
    • Your progress is based on the number of paths processed. Information about how long each took to process (i.e. from the timestamps) is lost.
    • You will run into all sorts of problems if you try to `tail -f logfile` when its contents are in flux.

    Having said that, if you want to proceed with this — perhaps you just want snapshots of the log instead of continual progress updates — you could use the core module Tie::File. Here's an example (with show_log_to_date() simulating snapshots):

    #!/usr/bin/env perl use strict; use warnings; use Tie::File; my $DEBUG_FILE = 'pm_11134904_log_progress.log'; tie my(@DEBUG_RECORDS), 'Tie::File', $DEBUG_FILE or die "Can't tie $DEBUG_FILE: $!"; debug("$0 started."); for my $step ('A' .. 'C') { debug("Start step $step"); if ($step eq 'B') { my @paths = 1 .. 5; my $overwrite_last = 0; for my $path (@paths) { sleep 1; my $progress = 100 * $path / @paths; my $msg = "Processing step $step Progress: $progress%"; debug($msg, $overwrite_last++); show_log_to_date(); } } else { debug("Processing step $step ..."); } debug("End step $step"); } debug("$0 completed."); untie @DEBUG_RECORDS; show_log_to_date(); sub debug { my ($msg, $overwrite_last) = @_; my $log_entry = join ' ', "".localtime, $msg; pop @DEBUG_RECORDS if $overwrite_last; push @DEBUG_RECORDS, $log_entry; return; } sub show_log_to_date { print '-' x 40, "\n"; system cat => $DEBUG_FILE; }

    Output from a sample run:

    $ ./pm_11134904_log_progress.pl ---------------------------------------- Sun Jul 11 10:20:39 2021 ./pm_11134904_log_progress.pl started. Sun Jul 11 10:20:39 2021 Start step A Sun Jul 11 10:20:39 2021 Processing step A ... Sun Jul 11 10:20:39 2021 End step A Sun Jul 11 10:20:39 2021 Start step B Sun Jul 11 10:20:40 2021 Processing step B Progress: 20% ---------------------------------------- Sun Jul 11 10:20:39 2021 ./pm_11134904_log_progress.pl started. Sun Jul 11 10:20:39 2021 Start step A Sun Jul 11 10:20:39 2021 Processing step A ... Sun Jul 11 10:20:39 2021 End step A Sun Jul 11 10:20:39 2021 Start step B Sun Jul 11 10:20:41 2021 Processing step B Progress: 40% ---------------------------------------- Sun Jul 11 10:20:39 2021 ./pm_11134904_log_progress.pl started. Sun Jul 11 10:20:39 2021 Start step A Sun Jul 11 10:20:39 2021 Processing step A ... Sun Jul 11 10:20:39 2021 End step A Sun Jul 11 10:20:39 2021 Start step B Sun Jul 11 10:20:42 2021 Processing step B Progress: 60% ---------------------------------------- Sun Jul 11 10:20:39 2021 ./pm_11134904_log_progress.pl started. Sun Jul 11 10:20:39 2021 Start step A Sun Jul 11 10:20:39 2021 Processing step A ... Sun Jul 11 10:20:39 2021 End step A Sun Jul 11 10:20:39 2021 Start step B Sun Jul 11 10:20:43 2021 Processing step B Progress: 80% ---------------------------------------- Sun Jul 11 10:20:39 2021 ./pm_11134904_log_progress.pl started. Sun Jul 11 10:20:39 2021 Start step A Sun Jul 11 10:20:39 2021 Processing step A ... Sun Jul 11 10:20:39 2021 End step A Sun Jul 11 10:20:39 2021 Start step B Sun Jul 11 10:20:44 2021 Processing step B Progress: 100% ---------------------------------------- Sun Jul 11 10:20:39 2021 ./pm_11134904_log_progress.pl started. Sun Jul 11 10:20:39 2021 Start step A Sun Jul 11 10:20:39 2021 Processing step A ... Sun Jul 11 10:20:39 2021 End step A Sun Jul 11 10:20:39 2021 Start step B Sun Jul 11 10:20:44 2021 Processing step B Progress: 100% Sun Jul 11 10:20:44 2021 End step B Sun Jul 11 10:20:44 2021 Start step C Sun Jul 11 10:20:44 2021 Processing step C ... Sun Jul 11 10:20:44 2021 End step C Sun Jul 11 10:20:44 2021 ./pm_11134904_log_progress.pl completed.

    Some other points regarding the specific code you posted:

    • You are opening and closing the logfile every time you call debug(). Unless you have a very good reason for doing otherwise, open once near the start of the program, pass a filehandle to debug(), close once near the end of the program.
    • The 'open ... or return 0;' made me cringe: that's no way to handle I/O problems. See open for many examples of how to do this; my code above has an example; also autodie is often a good alternative method (I use this a lot).
      • Use of a lexical filehandle and the 3-argument form of open is very good; it's just the error handling that has a problem.
    • I see 'return 0' used twice in debug(); however, although not shown, I would expect debug() is called in void context, i.e. debug(...) not my $debug_return_value = debug(...). Just use return; in any subroutine that's called in void context.

    — Ken

Re: Recommended way to display progress
by LanX (Saint) on Jul 10, 2021 at 22:58 UTC
    Changing a log file, especially timestamps, is against the idea of a log file.

    Either you only log start and end of your operation, or you additionally log - and keep - a handful of completed intervals in between.

    I'd say something like 10% steps. 10 extra lines is not expensive, but good enough to display progress.

    My 2cents, YMMV.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

Re: Recommended way to display progress
by AnomalousMonk (Archbishop) on Jul 11, 2021 at 00:14 UTC
    The operation takes some time so the whole step is long. I want to notify the user of the progress of this step.

    What is the user looking at? Is he or she looking at a terminal which is displaying some sort of tail of the log file?

    A better approach might be to just directly output to the terminal something like:

    my $max_width = ...; # longest possible progress message text my $percent_done = int(100*$handled_paths/@paths); my $progress = build_user_handhold_message(..., ..., $percent_done); printf "\r%-*s", $max_width, $progress;
    Of course, the latter three lines have to be done in some kind of loop while the process is... well, in process. (Update: And you'd better set $| (autoflush; see perlvar) true within the loop so something will actually print without newlines.)

    The \r in the printf format string is a carriage return: it returns the cursor to the start of the terminal line without doing a newline. The %-*s left-justifies the $progress text in a field of spaces $max_width wide (due to the * "next argument" specifier). IOW, you get a single line that refreshes over and over with updated progress and percent-done info.


    Give a man a fish:  <%-{-{-{-<

Re: Recommended way to display progress
by Fletch (Bishop) on Jul 11, 2021 at 01:24 UTC

    Something like Emoji Progress Spinners, or searching on cpan for "spinner" or "progress", would probably be a better approach.

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

Re: Recommended way to display progress
by Marshall (Canon) on Jul 12, 2021 at 21:14 UTC
    I think we may have what is called an X-Y problem.
    Perhaps you are asking how to do X when you really need to do Y.

    In general a terminal I/F or GUI is much better at displaying progress than an operation using a log file.

    I absolutely hate the standard Windows "Circle of Death". This is often just a separate process that is just making a circle go around completely independent of whether the program is actually doing anything at all! That thing does have some value, but there is a user "patience factor".

    How often does the user need a progress report?:
    In my experience (yours may vary), the user will tolerate about a max of 60 to perhaps 90 seconds before thinking that the process is hung and "not doing anything" and that is even with the "warning" that this utility "can take a long time".

    A main thing that you want to prevent is the user deciding that the program is "hung", and therefore "killing it" prematurely while it is in fact actually continuing to its job just fine!.

    If you can provide a command line I/F, then think about just printing a simple "." per "unit" worked. Don't print dots via separate process (ala the "Windows Circle of Death"). If each dot represents actual progress towards the goal, the user will accept this. If a dot (period) appears at least every 30 seconds, that is often enough. It is completely fine if the appearance of the dots has an irregular rhythm. Humans are "pattern matching machines". The human (if he/she has used this utility before), will recognize a "good" vs "bad" pattern.

    NEVER modify a log file. That is a very bad idea as other Monks have said.

    I suggest that you don't have to get "fancy", just print a dot (period) on the command line I/F that is related to "actual work accomplished" and do that at least every 30 seconds and you will do just fine. The user will believe that the program is still running and won't "kill it".