toadi has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I use a dial-up connection and want to keep track of my internet costs.
I wrote a script that calculates my costs, but now I'd like to know if there's an efficient way to parse the log (I don't really want to show my solution because it stinks).

The log file is like:

10:09:55	05/28/00	Sunday
10:37:45	05/28/00	Sunday
09:27:02	05/29/00	Monday
09:34:05	05/29/00	Monday
20:48:00	05/29/00	Monday
20:57:19	05/29/00	Monday
21:38:32	05/29/00	Monday
21:40:14	05/29/00	Monday
09:32:50	05/30/00	Tuesday
09:37:29	05/30/00	Tuesday
13:03:23	05/30/00	Tuesday
13:09:45	05/30/00	Tuesday
13:19:17	05/30/00	Tuesday
The first line is the start time, the second line the stop time, and so on. The fields are tab-separated. I used split to get the time, date and day into separate variables.
But from there I used an *ugly*, unorthodox way to calculate the amount of time I was online.

Any ideas?

Replies are listed 'Best First'.
Re: parsing
by btrott (Parson) on Jul 27, 2000 at 00:47 UTC
    Something like this might do the trick:
    use Date::Manip;

    my $total = ParseDateDelta("");
    while (<>) {
        chomp;
        my $start = $_;
        chomp(my $end = <>);

        ## Build date/time stamps for start and end.
        $start = join ' ', (split /\t/, $start)[1,0];
        $end   = join ' ', (split /\t/, $end)[1,0];

        ## Add delta of start and end stamps to
        ## total delta.
        $total = DateCalc( DateCalc($start, $end), $total);
    }

    print "Total hours online: ", Delta_Format($total, 0, "%hd"), "\n";
    It uses Date::Manip quite liberally, as you can tell. :)
Re: parsing
by cwest (Friar) on Jul 27, 2000 at 01:02 UTC
    #!/usr/bin/perl -w
    use strict;
    $|++;

    use Time::Local;

    my $online = [
        map {
            my @f = split /[:\/\s]+/;
            # timelocal wants (sec, min, hour, mday, mon, year)
            # with a zero-based month, hence the "- 1".
            timelocal( @f[2,1,0,4], $f[3] - 1, $f[5] );
        } <DATA>
    ];

    my $total = 0;
    for ( local $_ = 0; $_ <= $#{$online}; $_ += 2 ) {
        $total += $online->[$_+1] - $online->[$_];
    }

    print "$total seconds online\n";
    __DATA__
    10:09:55	05/28/00	Sunday
    10:37:45	05/28/00	Sunday
    09:27:02	05/29/00	Monday
    09:34:05	05/29/00	Monday
    20:48:00	05/29/00	Monday
    20:57:19	05/29/00	Monday
    21:38:32	05/29/00	Monday
    21:40:14	05/29/00	Monday
    09:32:50	05/30/00	Tuesday
    09:37:29	05/30/00	Tuesday
    13:03:23	05/30/00	Tuesday
    13:09:45	05/30/00	Tuesday
    Enjoy!
    --
    Casey
    
      I've never used map before. I read the docs but don't really understand it. Can you make it a bit clearer what it does?

      --
      My opinions may have changed,
      but not the fact that I am right

        The easiest way to think of map is to reduce it to something a bit more familiar.
        @result = map SOME_EXPRESSION, @input;
        can be replaced with:
        @TEMP = ();
        foreach $_ (@input) {
            push @TEMP, SOME_EXPRESSION;
        }
        @result = @TEMP;
        In other words, evaluate SOME_EXPRESSION for each @input item, and take the result(s) of that, concatenated together, to make an output. While the expression is being evaluated, $_ contains the "current item" from @input.

        -- Randal L. Schwartz, Perl hacker

        This should probably be a Q&A, but "I understand your pain" in regard to map. Up until _very_ recently I didn't really understand it either (and probably still don't, not like the true monks), but here goes:

        Map executes a block of code for each element in a list and returns a list of the results. The magical variable $_ is set (actually localized, so it isn't clobbered outside the block) to each element in turn within the block. Map is _extremely_ powerful in the hands of the likes of merlyn and friends (e.g. the Schwartzian Transform).
        A couple (simple) examples to illustrate:
        Copy a list (i.e. do nothing; same as @b = @a;):

        @b = map {$_} @a;

        Take a list of numbers (a vector), square them, add them together and take the square root (Euclidean N-dimensional distance):

         $dist = sqrt(eval join("+",map {$_**2} @a));

        Explanation of that last one: map squares each element in @a (say 1,2,3,4,5) and returns a new list (1,4,9,16,25); then join makes the scalar "1+4+9+16+25", eval turns that into 55, and finally sqrt returns 7.41619..., which is assigned to $dist.
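
        The same distance can also be computed without the string eval. This is only a minimal sketch, assuming the core List::Util module is available; the sub name euclid_norm is made up for illustration and is not part of the original post:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: Euclidean length of a vector given as a list.
sub euclid_norm {
    my @v = @_;
    return 0 unless @v;    # sum() returns undef for an empty list
    # map squares each element, sum adds the squares together.
    return sqrt( sum( map { $_ ** 2 } @v ) );
}

printf "%.5f\n", euclid_norm(1, 2, 3, 4, 5);    # prints sqrt(55)
```

        Here map still does the squaring; sum simply replaces the join-and-eval step, which avoids building and evaluating a string of Perl code.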

        For another example see A minor epiphany with grep and map, as well as every piece of documentation on the topic you can get your hands on. I think what did it for me (finally) was Effective Perl Programming by Joseph N. Hall with Randal L. Schwartz. The section (Item 12) is only 4 pages long and also covers grep and foreach, but it was the light switch for me.
        -ase

(Adam: Use a better log format) RE: parsing
by Adam (Vicar) on Jul 27, 2000 at 00:49 UTC
    Go back to the script that generates the log in the first place and store the epoch instead of the data you have. (or in addition to this stuff if you like to read it)
    Then you can just subtract one epoch from the other to get total number of seconds. (The epoch is returned from time() )

    So, for example, if your log now reads:

    964644652 Wed Jul 26 13:50:52 2000
    964644691 Wed Jul 26 13:51:31 2000
    You can parse this like so:
    use strict; # Always.

    my $logname = "log.txt";
    open LOG, $logname or die "Failed to open $logname, $!";

    my( $start, $stop );
    while( $start = <LOG> )
    {
        ($start) = split /\s/, $start;
        $stop = <LOG> or die "No End Time in LOG!\n";
        ($stop) = split /\s/, $stop;
        my $time_in_seconds = $stop - $start;
        # Do something with $time_in_seconds
        print "$time_in_seconds\n";
    }
    close LOG or die "Failed to close $logname, $!";
    To get the log format I gave you I used:
    use strict;
    my $t = time();
    print $t, "\t", scalar localtime($t), "\n";

    Update: tye pointed out that split will take care of the remainder for me, so I took that out.

    Also, you might want to note in the log which lines are 'start' and which are 'stop' so that you can catch errors (like failing to record the 'stop' time). Right now, you would only notice such an error if there is an odd number of entries (and only at the end of the script, when it can't find an end time for the last entry). Of course, then you would need to parse that extra bit of info from the log. But entries like:

    START 964644691 Wed Jul 26 13:51:31 2000
    STOP 964644691 Wed Jul 26 13:51:31 2000
    should not be hard to parse. Just call split /\s/ as before, and make the receiving array ($tag, $epoch) or whatever.
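
    A tagged log like that could be checked along these lines. This is a rough sketch only: the sub name total_online and the wording of the problem messages are invented for illustration, and it assumes every line starts with START or STOP followed by the epoch:

```perl
use strict;
use warnings;

# Hypothetical helper: sum the online time from tagged log lines of the form
# "START <epoch> <readable date>" / "STOP <epoch> <readable date>".
# Returns the total in seconds followed by a list of problems found.
sub total_online {
    my @lines = @_;
    my ($total, $start, @problems) = (0);
    for my $line (@lines) {
        my ($tag, $epoch) = split ' ', $line;
        next unless defined $tag;    # skip blank lines
        if ($tag eq 'START') {
            # Two STARTs in a row mean a STOP was never recorded.
            push @problems, "START at $start has no STOP" if defined $start;
            $start = $epoch;
        }
        elsif ($tag eq 'STOP') {
            if (defined $start) {
                $total += $epoch - $start;
                undef $start;
            }
            else {
                push @problems, "STOP at $epoch has no START";
            }
        }
        else {
            push @problems, "Unknown tag '$tag'";
        }
    }
    push @problems, "START at $start has no STOP" if defined $start;
    return ($total, @problems);
}

my ($seconds, @problems) = total_online(
    "START 964644652 Wed Jul 26 13:50:52 2000",
    "STOP 964644691 Wed Jul 26 13:51:31 2000",
);
print "$seconds seconds online\n";
warn "$_\n" for @problems;
```

    Unlike the untagged format, a missing STOP is reported where it happens instead of silently shifting every later pair.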
      Actually I'm working on Linux and I use the ip-up and ip-down scripts to write the start and stop times to this log.
      --
      My opinions may have changed,
      but not the fact that I am right

Re: parsing
by davorg (Chancellor) on Jul 27, 2000 at 00:48 UTC

    Once you've split the date and time into bits, you can pass the various parts into timelocal to get the number of seconds since the epoch for each start and end time. Subtracting start time from end time will give you the number of seconds that you spent online.

    Note: When using timelocal, don't forget that it requires values in the same form that localtime produces, i.e. year is years since 1900 and month is zero-based.
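
    As a rough illustration of that note, one log line could be converted like this. It is a sketch under assumptions: the sub name line_to_epoch is made up, and the two-digit year in the log is assumed to mean 20xx:

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);

# Hypothetical helper: turn one log line such as
# "10:09:55\t05/28/00\tSunday" into seconds since the epoch.
sub line_to_epoch {
    my ($line) = @_;
    my ($hour, $min, $sec, $mon, $mday, $yy) =
        $line =~ m{^(\d+):(\d+):(\d+)\s+(\d+)/(\d+)/(\d+)}
        or die "Unparseable log line: $line";
    # Month is zero-based; year is years since 1900 (assumed 20xx here).
    return timelocal($sec, $min, $hour, $mday, $mon - 1, $yy + 100);
}

my $start = line_to_epoch("10:09:55\t05/28/00\tSunday");
my $stop  = line_to_epoch("10:37:45\t05/28/00\tSunday");
print $stop - $start, " seconds online\n";
```

    Forgetting the "- 1" on the month would shift every date one month forward, which makes timelocal reject a day that does not exist in the following month (for example the 31st).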

    --
    <http://www.dave.org.uk>

    European Perl Conference - Sept 22/24 2000, ICA, London
    <http://www.yapc.org/Europe/>
Re: parsing
by jettero (Monsignor) on Jul 27, 2000 at 00:50 UTC
    I came up with this:
    #!/usr/bin/perl

    use strict;
    use Date::Calc qw /Delta_DHMS/;

    my $total = 0;
    my $d;

    open in, "log" or die "Can't open log: $!";
    while($d = <in>) {
        # Start line: Delta_DHMS wants (year, month, day, hour, min, sec).
        $d =~ m/(\d+):(\d+):(\d+)\s+(\d+)\/(\d+)\/(\d+)/;
        my @d1 = (2000 + $6, $4, $5, $1, $2, $3);

        # Stop line.
        $d = <in>;
        $d =~ m/(\d+):(\d+):(\d+)\s+(\d+)\/(\d+)\/(\d+)/;
        my @d2 = (2000 + $6, $4, $5, $1, $2, $3);

        my ($days, $hours, $minutes, $seconds) = Delta_DHMS( @d1, @d2 );
        $total += ($days * 86400 + $hours * 3600 + $minutes * 60 + $seconds );
    }
    close in;

    print "Total seconds: $total\n";
Re: parsing
by le (Friar) on Jul 27, 2000 at 00:28 UTC
    My very first post to Perlmonks was PTkPPP. Maybe you'll find some useful snippets in there. It's a modem dialer which logs and calculates costs.
Re: parsing
by jeorgen (Pilgrim) on Jul 28, 2000 at 00:07 UTC
    You can use the Time::ParseDate module in the Time-modules distribution.
    It will parse a variety of date and time formats into seconds automagically; your formats should be accepted directly without any special configuration parameters.

    /jeorgen