ukhare has asked for the wisdom of the Perl Monks concerning the following question:

I have a text file which consists some data of 24 hours time stamp segregated in 10 minutes interval.

2016-02-06,00:00:00,ujjawal,36072-2,MT,37,0,1 2016-02-06,00:10:00,ujjawal,36072-2,MT,37,0,1 2016-02-06,00:20:00,ujjawal,36072-2,MT,37,0,1 2016-02-06,00:40:00,ujjawal,36072-2,MT,37,0,1 2016-02-06,00:50:00,ujjawal,36072-2,MT,42,0,2 2016-02-06,01:00:00,ujjawal,36072-2,MT,55,0,2 2016-02-06,01:10:00,ujjawal,36072-2,MT,41,0,2 2016-02-06,01:20:00,ujjawal,36072-2,MT,46,0,2 2016-02-06,01:30:00,ujjawal,36072-2,MT,56,0,3 2016-02-06,01:40:00,ujjawal,36072-2,MT,38,0,2 2016-02-06,01:50:00,ujjawal,36072-2,MT,49,0,2 2016-02-06,02:00:00,ujjawal,36072-2,MT,58,0,4 2016-02-06,02:10:00,ujjawal,36072-2,MT,43,0,2 2016-02-06,02:20:00,ujjawal,36072-2,MT,46,0,2 2016-02-06,02:30:00,ujjawal,36072-2,MT,61,0,2 2016-02-06,02:40:00,ujjawal,36072-2,MT,57,0,3 2016-02-06,02:50:00,ujjawal,36072-2,MT,45,0,2 2016-02-06,03:00:00,ujjawal,36072-2,MT,45,0,3 2016-02-06,03:10:00,ujjawal,36072-2,MT,51,0,2 2016-02-06,03:20:00,ujjawal,36072-2,MT,68,0,3 2016-02-06,03:30:00,ujjawal,36072-2,MT,51,0,2 2016-02-06,03:40:00,ujjawal,36072-2,MT,68,0,3 2016-02-06,03:50:00,ujjawal,36072-2,MT,67,0,3 2016-02-06,04:00:00,ujjawal,36072-2,MT,82,0,8 2016-02-06,04:10:00,ujjawal,36072-2,MT,82,0,5 2016-02-06,04:20:00,ujjawal,36072-2,MT,122,0,4 2016-02-06,04:30:00,ujjawal,36072-2,MT,133,0,3 2016-02-06,04:40:00,ujjawal,36072-2,MT,142,0,3 2016-02-06,04:50:00,ujjawal,36072-2,MT,202,0,1 2016-02-06,05:00:00,ujjawal,36072-2,MT,731,1,3 2016-02-06,05:10:00,ujjawal,36072-2,MT,372,0,7 2016-02-06,05:20:00,ujjawal,36072-2,MT,303,0,2 2016-02-06,05:30:00,ujjawal,36072-2,MT,389,0,3 2016-02-06,05:40:00,ujjawal,36072-2,MT,454,0,1 2016-02-06,05:50:00,ujjawal,36072-2,MT,406,0,6 2016-02-06,06:00:00,ujjawal,36072-2,MT,377,0,1 2016-02-06,06:10:00,ujjawal,36072-2,MT,343,0,5 2016-02-06,06:20:00,ujjawal,36072-2,MT,370,0,2 2016-02-06,06:30:00,ujjawal,36072-2,MT,343,0,9 2016-02-06,06:40:00,ujjawal,36072-2,MT,315,0,8 2016-02-06,06:50:00,ujjawal,36072-2,MT,458,0,3 2016-02-06,07:00:00,ujjawal,36072-2,MT,756,1,3 2016-02-06,07:10:00,ujjawal,36072-2,MT,913,1,3 2016-02-06,07:20:00,ujjawal,36072-2,MT,522,0,3 2016-02-06,07:30:00,ujjawal,36072-2,MT,350,0,7 2016-02-06,07:40:00,ujjawal,36072-2,MT,328,0,6 2016-02-06,07:50:00,ujjawal,36072-2,MT,775,1,3 2016-02-06,08:00:00,ujjawal,36072-2,MT,310,0,9 2016-02-06,08:10:00,ujjawal,36072-2,MT,308,0,6 2016-02-06,08:20:00,ujjawal,36072-2,MT,738,1,3 2016-02-06,08:30:00,ujjawal,36072-2,MT,294,0,6 2016-02-06,08:40:00,ujjawal,36072-2,MT,345,0,1 2016-02-06,08:50:00,ujjawal,36072-2,MT,367,0,6 2016-02-06,09:00:00,ujjawal,36072-2,MT,480,0,3 2016-02-06,09:10:00,ujjawal,36072-2,MT,390,0,3 2016-02-06,09:20:00,ujjawal,36072-2,MT,436,0,3 2016-02-06,09:30:00,ujjawal,36072-2,MT,1404,2,3 2016-02-06,09:40:00,ujjawal,36072-2,MT,346,0,3 2016-02-06,09:50:00,ujjawal,36072-2,MT,388,0,3 2016-02-06,10:00:00,ujjawal,36072-2,MT,456,0,2 2016-02-06,10:10:00,ujjawal,36072-2,MT,273,0,7 2016-02-06,10:20:00,ujjawal,36072-2,MT,310,0,3 2016-02-06,10:30:00,ujjawal,36072-2,MT,256,0,7 2016-02-06,10:40:00,ujjawal,36072-2,MT,283,0,3 2016-02-06,10:50:00,ujjawal,36072-2,MT,276,0,3 2016-02-06,11:00:00,ujjawal,36072-2,MT,305,0,1 2016-02-06,11:10:00,ujjawal,36072-2,MT,310,0,9 2016-02-06,11:20:00,ujjawal,36072-2,MT,286,0,3 2016-02-06,11:30:00,ujjawal,36072-2,MT,286,0,3 2016-02-06,11:40:00,ujjawal,36072-2,MT,247,0,7 2016-02-06,11:50:00,ujjawal,36072-2,MT,366,0,2 2016-02-06,12:00:00,ujjawal,36072-2,MT,294,0,2 2016-02-06,12:10:00,ujjawal,36072-2,MT,216,0,5 2016-02-06,12:20:00,ujjawal,36072-2,MT,233,0,1 2016-02-06,12:30:00,ujjawal,36072-2,MT,785,1,2 2016-02-06,12:40:00,ujjawal,36072-2,MT,466,0,1 2016-02-06,12:50:00,ujjawal,36072-2,MT,219,0,9 2016-02-06,13:00:00,ujjawal,36072-2,MT,248,0,6 2016-02-06,13:10:00,ujjawal,36072-2,MT,223,0,7 2016-02-06,13:20:00,ujjawal,36072-2,MT,276,0,8 2016-02-06,13:30:00,ujjawal,36072-2,MT,219,0,6 2016-02-06,13:40:00,ujjawal,36072-2,MT,699,1,2 2016-02-06,13:50:00,ujjawal,36072-2,MT,439,0,2 2016-02-06,14:00:00,ujjawal,36072-2,MT,1752,2,3 2016-02-06,14:10:00,ujjawal,36072-2,MT,203,0,5 2016-02-06,14:20:00,ujjawal,36072-2,MT,230,0,7 2016-02-06,14:30:00,ujjawal,36072-2,MT,226,0,1 2016-02-06,14:40:00,ujjawal,36072-2,MT,195,0,6 2016-02-06,14:50:00,ujjawal,36072-2,MT,314,0,1 2016-02-06,15:00:00,ujjawal,36072-2,MT,357,0,2 2016-02-06,15:10:00,ujjawal,36072-2,MT,387,0,9 2016-02-06,15:20:00,ujjawal,36072-2,MT,1084,1,3 2016-02-06,15:30:00,ujjawal,36072-2,MT,1295,2,3 2016-02-06,15:40:00,ujjawal,36072-2,MT,223,0,8 2016-02-06,15:50:00,ujjawal,36072-2,MT,254,0,1 2016-02-06,16:00:00,ujjawal,36072-2,MT,252,0,7 2016-02-06,16:10:00,ujjawal,36072-2,MT,268,0,1 2016-02-06,16:20:00,ujjawal,36072-2,MT,242,0,1 2016-02-06,16:30:00,ujjawal,36072-2,MT,254,0,9 2016-02-06,16:40:00,ujjawal,36072-2,MT,271,0,3 2016-02-06,16:50:00,ujjawal,36072-2,MT,244,0,7 2016-02-06,17:00:00,ujjawal,36072-2,MT,281,0,1 2016-02-06,17:10:00,ujjawal,36072-2,MT,190,0,8 2016-02-06,17:20:00,ujjawal,36072-2,MT,187,0,1 2016-02-06,17:30:00,ujjawal,36072-2,MT,173,0,9 2016-02-06,17:40:00,ujjawal,36072-2,MT,140,0,5 2016-02-06,17:50:00,ujjawal,36072-2,MT,147,0,6 2016-02-06,18:00:00,ujjawal,36072-2,MT,109,0,4 2016-02-06,18:10:00,ujjawal,36072-2,MT,99,0,1 2016-02-06,18:20:00,ujjawal,36072-2,MT,66,0,6 2016-02-06,18:30:00,ujjawal,36072-2,MT,67,0,4 2016-02-06,18:40:00,ujjawal,36072-2,MT,40,0,2 2016-02-06,18:50:00,ujjawal,36072-2,MT,52,0,3 2016-02-06,19:00:00,ujjawal,36072-2,MT,40,0,3 2016-02-06,19:10:00,ujjawal,36072-2,MT,30,0,2 2016-02-06,19:20:00,ujjawal,36072-2,MT,25,0,3 2016-02-06,19:30:00,ujjawal,36072-2,MT,35,0,4 2016-02-06,19:40:00,ujjawal,36072-2,MT,14,0,1 2016-02-06,19:50:00,ujjawal,36072-2,MT,97,0,7 2016-02-06,20:00:00,ujjawal,36072-2,MT,14,0,1 2016-02-06,20:10:00,ujjawal,36072-2,MT,12,0,4 2016-02-06,20:20:00,ujjawal,36072-2,MT,11,0,2 2016-02-06,20:30:00,ujjawal,36072-2,MT,12,0,1 2016-02-06,20:40:00,ujjawal,36072-2,MT,6,0,1 2016-02-06,20:50:00,ujjawal,36072-2,MT,13,0,2 2016-02-06,21:00:00,ujjawal,36072-2,MT,5,0,1 2016-02-06,21:10:00,ujjawal,36072-2,MT,12,0,2 2016-02-06,21:20:00,ujjawal,36072-2,MT,1,0,1 2016-02-06,21:30:00,ujjawal,36072-2,MT,21,0,2 2016-02-06,21:50:00,ujjawal,36072-2,MT,9,0,3 2016-02-06,22:00:00,ujjawal,36072-2,MT,2,0,1 2016-02-06,22:10:00,ujjawal,36072-2,MT,12,0,5 2016-02-06,22:20:00,ujjawal,36072-2,MT,1,0,1 2016-02-06,22:30:00,ujjawal,36072-2,MT,9,0,1 2016-02-06,22:40:00,ujjawal,36072-2,MT,13,0,1 2016-02-06,23:00:00,ujjawal,36072-2,MT,20,0,2 2016-02-06,23:10:00,ujjawal,36072-2,MT,10,0,3 2016-02-06,23:20:00,ujjawal,36072-2,MT,10,0,1 2016-02-06,23:30:00,ujjawal,36072-2,MT,6,0,1 2016-02-06,23:40:00,ujjawal,36072-2,MT,12,0,1

if you see above sample as per 10 minutes interval there should be total 143 rows in 24 hours in this file but after second last line which has time 2016-02-06,23:40:00 data for date, time 2016-02-06,23:50:00 is missing.

similarly after 2016-02-06,22:40:00 data for date, time 2016-02-06,22:50:00 is missing.

can we insert missing date,time followed by 6 null separated by commas e.g. 2016-02-06,22:50:00,null,null,null,null,null,null where ever any data missing in rows of this file based on count no 143 rows and time stamp comparison in rows 2016-02-06,00:00:00 to 2016-02-06,23:50:00 which is also 143 in count ?

Replies are listed 'Best First'.
Re: insert null in missing rows
by Athanasius (Archbishop) on Mar 08, 2016 at 07:26 UTC

    Hello ukhare,

    What have you tried so far, and how has it fallen short of your requirements?

    I think the basic algorithm is quite straightforward:

    1. Keeping track of the next expected timestamp, read through the file line by line:
    2. When the timestamp read is as expected, save the current line (unchanged) in an array;
    3. When the timestamp is not as expected, construct the missing line from the current date, the expected timestamp, and 6 nulls, and add it onto the array.
    4. After each line, update the expected timestamp by 10 minutes.
    5. When finished, rename the input file as a backup,
    6. and then overwrite the input file with the array of lines accumulated in steps 1 to 4 above.

    The only tricky part is incrementing the timestamp. Here is one approach:

    sub inc { my ($next) = @_; my ($hours, $mins, $secs) = split /:/, $next; $mins += 10; if ($mins >= 60) { ++$hours; $mins -= 60; } return sprintf '%02d:%02d:%02d', $hours, $mins, $secs; }

    which you would call like this:

    $next = inc($next);

    Update: Changed $mins = 0; to $mins -= 60; to allow for the possibility that minutes are not multiples of 10.

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: insert null in missing rows
by FreeBeerReekingMonk (Deacon) on Mar 08, 2016 at 08:54 UTC
    Another approach could be preallocating hashed of your timedate, then looping over your file, and write the data out. Maybe leave out the seconds, as that might sometimes not be :00 depending on the load of the system

    This code might help to loop, and prepend 0 to numbers smaller than 2 digits:

    # First, we read the file, put the data into a hash # now, we loop on the day: $day = '2016-02-06'; for $hour (00..23){ for($min=00;$min<60; $min+=10){ $key = sprintf("%s,%02d:%02d", $day,$hour,$min); print "Checking if $key:00 exists\n"; } }

    ps: The real world solution is using a RRD (round robin database)

Re: insert null in missing rows
by GotToBTru (Prior) on Mar 08, 2016 at 13:14 UTC

    Don't forget to test to see if the very first record, 00:00:00, is missing, too!

    But God demonstrates His own love toward us, in that while we were yet sinners, Christ died for us. Romans 5:8 (NASB)

A reply falls below the community's threshold of quality. You may see it by logging in.