Maire has asked for the wisdom of the Perl Monks concerning the following question:

Good afternoon monks,

I have a very large .txt file containing approximately 800,000 lines of text. Each line, at some point, contains at least two timestamps which are currently in Unix time. Below are examples of two lines from my data:

Misc data that is unimportant:false,true,false,"retrieved_on":1433507459,more misc data,nul;,true,"created_utc":"1433122764",
"retrieved_on":1433507472,misc data,"created_utc":"1433122764",true,false,more misc data

I am trying to convert all of the Unix timestamps to a more traditional date and time format using the following code:

use strict;
use utf8;

open(FILE, '<C:/Users/li/perl/part1.txt') || die "File not found";
my @lines = <FILE>;
close(FILE);

my @newlines;
foreach(@lines) {
    $_ =~ s/([\d]{10})/localtime $1/eg;
    push(@newlines,$_);
}

open(FILE, '>C:/Users/li/perl/part1_CORRECTED.txt') || die "File not found";
print FILE @newlines;
close(FILE);

However, while this code does technically work, it returns the date and time in a format that is not helpful when I analyse my data further:

Misc data that is unimportant:false,true,false,"retrieved_on":Fri Jun 5 13:30:59 2015,more misc data,nul;,true,"created_utc":"Mon Jun 1 02:39:24 2015",
"retrieved_on":Fri Jun 5 13:31:12 2015,misc data,"created_utc":"Mon Jun 1 02:39:24 2015",true,false,more misc data

Is there a way that I can modify my code to return the output in a more intuitive style (e.g. maybe YYYY:MM:DD HH:MM:SS or similar)?

Thank you in advance!

Replies are listed 'Best First'.
Re: Converting multiple unix timestamps
by Tux (Canon) on Jul 27, 2018 at 14:40 UTC
    use strict;
    use warnings;

    sub date {
        my @d = localtime (shift || time);
        sprintf "%4d-%02d-%02d %02d:%02d:%02d", $d[5] + 1900, ++$d[4], @d[3,2,1,0];
    } # date

    open my $fi, "<", "C:/Users/li/perl/part1.txt" or die "File not found";
    open my $fo, ">", "C:/Users/li/perl/part1_CORRECTED.txt" or die "Cannot create output: $!";
    while (<$fi>) {
        s/\b([0-9]{10})\b/date ($1)/ge;
        print $fo $_;
    }
    close $fi or die $!;
    close $fo or die $!;

    Enjoy, Have FUN! H.Merijn
      Thank you for this!
Re: Converting multiple unix timestamps
by hippo (Archbishop) on Jul 27, 2018 at 14:34 UTC

    TIMTOWTDI but using the core Time::Piece module makes it pretty trivial:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use Time::Piece;

    my @lines = (
        qq{Misc data that is unimportant:false,true,false,"retrieved_on":1433507459,more misc data,nul;,true,"created_utc":"1433122764",\n},
        qq{"retrieved_on":1433507472,misc data,"created_utc":"1433122764",true,false,more misc data\n}
    );

    foreach (@lines) {
        $_ =~ s/([\d]{10})/(localtime $1)->datetime/eg;
        print;
    }
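
The datetime method used above produces ISO 8601 output with a "T" between date and time. If a space-separated "YYYY-MM-DD HH:MM:SS" form is preferred, a minimal sketch along the same lines could use Time::Piece's strftime method instead. The helper name epoch_to_string and the sample line are hypothetical, chosen only for illustration, and the \b anchors are borrowed from Tux's reply.

```perl
use strict;
use warnings;
use Time::Piece;    # overrides localtime so that it returns an object

# Hypothetical helper (not from the thread): format an epoch value
# as "YYYY-MM-DD HH:MM:SS" in the local time zone.
sub epoch_to_string {
    my ($epoch) = @_;
    return localtime($epoch)->strftime("%Y-%m-%d %H:%M:%S");
}

# Sample line modelled on the data in the question.
my $line = qq{"retrieved_on":1433507472,misc data,"created_utc":"1433122764"\n};

# Replace each standalone 10-digit number with its formatted date.
$line =~ s/\b(\d{10})\b/epoch_to_string($1)/ge;
print $line;
```

The exact hours printed depend on the time zone of the machine running the script.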
      Brilliant, thanks!
Re: Converting multiple unix timestamps
by Laurent_R (Canon) on Jul 27, 2018 at 19:19 UTC
    Your use of the localtime function was a good idea, except that you would need to use it in list context instead of scalar context.

    In scalar context it returns a formatted date string:

    $ perl -e 'my $timestamp = localtime time; print "$timestamp\n"'
    Fri Jul 27 21:08:16 2018
    In list context, it returns a list of nine values defining the date and time (look up the documentation to find out what they are):
    $ perl -e 'my @timefields = localtime time; print join " ", @timefields;'
    23 8 21 27 6 118 5 207 1
    The first three fields say that I ran the command at 21:08:23 (local time), and the next three represent the date (but the date requires some calculations: 6 stands for July, because January is 0, and you need to add 1900 to 118 to get the year). The remaining fields are the day of the week, the day of the year, and the daylight saving time flag.

    So you can do your own calculations as needed, but it is often better to use modules such as Time::Piece already suggested by hippo or some other modules (there are many of them).
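
As a minimal sketch of the list-context approach described above (not code posted in the thread), the core POSIX module's strftime accepts the list returned by localtime directly, so the manual +1900 and +1 adjustments are not needed. The helper names format_epoch and convert_line and the sample line are hypothetical, used only for illustration.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: format an epoch value as "YYYY-MM-DD HH:MM:SS"
# in the local time zone.
sub format_epoch {
    my ($epoch) = @_;
    # localtime in list context returns (sec, min, hour, mday, mon, year, ...);
    # strftime consumes that list as-is.
    return strftime("%Y-%m-%d %H:%M:%S", localtime $epoch);
}

# Hypothetical helper: replace every standalone 10-digit number in a line
# with its formatted date.
sub convert_line {
    my ($line) = @_;
    $line =~ s/\b(\d{10})\b/format_epoch($1)/ge;
    return $line;
}

print convert_line(qq{"retrieved_on":1433507472,"created_utc":"1433122764"\n});
```

Because localtime is used, the hours in the output depend on the machine's time zone; swapping in gmtime would give UTC instead.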

      Ah, thank you. That makes a lot of sense, thanks again!
Re: Converting multiple unix timestamps -- oneliner
by Discipulus (Canon) on Jul 28, 2018 at 10:44 UTC
    Hello Maire

    Condensing what you already have, a short one-liner gets the job done:

    perl -pe "s/\b([0-9]{10})\b/localtime($1)/ge" file_with_timestamps.txt > file_with_dates.txt

    I generally use scalar localtime (time), but here there is no need. The one-liner above is already quoted for Windows, which you appear to be using (double quotes around the code).

    L*

    PS: I had a mistake in the above one-liner: perl -pne instead of perl -pe. It worked the same way (does -p override -n?). Now corrected.

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
      Hi Discipulus,

      I'm afraid you misread the question.

      It seems to me that your code will replace time stamps with date strings such as "Sat Jul 28 20:11:27 2018", but that's not what is desired: Maire wants something like "YYYY:MM:DD HH:MM:SS or similar."