in reply to optimize the code

Untested, but I think this should run substantially more quickly.

The basic idea is that, instead of converting all 11 million GMT dates to match your PST target date, you convert the target date to GMT once and use a simple regex to do the matching:

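    use Date::Manip;

    # Convert the target date to GMT once, up front
    my $target = UnixDate(
        Date_ConvTZ( ParseDate("3 days ago"), 'PST', 'GMT' ),
        "%e/%h/%Y"
    );
    open DATA, ">$ARGV[1]" or die "Cannot write $ARGV[1]: $!";
    open FH, "$ARGV[0]" or die "Cannot read $ARGV[0]: $!";
    # Print only the lines whose bracketed date matches the target
    m/\[$target:/ and print DATA $_ while <FH>;
    close DATA;
    close FH;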

If there might be other dates embedded in the log that would be matched by the regex [...:, then you might need to elaborate the regex to isolate the required date.
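One way to elaborate the regex, as a minimal sketch: require the whole bracketed timestamp (time and zone offset included) rather than just `[date:`. The `$target` value and the sample lines here are made up for illustration, and the timestamp layout is assumed to be the usual `[dd/Mon/yyyy:hh:mm:ss +zzzz]` form.

```perl
use strict;
use warnings;

# Hypothetical target date, in the same dd/Mon/yyyy form as the log.
my $target = '21/Jun/2010';

# Require a complete timestamp, not just "[date:", so that stray
# bracketed dates elsewhere in the line are not matched.
my $stamp = qr{\[\Q$target\E:\d\d:\d\d:\d\d [+-]\d{4}\]};

# Illustrative lines: the second one has the target date embedded in
# the request, but its real timestamp is a different day.
my @lines = (
    '67.195.112.248 - - [21/Jun/2010:20:09:42 +0000] GET',
    '10.0.0.1 - - [22/Jun/2010:01:00:00 +0000] GET /x?note=[21/Jun/2010:draft]',
);

my @hits = grep { $_ =~ $stamp } @lines;
print "$_\n" for @hits;
```

The `\Q...\E` guards against any regex metacharacters in the interpolated date; compiling once with `qr` keeps the per-line cost low.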

Alternatively, if as your sample suggests the required date is at a set offset from the start of the line, you might use:

    substr( $_, 34, 11 ) eq $target and print DATA $_ while <FH>;

which as a straight string compare would be even quicker.
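If the offset turns out not to be fixed (for example, because the leading fields vary in length), a variation on the same string-compare idea is to locate the first `[` with `index` and compare from there. This is only a sketch: `date_matches` is a hypothetical helper, and it assumes the timestamp is the first bracketed field on the line.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the text right after the first '['
# is exactly $target (e.g. "21/Jun/2010").
sub date_matches {
    my ($line, $target) = @_;
    my $pos = index($line, '[');
    return 0 if $pos < 0;    # no bracket on this line
    return substr($line, $pos + 1, length $target) eq $target;
}

# Illustrative lines with leading fields of different lengths.
for my $line ('74.13.151.1 - - [22/Jun/2010:06:00:00 +0000] GET',
              '67.195.112.248 - - [21/Jun/2010:20:09:42 +0000] GET') {
    print "$line\n" if date_matches($line, '21/Jun/2010');
}
```

This still avoids the regex engine and any date parsing, at the cost of one `index` call per line.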

This assumes that your "3 days earlier" runs midnight to midnight GMT on that day. If you need to cater for the timezone shift of the start and end of day, then things get more complicated. But your code doesn't appear to be doing that.

In that case I would probably calculate the unixtime (seconds since epoch) of the start and end times, convert the log date/times to the same and use a numeric compare:

    print if $logSecs > $startSecs && $logSecs < $endSecs;
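A rough sketch of how the epoch comparison could be set up using only the core `Time::Local` module. Several things here are assumptions rather than anything from the original script: `log_epoch` and `pst_day_window` are hypothetical helpers, PST is treated as a fixed UTC-8 offset with no daylight saving, the target day is hard-coded, and the window is half-open (`>=` start, `<` end) so that a line stamped exactly at midnight belongs to exactly one day.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MONTH = (Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
             Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11);

# Assumption: PST is a fixed UTC-8 offset (daylight saving is ignored).
my $PST_OFFSET = -8 * 3600;

# Epoch seconds for the bracketed timestamp in a log line, or undef
# if the line has no timestamp in the expected layout.
sub log_epoch {
    my ($line) = @_;
    my ($d, $mon, $y, $h, $mi, $s, $sign, $oh, $om) = $line =~
        m{\[(\d\d?)/(\w{3})/(\d{4}):(\d\d):(\d\d):(\d\d) ([+-])(\d\d)(\d\d)\]}
        or return;
    return unless exists $MONTH{$mon};
    # Zone offset of the stamp in seconds; UTC = local time - offset.
    my $offset = ($oh * 3600 + $om * 60) * ($sign eq '-' ? -1 : 1);
    return timegm($s, $mi, $h, $d, $MONTH{$mon}, $y) - $offset;
}

# Start and end (exclusive) of a PST calendar day, as epoch seconds.
sub pst_day_window {
    my ($y, $m, $d) = @_;
    my $start = timegm(0, 0, 0, $d, $m - 1, $y) - $PST_OFFSET;
    return ($start, $start + 86400);
}

# Illustrative target day and lines; a real script would compute the
# day from "3 days ago" and read the lines from the input file.
my ($startSecs, $endSecs) = pst_day_window(2010, 6, 21);
for my $line ('74.13.151.1 - - [22/Jun/2010:06:00:00 +0000] GET',
              '10.0.0.1 - - [22/Jun/2010:08:00:00 +0000] GET') {
    my $logSecs = log_epoch($line);
    print "$line\n"
        if defined $logSecs && $logSecs >= $startSecs && $logSecs < $endSecs;
}
```

The window is computed once, so the per-line work is one regex match and one `timegm` call, with no Date::Manip parsing inside the loop.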


Re^2: optimize the code
by Anonymous Monk on Jun 25, 2010 at 05:18 UTC
    This is the input file; it contains a mix of GMT dates.
        74.13.151.1 - - [22/Jun/2010:06:00:00 +0000] GET
        67.195.112.248 - - [21/Jun/2010:20:09:42 +0000] GET
        99.138.106.5 - - [21/Jun/2010:23:10:18 +0000] GET
        99.138.106.5 - - [21/Jun/2010:09:10:18 +0000] GET
    When I run the script using the code below:
        #!/usr/bin/perl
        use strict;
        use warnings;
        use Date::Manip;

        my $date_converted = UnixDate(ParseDate("3 days ago"),"%e/%h/%Y");
        open DATA,">$ARGV[1]";
        open FH,"$ARGV[0]";
        while(<FH>){
            my @tab_delimited_array = split(/\t/,$_);
            $tab_delimited_array[3] =~ s/^\[//;
            $tab_delimited_array[3] =~ s/^\-//;
            chomp($tab_delimited_array[3]);
            if(length($tab_delimited_array[3]) > 1) {
                my $date_format = UnixDate($tab_delimited_array[3],"%Y%m%d%H:%M:%S");
                my $converted_date = Date_ConvTZ("$date_format",'GMT','PST');
                my $pst_converted_date = UnixDate($converted_date,"%e/%h/%Y:%H:%M:%S");
                $pst_converted_date =~ s/^\s//g;
                my $extracted_YMD=UnixDate($converted_date,"%e/%h/%Y");
                $_ =~ s/$tab_delimited_array[3]/$pst_converted_date/g;
                if($extracted_YMD =~ m/$date_converted/){
                    print DATA $_;
                }
            }
        }
        close DATA;
        close FH;
    the output is:
        74.13.151.1 - - [21/Jun/2010:22:00:00 +0000] GET
        67.195.112.248 - - [21/Jun/2010:12:09:42 +0000] GET
        99.138.106.5 - - [21/Jun/2010:15:10:18 +0000] GET
    When I use your code, it only matches lines in the input file whose GMT date is 3 days ago:
        my $target = UnixDate(Date_ConvTZ( ParseDate("3 days ago"), 'GMT', 'PST' ),"%e/%h/%Y");
        print $target;
        open DATA,">$ARGV[1]";
        open FH,"$ARGV[0]";
        m/\[$target:/ and print DATA $_ while <FH>;
        close DATA;
        close FH;
    Output is:
        67.195.112.248 - - [21/Jun/2010:20:09:42 +0000] GET
        99.138.106.5 - - [21/Jun/2010:23:10:18 +0000] GET
    Please tell me how to optimize the code that reads the date/time from the input file and converts it to PST:
        while(<FH>){
            my @tab_delimited_array = split(/\t/,$_);
            $tab_delimited_array[3] =~ s/^\[//;
            $tab_delimited_array[3] =~ s/^\-//;
            chomp($tab_delimited_array[3]);
            if(length($tab_delimited_array[3]) > 1) {
                my $date_format = UnixDate($tab_delimited_array[3],"%Y%m%d%H:%M:%S");
                my $converted_date = Date_ConvTZ("$date_format",'GMT','PST');
                my $pst_converted_date = UnixDate($converted_date,"%e/%h/%Y:%H:%M:%S");
                $pst_converted_date =~ s/^\s//g;
                my $extracted_YMD=UnixDate($converted_date,"%e/%h/%Y");
                $_ =~ s/$tab_delimited_array[3]/$pst_converted_date/g;
                if($extracted_YMD =~ m/$date_converted/){
                    print DATA $_;
                }
            }
        }
      Please help me with this. It takes a long time to read the input file, convert the dates to PST and match the date from 3 days ago.

        If you've tried all of the advice you've already been given regarding optimization, benchmarking and profiling and your code isn't fast enough, get a faster computer with faster disks and memory.