I think this method of consolidating and then splitting the logs is very inefficient, but I can at least help by cleaning up the code to some degree:

#!/usr/bin/perl

use strict;
use warnings;

#-------------------------------------------------------

sub generate_date_str ($) {
    my ($time) = @_;

    my ($mday,$mon,$year) = (localtime($time))[3,4,5];
    $year += 1900;
    $mon++;

    return sprintf("%04d%02d%02d", $year, $mon, $mday);
}

#-------------------------------------------------------

sub get_matching_filenames ($$) {
    my ($dir, $match_str) = @_;

    opendir(DIR, $dir) or die "couldn't open directory \"$dir\"";
    my @names = grep {/$match_str/} readdir(DIR);
    closedir(DIR);

    return @names;
}

#-------------------------------------------------------

sub consolidate_logs ($$$) {
    my ($destination_file, $dir, $filename_str) = @_;

    my @files = get_matching_filenames($dir, $filename_str);

    open(OUT,"> $destination_file") or die "Could not open file \"$destination_file\" for writing";

    foreach my $source_file (@files) {
        print "Processing of log \"$source_file\" started at " . localtime() . "\n";

        open(OLD,"< $dir/$source_file") or die("Could not open file \"$dir/$source_file\" for reading");
        while (<OLD>) {
            print OUT $_;
        }
        close(OLD);

        print "Processing of log \"$source_file\" ended at " . localtime() . ".\n";
    }

    close(OUT);
}

#-------------------------------------------------------

sub split_logs ($$$) {
    my ($source_file, $business_list, $filename_prefix) = @_;

    foreach my $business (@$business_list) {
        my ($domain, $file) = @$business;

        my $outfile = "/inside29/urchin/test/newfeed/$filename_prefix-$file";
        my $newigebusiness = $domain;

        print "Creating of log for $newigebusiness started at " . localtime() . "\n";

        open(OUT,">> $outfile") || die("Could not open out file \"$outfile\" for appending");
        open(OLD,"< $source_file") || die ("Could not open the consolidated file \"$source_file\" for reading");
        while (<OLD>) {
            if ((index($_,$newigebusiness))> -1) { 
                print OUT $_;
            }
        }
        close(OLD);
        close(OUT);

        print "Log for $newigebusiness created at " . localtime() . "\n";
    }
}

#-------------------------------------------------------

my @businesses = (
 [ "\"corp.home.ge.com\"", "new_corp_home_ge_com.log" ],
 [ "\"scotland.gcf.home.ge.com\"", "new_scotland_gcf_home_ge_com.log" ],
 [ "\"marketing.ge.com\"", "new_marketing_ge_com.log" ]
);

my $consolidated_log = "consolidatedlog.txt";
my $logfiles_dir = '/inside29/urchin/test/logfiles';

my $today = generate_date_str( time() );
my $yesterday = generate_date_str( time() - (24 * 60 * 60) );

consolidate_logs($consolidated_log, $logfiles_dir, $yesterday);

split_logs($consolidated_log, \@businesses, $today);
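
As for the inefficiency itself: split_logs rereads the whole consolidated file once per business. Here is a minimal sketch of a single-pass alternative, untested against the real logs. The name split_logs_single_pass and the extra $out_dir parameter are my own assumptions (the script above hard-codes the output path); the matching is the same index() substring test.

```perl
use strict;
use warnings;

# Hypothetical single-pass variant of split_logs: each output file is opened
# once, and the source file is read once instead of once per business.
# $out_dir is an assumed extra parameter, not part of the original script.
sub split_logs_single_pass {
    my ($source_file, $business_list, $out_dir, $filename_prefix) = @_;

    # Open one append handle per business up front.
    my @targets;
    foreach my $business (@$business_list) {
        my ($domain, $file) = @$business;
        my $outfile = "$out_dir/$filename_prefix-$file";
        open(my $fh, '>>', $outfile)
            or die "Could not open out file \"$outfile\" for appending: $!";
        push @targets, [ $domain, $fh ];
    }

    open(my $in, '<', $source_file)
        or die "Could not open \"$source_file\" for reading: $!";
    while (my $line = <$in>) {
        # Same substring test as before; a line may match several domains.
        foreach my $target (@targets) {
            my ($domain, $fh) = @$target;
            print {$fh} $line if index($line, $domain) > -1;
        }
    }
    close($in);

    foreach my $target (@targets) {
        close($target->[1]) or die "Could not close an output file: $!";
    }
}
```

With the variables from the script above, the call would look like split_logs_single_pass($consolidated_log, \@businesses, '/inside29/urchin/test/newfeed', $today). The trade-off is one open filehandle per business for the duration of the run, which should be fine for a short list like this one.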

Hope that helps.
bartek

In reply to Re: Reduce the time taken for Huge Log files by Anonymous Monk
in thread Reduce the time taken for Huge Log files by pr19939
