The reason I am going to try splitting the script across two machines is that I was thinking the same thing as Perlbotics above: there is just too much load on the resources if I try to do everything at the same time. By splitting the global task into two parts, I can either run both concurrently on two machines or in sequence on one, depending on my time constraints.
But just in case, here is the rest of the code you were asking for. Am I doing something horribly wrong there that is causing needless load on the CPU?
The load data function just executes this SQL:
my $sql = qq{ LOAD DATA LOCAL INFILE \"$mysql_tmp_filename\"
INTO TABLE `$tblname`
FIELDS TERMINATED BY ','
OPTIONALLY enclosed by '\"'
ESCAPED BY '\\\\'
LINES TERMINATED BY '\\r\\n'
IGNORE 3 LINES
(file_name,\@file_p,file_size,\@file_la,\@file_lc,\@file_c, file_extension)
set file_last_access = STR_TO_DATE(\@file_la, '%c/%e/%Y %l:%i %p'),
file_path = \@file_p,
file_share_name = substring_index(substring_index(\@file_p,'\\\\',3),'\\\\',-1),
file_last_change = STR_TO_DATE(\@file_lc, '%c/%e/%Y %l:%i %p'),
file_creation = STR_TO_DATE(\@file_c, '%c/%e/%Y %l:%i %p')
};
my $sth = $dbh->prepare($sql);
$sth->execute();
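One thing worth double-checking: with DBD::mysql, LOAD DATA LOCAL INFILE only works if local-infile support was enabled when the handle was created, and silent failures are easy to miss without RaiseError. A minimal sketch of the connect step (the $host, $db, $user, and $pass values are placeholders, not taken from your code):

```perl
use strict;
use warnings;
use DBI;

# Hypothetical connection values -- substitute your own.
my ( $host, $db, $user, $pass ) = ( 'localhost', 'filedb', 'me', 'secret' );

my $dbh = DBI->connect(
    "DBI:mysql:database=$db;host=$host;mysql_local_infile=1",  # enable LOCAL
    $user, $pass,
    { RaiseError => 1 },    # die loudly instead of failing silently
);

# For a one-shot statement like LOAD DATA, do() is enough; it returns
# the number of rows affected, which is a cheap sanity check.
# my $rows = $dbh->do($sql);
```

With RaiseError set you also don't need to hand-check the return values of prepare() and execute().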
and here is fix_file_4mysql:
sub fix_file_4mysql {    # argument: path to the temp file
    # Returns the filename escaped for use in a MySQL query, after
    # doubling every backslash in the file's contents for LOAD DATA.
    my $tmp_filename = shift;

    open my $fh, '+<', $tmp_filename
        or die "Cannot open $tmp_filename: $!";
    my @file = <$fh>;    # slurp the whole file, then rewrite in place
    seek $fh, 0, 0;
    foreach my $line (@file) {
        $line =~ s/\\/\\\\/g;    # double every backslash
        print {$fh} $line;
    }
    close $fh or die "Cannot close $tmp_filename: $!";

    # Escape the filename itself the same way for use in the SQL
    ( my $mysql_tmp_filename = $tmp_filename ) =~ s/\\/\\\\/g;
    return $mysql_tmp_filename;
}
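Since you are worried about load: slurping the whole temp file into @file holds every line in memory at once, which hurts on big listings. A sketch of the same escaping done line-by-line through a second file instead (the _stream name and the .esc suffix are my inventions, not from your code):

```perl
use strict;
use warnings;
use File::Copy qw(move);

# Same job as fix_file_4mysql, but streams through a scratch file so
# only one line is ever in memory.
sub fix_file_4mysql_stream {
    my $tmp_filename = shift;

    open my $in,  '<', $tmp_filename       or die "read $tmp_filename: $!";
    open my $out, '>', "$tmp_filename.esc" or die "write $tmp_filename.esc: $!";
    while ( my $line = <$in> ) {
        $line =~ s/\\/\\\\/g;    # double every backslash for LOAD DATA
        print {$out} $line;
    }
    close $in;
    close $out or die "close $tmp_filename.esc: $!";

    # Replace the original with the escaped copy
    move( "$tmp_filename.esc", $tmp_filename )
        or die "rename $tmp_filename.esc: $!";

    # Escape the filename itself for use in the SQL statement
    ( my $mysql_tmp_filename = $tmp_filename ) =~ s/\\/\\\\/g;
    return $mysql_tmp_filename;
}
```

The regex and the return value are unchanged; only the memory profile differs.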