in reply to Re^2: <speed a major issue with below code also loosing data while writing in files>
in thread <speed a major issue with below code also loosing data while writing in files>

This code:

foreach $uniquekey1 (keys %myHash1) {
    $uniq2 = $file_name1.'.csv';
    sysopen($CircleGroupHandle1, "$ARGV[0]/$uniq2", O_WRONLY|O_APPEND|O_CREAT)
        or die "Error writing to $!";

If you look closely, the file name is composed of "$ARGV[0]/$file_name1.csv". All of the variables involved are set outside the foreach loop, and if they don't change inside it, the loop opens the same file over and over.

The first optimization would be to extract the invariants from the loop. That means that $file_name and the file handle can be set up outside the foreach loop. This will prevent your program from opening the same file multiple times (once per key in %myHash). Let's say %myHash has 10k keys: that means you are opening the same file 10k times, without closing it.
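As a rough illustration of hoisting the open out of the loop, here is a minimal sketch. The helper name append_records and its arguments are hypothetical and not taken from the posted script; it assumes the lines to write are already collected in a hash.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_APPEND O_CREAT);

# Hypothetical helper: append one line per key to a single file,
# opening the file once instead of once per key.
sub append_records {
    my ($dir, $file_name, $records) = @_;   # $records: hash ref, key => line

    # Loop-invariant work: build the path and open the file once.
    my $path = "$dir/$file_name.csv";
    sysopen(my $out, $path, O_WRONLY | O_APPEND | O_CREAT)
        or die "Cannot open $path: $!";

    for my $key (sort keys %$records) {
        print {$out} "$records->{$key}\n";
    }

    # close flushes buffered output; checking it reports failed writes.
    close($out) or die "Cannot close $path: $!";
    return $path;
}
```

Called as append_records($ARGV[0], $file_name1, \%lines_by_key), this performs one open and one close no matter how many keys the hash holds.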


Replies are listed 'Best First'.
Re^4: <speed a major issue with below code also loosing data while writing in files>
by Anonymous Monk on Jul 26, 2011 at 15:45 UTC
    Here you will get the idea of what I am trying to do. File name: abc_def_hij.csv. Data on any line, comma separated: aaa,abc,aaa,def,aaa,hij…….. etc. Data on another line, comma separated: xxx,abc,xxx,def,xxx,hij..... etc. Another source file may contain comma-separated data such as bbb,abc,bbb,def,bbb,hij .... etc., so the data from that file should be appended to the abc_def_hij.csv file. I will be grateful if you can solve this. Another thing: if I exclude strict and warnings everything works fine, but it takes approximately 5 minutes to read 1 million records and create such files from them.
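The routing described in this post could be sketched roughly as follows. This is only an illustration under assumptions: the function name split_by_fields is hypothetical, and the field positions that make up the file name are passed in by the caller (positions 1, 3 and 5 match the abc_def_hij example, but the real script uses other columns). It keeps one open handle per output file, so each file is opened once however many lines go into it.

```perl
use strict;
use warnings;

# Hypothetical sketch: send each CSV line to a file named after some of
# its fields, keeping one open handle per output file.
sub split_by_fields {
    my ($in_path, $out_dir, @key_idx) = @_;
    my %handle_for;    # output path => open file handle

    open(my $in, '<', $in_path) or die "Cannot open $in_path: $!";
    while (my $line = <$in>) {
        chomp $line;
        my @fields = split /,/, $line;
        next if grep { !defined $fields[$_] } @key_idx;   # skip short lines

        my $path = "$out_dir/" . join('_', @fields[@key_idx]) . '.csv';

        # Open in append mode only the first time this file is seen.
        my $out = $handle_for{$path};
        unless ($out) {
            open($out, '>>', $path) or die "Cannot open $path: $!";
            $handle_for{$path} = $out;
        }
        print {$out} "$line\n";
    }
    close($in);

    # Close every output handle and report any failed write.
    for my $path (keys %handle_for) {
        close($handle_for{$path}) or die "Cannot close $path: $!";
    }
    return scalar keys %handle_for;   # number of distinct output files
}
```

For example, split_by_fields('source.csv', $ARGV[0], 1, 3, 5) would append both sample lines above to abc_def_hij.csv. If the data produces a very large number of distinct file names, the per-process limit on open files may be reached, in which case the cache would need to close handles it no longer needs.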

      Can you post your new code?

      Also, you said it takes 5 minutes to process 1M records, so what is your target time?

        Dear Monk, believe me, the code is the same; not a single line has changed. Whatever I posted is all there is. Somehow it was running fine without strict and warnings. Speed is the major concern: it should finish within 1 minute, as my C code also takes approximately 2 minutes to process that big file. Please help me with this. Below is the code:

        use POSIX;
        $num_args = $#ARGV + 1;
        if ($num_args != 1) {
            print "\nUsage: Spool.pl Require two argumrnts \n";
            exit;
        }
        my @SMSFileList = `ls | grep CSV`;
        chop(@SMSFileList);
        my $FileCount = 0;
        my @lines;
        my ($lines,$CDR,$CDR1,$uniquekey,$uniquekey1,$file_name,$file_name1,$CircleGroupHandle,$CircleGroupHandle1,$uniq1,$uniq2,$fh,$fh1);
        my $targetdir=$ARGV[0];
        foreach my $SMSFileName (@SMSFileList) {
            $FileCount++;
            sysopen (SOURCE_SMS_FILE,"$SMSFileName",O_RDONLY) or die "Error opening $SMSFileName";
            my %myHash;
            my %myHash1;
            while(<SOURCE_SMS_FILE>) {
                if ($_ =~ m/^,/) {
                    my @lines= split(",",$_);
                    chop($_);
                    if ($lines[68] =~ /^I$/) {
                        $uniquekey= $lines[17].$lines[68].$lines[21].$lines[44].substr($lines[27],0,8);
                        #actual content of file having filename also#
                        $CDR = substr($lines[27],0,8).','.$lines[28].','.$lines[36].','.$lines[23].','.$lines[24].','.$lines[91].','.$lines[92].','.$lines[101].','.$lines[15].','.$lines[18].','.$lines[75].','.$lines[21].','.$lines[44].','.$lines[14].','.substr($lines[69],0,8).','.$lines[1].','.$lines[13].','.$lines[50];
                        #filename from content#
                        $file_name = $lines[17].'_'.$lines[68].'_'.$lines[21].'_'.$lines[44].'_'.substr($lines[27],0,8);
                        $myHash{$uniquekey}++;
                    }
                    if ($lines[68] =~ /^O$/) {
                        $uniquekey1= $lines[17].$lines[68].$lines[22].$lines[45].substr($lines[27],0,8);
                        $CDR1 = substr($lines[27],0,8).','.$lines[28].','.$lines[37].','.$lines[23].','.$lines[24].','.$lines[91].','.$lines[92].','.$lines[101].','.$lines[16].','.$lines[19].','.$lines[75].','.$lines[22].','.$lines[45].','.$lines[14].','.substr($lines[69],0,8).','.$lines[1].','.$lines[13].','.$lines[50];
                        $file_name1 = $lines[17].'_'.$lines[68].'_'.$lines[22].'_'.$lines[45].'_'.substr($lines[27],0,8);
                        $myHash1{$uniquekey1}++;
                    }
                }
                foreach $key (keys %myHash) {
                    $uniq1=$file_name;
                    sysopen($CircleGroupHandle,"$ARGV[0]/$uniq1.csv",O_RDWR|O_APPEND|O_CREAT)or die "Error writing to $GroupSMSFileName";
                    print $CircleGroupHandle "$CDR\n";
                }
                %myHash = ();
                %$uniquekey = ();
                $uniquekey = {};
                foreach $key1 (keys %myHash1) {
                    $uniq2=$file_name1;
                    sysopen($CircleGroupHandle1,"$ARGV[0]/$uniq2.csv",O_RDWR|O_APPEND|O_CREAT)or die "Error writing to $GroupSMSFileName";
                    print $CircleGroupHandle1 "$CDR1\n";
                }
                %myHash1 = ();
                %$uniquekey1 = ();
                $uniquekey1 = {};
            }
            close(SOURCE_SMS_FILE);
            close($CircleGroupHandle) or die "Error closing to $CircleGroupHandle";
            close($CircleGroupHandle1) or die "Error closing to $CircleGroupHandle";
        }