in reply to Re: Delay when write to large number of file
in thread best way to fast write to large number of files

Dear Corion, thanks for the reply.

It's a good idea if we had a low number of clients.

The problem is we have more than 5M subscribers (with more than 2.7M active clients), and we have a continuous input of logs (about 4~5 files every minute), with every log file having ~18K rows.

So it is very rare for a client to have more than 3 records in the same log file, which means it makes little difference whether we sort the data or not.

-- But it would be a good idea to add more than one log file together into that list @RECList.

-- Now, another question: how many rows can we add to the array? Or what is the size limit of an array in Perl?

Also, I know that the operating system has some fault in this (Windows 7 & Windows Server 2008); we really wish we could use Linux, but we cannot :(

And I know the slowness comes from opening and closing too many files. I tried to find some other way to write to files besides this but couldn't find any (I'm a beginner in Perl ~ but I like it very much ^_^).

BR

Hosen


Replies are listed 'Best First'.
Re^3: Delay when write to large number of file
by Corion (Patriarch) on Jun 23, 2014 at 15:19 UTC

    As it currently is, you are doing 18k open+close per file. Open and close are slow on Windows. With my approach, you will reduce the number of open+close. If you process more than one file before writing the output, you can reduce the number of open+close per client even more.

    Perl has no limit for the array size other than available memory.
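    The batching idea Corion describes might look roughly like the sketch below (a hypothetical illustration, not his actual code: the @RECList contents are made-up samples, and a temp directory stands in for the real /ClientRrecord output path). Rows are bucketed by client first, so each client file is opened and closed once per batch instead of once per record.

    ```perl
    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    # Stand-in for the real output directory (/ClientRrecord in the thread).
    my $outdir = tempdir( CLEANUP => 1 );

    # Made-up sample records in the "clientname,record" format from the thread.
    my @RECList = (
        "sam,plaplapla",
        "jame,bobobo",
        "sam,second-row-for-sam",
    );

    # Bucket rows by client filename instead of writing them one at a time.
    my %by_client;
    for my $CDR (@RECList) {
        my ( $filename, $row ) = split /,/, $CDR, 2;
        push @{ $by_client{$filename} }, $row;
    }

    # One open+close per client per batch, not one per record.
    for my $filename ( keys %by_client ) {
        open my $csv_fh, '>>', "$outdir/$filename.csv"
            or die "couldn't open [$filename.csv]: $!";
        print {$csv_fh} "$_\n" for @{ $by_client{$filename} };
        close $csv_fh
            or die "couldn't close [$filename.csv]: $!";
    }
    ```

    Processing several input log files before flushing %by_client would cut the open+close count further, at the cost of holding more rows in memory.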

Re^3: Delay when write to large number of file
by wjw (Priest) on Jun 23, 2014 at 17:40 UTC

    This thought may be way off topic here, but if you can't change OS, and for some reason can't employ Corion's solution, perhaps you might consider the hardware itself? There are some SSDs (Solid State Drives) which reportedly increase performance 100x over spinning drives. If one were to carefully select an SSD that has been proven in testing to perform well with your flavor of OS, you might see some gain that way. If I understand your numbers right, even a 10x performance increase would help you out.

    just a thought...

    (honestly, I think the solution that Corion referred to is the way to go)

    ...the majority is always wrong, and always the last to know about it...

    Insanity: Doing the same thing over and over again and expecting different results...

    A solution is nothing more than a clearly stated problem...otherwise, the problem is not a problem, it is a fact

Re^3: Delay when write to large number of file
by Anonymous Monk on Jun 24, 2014 at 07:20 UTC

    What corion said :)

    On my old 2006 laptop with its 3500rpm hard disk, processing/printing 18k records with a single open/close consistently takes under four seconds.

    Doing an extra 18k open/closes, it takes twice as long or longer (7-27 seconds).

      Dear Friend,

      Are you saying that the following script will finish in less than 30 seconds (with 18K open/close operations)?

      @RECList = .....
      clientname,record
      sam,plaplapla
      jame,bobobo
      kate,sososo
      .....
      print "FLASH A-LIST\n";
      foreach my $CDR (@RECList){
          my ($filename,$row) = split(/,/, $CDR);
          open(my $csv_fh, ">> /ClientRrecord/$filename.csv")
              or die "couldn't open [$filename.csv]\n".$!;
          print { $csv_fh } $row."\n";
      }
        That code doesn't compile, but a version that does compile really is that fast