I do not know enough about your requirements to tell whether my suggestions make sense. The question is: do your client files really need to be updated every minute, or even every 10 minutes or every hour? Probably not. I suspect you need to process the log files quite often, but do not necessarily need to update the client files as often.

Based on these assumptions, I can think of two general types of solution.

One is to read the log files, store the daily activity in a database, and write the database content out to the client files once per day (or at whatever interval better suits your needs). The advantage is that the overhead of opening so many files occurs only once per day.
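A minimal sketch of this first idea, written in Python with SQLite only for the sake of illustration (the same structure should carry over to Perl with DBI). The table layout, the function names and the `<client_id>.log` file naming are all assumptions of mine, not something taken from your setup. Records are assumed to arrive as (client id, log line) pairs.

```python
import os
import sqlite3

def open_store(db_path):
    """Open (or create) the staging database. The schema is hypothetical."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS activity ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " client_id TEXT NOT NULL,"
        " line TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_client ON activity (client_id)")
    return conn

def store_records(conn, records):
    """Called each time a log is processed: one transaction, no client file opened."""
    with conn:
        conn.executemany(
            "INSERT INTO activity (client_id, line) VALUES (?, ?)", records
        )

def flush_to_client_files(conn, out_dir):
    """Called once per day: each client file is opened exactly once."""
    os.makedirs(out_dir, exist_ok=True)
    current_id, handle = None, None
    rows = conn.execute(
        "SELECT client_id, line FROM activity ORDER BY client_id, id"
    )
    try:
        for client_id, line in rows:
            if client_id != current_id:
                if handle:
                    handle.close()
                # Assumed naming scheme: one "<client_id>.log" file per client.
                handle = open(os.path.join(out_dir, f"{client_id}.log"), "a")
                current_id = client_id
            handle.write(line + "\n")
    finally:
        if handle:
            handle.close()
    # Empty the staging table once everything has been written out.
    with conn:
        conn.execute("DELETE FROM activity")
```

Here `store_records` would run on every log you process, and `flush_to_client_files` only on the daily (or hourly) schedule you settle on. Ordering by client id is what keeps the flush down to one open per client.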

Another idea is to pseudo-hash your client logs into temporary files. For example, you could store in one file all log records for clients whose customer number ends with 00, in another file those for clients whose customer number ends with 01, and so on up to 99. That way, each time you read a log, and assuming you sort its records by the last two digits of the customer number, you need to open at most 100 files for writing, which means much less overhead than 18K files. Then, again, once per day (or on whatever schedule better fits your needs), you process these temporary files to put the records into the final client files. I am fairly sure that such a mechanism would give you a huge gain.
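The bucketing idea could look roughly like the sketch below, again in Python only for illustration. The file names (`bucket_NN.tmp`, `<client_id>.log`), the tab-separated record layout and the function names are hypothetical. It assumes customer numbers are digit strings and that a log line contains no newline.

```python
import os
from collections import defaultdict

def bucket_name(client_id):
    """Last two digits of the customer number, zero-padded ("7" -> "07")."""
    return client_id[-2:].zfill(2)

def append_to_buckets(records, bucket_dir):
    """Called for each log read: opens at most 100 temporary files."""
    os.makedirs(bucket_dir, exist_ok=True)
    grouped = defaultdict(list)
    for client_id, line in records:
        # Keep the client id with the line so it can be routed later.
        grouped[bucket_name(client_id)].append(f"{client_id}\t{line}\n")
    for bucket, lines in grouped.items():
        path = os.path.join(bucket_dir, f"bucket_{bucket}.tmp")
        with open(path, "a") as fh:
            fh.writelines(lines)

def distribute_buckets(bucket_dir, out_dir):
    """Called once per day: moves the records into the final client files."""
    os.makedirs(out_dir, exist_ok=True)
    for name in sorted(os.listdir(bucket_dir)):
        if not name.endswith(".tmp"):
            continue
        path = os.path.join(bucket_dir, name)
        per_client = defaultdict(list)
        with open(path) as fh:
            for row in fh:
                client_id, line = row.rstrip("\n").split("\t", 1)
                per_client[client_id].append(line + "\n")
        # Each client file is opened once per bucket pass.
        for client_id, lines in per_client.items():
            with open(os.path.join(out_dir, f"{client_id}.log"), "a") as fh:
                fh.writelines(lines)
        os.remove(path)
```

Grouping the records in memory before writing plays the role of sorting the log by the last two digits: each temporary file is opened once per log rather than once per record.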

Of course, 100 temporary files and one processing run per day are just arbitrary numbers that I picked because they made some sense to me. You may want to change both numbers to something else if that makes more sense in your case. It could be once per hour, and it could be more or fewer temporary files. You have to figure out the best combination based on your knowledge of the situation and actual tests on the data.


In reply to Re: best way to fast write to large number of files by Laurent_R
in thread best way to fast write to large number of files by Hosen1989
