PerlMonks  

In that case, your process isn't either of those. It's just regular mixed processing; and in reality, as it's extracting a very large volume of data, it probably qualifies as memory-bound.
BrowserUk, thank you for your comment. Yes, I guess "regular mixed processing" is a fair description of our data extraction processes, but most of them are still more IO-bound than anything else. A few are algorithmically more complex or require, for example, a certain amount of sorting, so these are closer to mixed processing and possibly also memory-bound to some extent.

But I do not think that "memory-bound" is right for most of our processes. Basically (and with some simplification, because there is quite a lot of business logic involved), the typical application we run does something like this: it reads every active subscriber from the database subscriber table; for each of these cell phone subscribers, it goes into two or three dozen other tables to look up more detailed information (billing services, network services, supplementary services, commercial segment, applicable rate plan, prepaid amounts, last bill date, next bill date, etc.); and, once all the relevant information about the subscriber has been collected, it writes it to a CSV flat file that will be used for further processing later on.
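To make the shape of that loop concrete, here is a minimal sketch in Python, with an in-memory SQLite database standing in for the real one. The table names, column names and sample rows are all invented for illustration, and only two lookups are shown where the real process does two or three dozen. The point is only that each subscriber is read, enriched and written out in turn, so the program itself never holds more than one record.

```python
import csv
import io
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE subscriber (id INTEGER PRIMARY KEY, msisdn TEXT, active INTEGER);
        CREATE TABLE rate_plan (subscriber_id INTEGER, name TEXT);
        CREATE TABLE billing (subscriber_id INTEGER, last_bill_date TEXT, next_bill_date TEXT);
        INSERT INTO subscriber VALUES
            (1, '0600000001', 1), (2, '0600000002', 0), (3, '0600000003', 1);
        INSERT INTO rate_plan VALUES (1, 'prepaid'), (3, 'postpaid');
        INSERT INTO billing VALUES (1, '2015-01-05', '2015-02-05');
    """)
    return conn


def extract_subscribers(conn, out):
    """Write one CSV line per active subscriber; return the number of lines."""
    writer = csv.writer(out, lineterminator="\n")
    lookup = conn.cursor()
    count = 0
    # Iterate over the cursor rather than calling fetchall(): only the
    # current row is held by the program, whatever the table size.
    for sub_id, msisdn in conn.execute(
        "SELECT id, msisdn FROM subscriber WHERE active = 1 ORDER BY id"
    ):
        # Stand-ins for the many per-subscriber lookups of the real process.
        plan = lookup.execute(
            "SELECT name FROM rate_plan WHERE subscriber_id = ?", (sub_id,)
        ).fetchone()
        bill = lookup.execute(
            "SELECT last_bill_date, next_bill_date FROM billing "
            "WHERE subscriber_id = ?",
            (sub_id,),
        ).fetchone()
        writer.writerow([
            sub_id,
            msisdn,
            plan[0] if plan else "",
            bill[0] if bill else "",
            bill[1] if bill else "",
        ])
        count += 1
    return count


if __name__ == "__main__":
    buffer = io.StringIO()
    lines = extract_subscribers(build_demo_db(), buffer)
    print(buffer.getvalue(), end="")
    print(f"{lines} subscribers written")
```

In a loop of this shape, nearly all the time goes into the per-subscriber queries and the file write, which is why I describe it as mostly IO-bound.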

The CSV line we write for one subscriber rarely exceeds several hundred bytes (a few thousand at most), but this still adds up to a large data volume: there are about 35 million subscribers to extract, so we produce files ranging from several GB to tens of GB.
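As a rough check on those orders of magnitude, the following Python lines multiply the subscriber count by a few assumed average line sizes. The 35 million figure is the one given above; the three line sizes are illustrative picks within the range mentioned, not measured values.

```python
SUBSCRIBERS = 35_000_000  # approximate subscriber count quoted above


def file_size_gb(bytes_per_line, subscribers=SUBSCRIBERS):
    """Estimated output file size in GB (10**9 bytes) for a given line size."""
    return bytes_per_line * subscribers / 1e9


if __name__ == "__main__":
    # Assumed average line sizes, in bytes (illustrative only).
    for size in (100, 300, 1000):
        print(f"{size:>5} bytes/line -> {file_size_gb(size):.1f} GB")
```

With these assumptions the estimates run from 3.5 GB at 100 bytes per line to 35 GB at 1000 bytes per line, consistent with "several GB to tens of GB".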

At no point in this process do most of these programs directly use a lot of memory; usually very little, in fact. As I mentioned in my previous post, the underlying database engine and the system may use quite a bit of memory for IO buffering, data caching, transaction maintenance and so on, but these are things over which we have limited or no control.

Having said that, there are some exceptions: some of our programs need to load a lot of reference data into memory (at least three parameter tables associated with the call rating engine exceed one million records). But these programs are very different and do not require parallel processing, because they don't scan the full customer database; they usually reprocess files (error calls, unallocated calls, error logs) whose sizes never exceed half a GB and are usually much smaller (typically a few MB).

Also note that I discussed only one of our regular activities on one specific platform (two servers). We do many other things on other platforms, with other applications and other OSes, but this activity is more or less the only one (that I know of) in which we really need to fine-tune a lot of parallel processing as much as we can to improve performance.

Je suis Charlie.

In reply to Re^3: Useful number of childs revisited [SOLVED] by Laurent_R
in thread Useful number of childs revisited [SOLVED] by karlgoethebier
