PerlMonks  

Re^2: performance of File Parsing

by Anonymous Monk
on Jul 07, 2011 at 11:26 UTC ( [id://913164] )


in reply to Re: performance of File Parsing
in thread performance of File Parsing

Thanks for the reply.

But I don't mean parsing a continuously updating file.

My concern is a file that has lakhs of records, with the fields in each record separated by semicolons. I need to parse each record, split out the fields, do some calculations, and, depending on which condition the record satisfies, save the result into one of several different files.

I also need to aggregate some fields across the different records that satisfy a certain condition. For this I build a hash while reading, and at the end of the file I do the aggregation and write the result to a file.

For 10 lakh (1 million) records this process takes 3 hours, so I need to optimize it. What I can't work out is which part is slow: is it reading the terabyte-sized file line by line, or is it saving the content into memory (a hash) and writing it to a file at the end?

Replies are listed 'Best First'.
Re^3: performance of File Parsing
by sundialsvc4 (Abbot) on Jul 07, 2011 at 12:26 UTC

    What’s killing you, then, is “that enormous hash.”   You need to replace that logic.

    If you were to plot the throughput of this program, it would describe a nice, exponential curve.   When it reaches the “thrash point,” it smashes into the wall and drops dead.   That’s my blindfolded prediction, but I’ll bet I’m right on the money.

    I suggest stuffing the whole thing into an SQLite database (flat-file), and using queries (within transactions).
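
A minimal sketch of the SQLite suggestion above, offered as an illustration rather than a drop-in solution. It assumes the DBI and DBD::SQLite modules are installed. The three-field layout (id;category;amount), the table name `records`, and the helper name `load_and_aggregate` are all hypothetical, since the original poster's real data and conditions were never shown.

```perl
use strict;
use warnings;
use DBI;

# Load semicolon-separated records into SQLite inside one transaction,
# then let SQL do the aggregation instead of a large in-memory hash.
# ASSUMPTION: hypothetical record layout id;category;amount
sub load_and_aggregate {
    my ($in, $db_file) = @_;

    my $dbh = DBI->connect("dbi:SQLite:dbname=$db_file", '', '',
        { RaiseError => 1, AutoCommit => 1 });

    $dbh->do('CREATE TABLE IF NOT EXISTS records
              (id TEXT, category TEXT, amount REAL)');

    my $ins = $dbh->prepare(
        'INSERT INTO records (id, category, amount) VALUES (?, ?, ?)');

    $dbh->begin_work;    # one transaction for the whole load
    while (my $line = <$in>) {
        chomp $line;
        next unless length $line;
        my ($id, $category, $amount) = split /;/, $line;
        $ins->execute($id, $category, $amount);
    }
    $dbh->commit;
    $ins->finish;

    # Aggregate in the database: one row per category
    my $rows = $dbh->selectall_arrayref(
        'SELECT category, SUM(amount), COUNT(*) FROM records
         GROUP BY category ORDER BY category');
    $dbh->disconnect;
    return $rows;
}

# Usage with a tiny in-memory sample and an in-memory database
my $sample = "1;A;10\n2;B;5\n3;A;2.5\n";
open my $in, '<', \$sample or die "Cannot open sample: $!";
my $totals = load_and_aggregate($in, ':memory:');
printf "%s total=%s count=%d\n", @$_ for @$totals;
```

Wrapping the whole load in a single transaction matters here: with AutoCommit on and no explicit transaction, SQLite commits after every INSERT, which is typically far slower for bulk loads. For a real run, pass a file path instead of `:memory:`.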

    “Don’t ‘diddle’ the code to make it faster ... find a better algorithm.”
    – Kernighan & Plauger; The Elements of Programming Style.
Re^3: performance of File Parsing
by GrandFather (Saint) on Jul 08, 2011 at 22:52 UTC

    Maybe you should show us the key component of your code as a small standalone script, along with a very small sample of data that is just sufficient to demonstrate what your code does. We can help you much more if we know just what you are trying to do than we can when we have to toss up straw men to pitch at.
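
For illustration only, a small standalone script of the kind requested above might look something like the sketch below. It is not the original poster's code, which was never posted. The record layout (id;category;amount), the threshold of 100, the output names `large` and `small`, and the helper name `process_records` are all hypothetical stand-ins for the real fields and conditions. It uses core Perl only and in-memory filehandles so the sample data travels with the script.

```perl
use strict;
use warnings;

# ASSUMPTION: hypothetical record layout id;category;amount
# ASSUMPTION: hypothetical rule: amounts >= 100 go to "large", others to "small"
sub process_records {
    my ($in, $out_for) = @_;    # $out_for: { large => $fh, small => $fh }
    my %total;                  # one running sum per category,
                                # not one hash entry per record

    while (my $line = <$in>) {
        chomp $line;
        next unless length $line;
        my ($id, $category, $amount) = split /;/, $line, -1;

        # Route the record to an output file based on the condition
        my $bucket = $amount >= 100 ? 'large' : 'small';
        print { $out_for->{$bucket} } "$id;$category;$amount\n";

        # Aggregate as we go
        $total{$category} += $amount;
    }
    return \%total;
}

# Very small sample, just enough to exercise both branches
my $sample = <<'END';
1;A;150
2;B;20
3;A;30
4;B;200
END

open my $in, '<', \$sample or die "Cannot open sample: $!";
my ($large, $small) = ('', '');
open my $large_fh, '>', \$large or die "Cannot open buffer: $!";
open my $small_fh, '>', \$small or die "Cannot open buffer: $!";

my $total = process_records($in, { large => $large_fh, small => $small_fh });
close $_ for $large_fh, $small_fh;

print "large:\n$large", "small:\n$small";
print "$_ => $total->{$_}\n" for sort keys %$total;
```

Keeping only a running sum per category means the hash grows with the number of distinct categories rather than the number of records. Whether that is possible depends on the real aggregation, which is exactly why a posted sample would help.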

    True laziness is hard work
