Re: Re: Re: Re: Optimising processing for large data files.

Example 1.

...nothing to do with whether you are using true garbage collection.

I never used the phrase "true garbage collection".

Example 2.

You also included a false assertion about when databases can give a performance improvement.

Wrong. To quote you: "Sure, databases would not help with this problem."
Example 3.

Consider the case where you have a very large table,...

No, I will not consider that case. That case has no relevance to this discussion, nor to any assertions I made.
My assertion, in the context of the post (re-read the title!) was:
If you have a large volume of data in a flat file, and you need to process that data in it's entirety, then moving that data into a database will never allow you to process it faster.
That is the assertion I made. That is the only assertion I made with regard to databases.
Unless you can use some (fairly simple, so that it can be encapsulated into an SQL query) criteria to reduce the volume of the data that the application needs to process, moving the data into a DB will not help.
No matter how you cut it, switch it around and mix it up. For any given volume of data that an application needs to process, reading that volume of data from a flat file will always be quicker than retrieving it from a DB. Full stop.
No amount of what-if scenarios will change that nor correct any misassertion I didn't make.

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

Comment on Re: Re: Re: Re: Optimising processing for large data files.

Replies are listed 'Best First'.

Re: Re: Re: Re: Re: Optimising processing for large data files.
by tilly (Archbishop) on Apr 11, 2004 at 07:26 UTC

Example 1.
...nothing to do with whether you are using true garbage collection.
I never used the phrase "true garbage collection".

The process consumed less than 2MB of memory total. There was no memory growth and the GC never had to run.

Example 2.
You also included a false assertion about when databases can give a performance improvement.
Wrong. To quote you: "Sure, databases would not help with this problem."

Databases are never quicker unless you can use some fairly simplistic criteria to make wholesale reductions in the volume of the data that you need to process within your application program.

Example 3.
Consider the case where you have a very large table,...
No, I will not consider that case. That case has no relevance to this discussion, nor to any assertions I made.

Databases are never quicker unless you can use some fairly simplistic criteria to make wholesale reductions in the volume of the data that you need to process within your application program.

My assertion, in the context of the post (re-read the title!) was:
If you have a large volume of data in a flat file, and you need to process that data in it's entirety, then moving that data into a database will never allow you to process it faster.

Another example which comes to mind is having to sort a very large dataset. (As in several GB of data.) A lot of research has gone into efficient sorting algorithms, and a lot of that research has gone into database design. Again, moving data into the database can win.

That is the assertion I made. That is the only assertion I made with regard to databases.
Unless you can use some (fairly simple, so that it can be encapsulated into an SQL query) criteria to reduce the volume of the data that the application needs to process, moving the data into a DB will not help.

I've given an example where it happens, and I've pointed you at an area of work where people customarily run into this issue.

No matter how you cut it, switch it around and mix it up. For any given volume of data that an application needs to process, reading that volume of data from a flat file will always be quicker than retrieving it from a DB. Full stop.

in addition

Obviously unless the database is a good match for what you are going to do, and Perl is not, you would be insane to add that overhead to your process.

But if it is a match, then the database can win. Sometimes by a lot. Despite its obvious overhead.

No amount of what-if scenarios will change that nor correct any misassertion I didn't make.

Now you can end the thread however you wish to.

[reply]


Think about Loose Coupling
	PerlMonks