in reply to Searching Huge files

My first thought is to see whether DBD::CSV can handle this. I suspect it can. Then you'd be able to issue a plain SQL statement to get the data you need. This is likely the biggest bang for the buck - though that probably won't be much bang, it also won't be much buck (it's cheap to implement).
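As a rough sketch of what that might look like - the file name (`data.csv`), column names, and the query itself are all made up for illustration; DBD::CSV treats each file in `f_dir` as a table named after the file:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Connect to the "database" that is just a directory of CSV files.
# f_dir is where DBD::CSV looks for the files; "data" below maps to data.csv.
my $dbh = DBI->connect( "dbi:CSV:", undef, undef, {
    f_dir      => ".",      # directory holding the CSV files (assumption)
    RaiseError => 1,
} ) or die $DBI::errstr;

# Hypothetical query - substitute your real columns and conditions.
my $sth = $dbh->prepare(
    "SELECT id, amount FROM data WHERE amount > 100"
);
$sth->execute;

while ( my $row = $sth->fetchrow_hashref ) {
    print "$row->{id}: $row->{amount}\n";
}
```

Note that DBD::CSV still scans the whole file for every query - you're trading speed for convenience, which is fine for a first cut.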

Once you have that working, the next step may be to load the data into a real database (anywhere from SQLite at one end to DB2 or Oracle at the other), and then ask the database to return the result for what should be essentially the same query - which should be fast. This takes more to set up (copying the data into the database, and possibly standing up a database server, even if it's just SQLite), so the savings have to be significant enough to overcome that setup cost to be worth it. Whether the return on investment is there depends on how important it is to you that your transform run quickly. Even here there's room for tweaking, e.g. how you set up your indexes - if you have in-house database experts, you may want their input on this.
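A minimal sketch of the SQLite variant, assuming the same hypothetical two-column file as above and a simple CSV with no quoted or embedded commas (for anything messier, parse with Text::CSV_XS instead of split):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# One-time setup cost: copy the CSV into an SQLite database file.
my $dbh = DBI->connect( "dbi:SQLite:dbname=data.db", "", "", {
    RaiseError => 1,
    AutoCommit => 0,    # batch the inserts in one transaction for speed
} );

$dbh->do("CREATE TABLE IF NOT EXISTS data (id INTEGER, amount REAL)");

my $ins = $dbh->prepare("INSERT INTO data (id, amount) VALUES (?, ?)");

open my $fh, '<', 'data.csv' or die "data.csv: $!";
<$fh>;                       # skip the header line (assumption)
while ( my $line = <$fh> ) {
    chomp $line;
    my ( $id, $amount ) = split /,/, $line;    # naive CSV parsing
    $ins->execute( $id, $amount );
}
close $fh;
$dbh->commit;

# This is where the tweaking room is: an index on the filtered column.
$dbh->do("CREATE INDEX IF NOT EXISTS idx_amount ON data (amount)");
$dbh->commit;

# Essentially the same query as before, but now it can use the index.
my $rows = $dbh->selectall_arrayref(
    "SELECT id, amount FROM data WHERE amount > 100"
);
print "$_->[0]: $_->[1]\n" for @$rows;
```

The load step is the cost you pay once; after that, repeated queries are where the database earns its keep.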

Just my two cents :-)