in reply to Re^4: Best way to match a hash with large CSV file
in thread Best way to match a hash with large CSV file

OK, let's go! Give me an FTP drop site and I will give you a .zip file with data and code!

I just ran an easy test (for me) with 818 files and 266,551 DB "rows": < 100 seconds to read all the files, create the DB, and index the DB six ways from Sunday.

  • Comment on Re^5: Best way to match a hash with large CSV file

Replies are listed 'Best First'.
Re^6: Best way to match a hash with large CSV file
by BrowserUk (Patriarch) on Nov 06, 2011 at 13:42 UTC
    < 100 seconds

    ... isn't less than 6 seconds.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Oops... this is not an "apples vs oranges" comparison.

      Talking about 100 seconds is a mistake.
      I don't think that anybody said anything about X seconds.

      The OP said: "The problem with this approach is that the CSV file (40MB) is loaded to DBI engine 5,000 times and takes hours to process."

      The answer is NO! What the OP wants can be had very fast, and it is related to the indexing of the database.
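
To make the point about indexing concrete, here is a minimal sketch of the idea: load the CSV into the database once, index the lookup column, then run every lookup against the indexed table instead of reloading the file each time. It is written in Python with the standard-library sqlite3 module only so that it is self-contained; it is not the code from my test, and the same pattern should carry over to Perl DBI. The table name `records`, the columns `key` and `value`, the index name, and the sample data are all made up for illustration.

```python
import csv
import io
import sqlite3

def build_indexed_db(csv_text):
    """Load the CSV into SQLite ONCE and index the lookup column."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical two-column layout; a real file would have its own columns.
    conn.execute("CREATE TABLE records (key TEXT, value TEXT)")
    rows = csv.reader(io.StringIO(csv_text))
    conn.executemany("INSERT INTO records (key, value) VALUES (?, ?)", rows)
    # The index is what lets each lookup avoid scanning the whole table.
    conn.execute("CREATE INDEX idx_records_key ON records (key)")
    conn.commit()
    return conn

def lookup_all(conn, wanted):
    """Run one query per wanted key against the already-loaded table."""
    found = {}
    for key in wanted:
        row = conn.execute(
            "SELECT value FROM records WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            found[key] = row[0]
    return found

# Made-up sample data standing in for the large CSV file.
sample_csv = "a1,alpha\nb2,beta\nc3,gamma\n"
conn = build_indexed_db(sample_csv)
result = lookup_all(conn, ["b2", "zz", "c3"])
print(result)
```

The design choice is simply that the expensive step (parsing and loading the file) happens once, and the thousands of hash keys are then matched with cheap indexed queries.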