in reply to Re^2: Netflix (or on handling large amounts of data efficiently in perl)
in thread Netflix (or on handling large amounts of data efficiently in perl)

Random tip. Try http://strawberryperl.com/ and see if it lessens the pain of Windows.

A more technical tip. Try sorting your data and using a btree format for your data. With a hash you do a lot of seeking to disk, and seeks to disk are slow. 1/200th of a second per seek may not sound like a lot, but try doing 100 million of them and you will take the better part of a week. But a btree loaded and accessed in close to sorted order does lots of streaming to/from disk and that is quite fast. (And a merge sort streams data very well.)

  • Comment on Re^3: Netflix (or on handling large amounts of data efficiently in perl)