2. Having said that about printing @data, this is NOT what you want to do! grandfather's code reads the text2 file one line at a time and builds a hash table. It does NOT save a verbatim copy of either the text2 or the text1 input file in an array!
3. Create 2 small files, say 100 lines each, and get grandfather's code running on your machine. The code will run in a few seconds. Then turn it loose on the full-size files that you have. The FIRST STEP before optimizing is to get running code!
From looking at the code, I doubt that you will see much difference between 100 lines and 10,000 lines in file2. I suspect that this thing will run in much less than 10 seconds. If the program runs within a time frame that is acceptable to you, there is probably no need to optimize it.
4. HUGE is relative! This algorithm will not slow down appreciably until the hash built from file2 (the smaller file) no longer fits in memory. I just opened one of my apps that creates a hash table of about 120K entries and sorts and displays it in a Tk GUI. It takes less than 0.5 seconds, and the processing being done is FAR more than in your application.
5. So get working code with a small set of data, and then report back about any problems and size issues when you scale it up.
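
To make the approach above concrete, here is a minimal sketch of the "hash the smaller file, stream the larger one" pattern on 100-line test data. It is not grandfather's actual code. The helper names build_lookup and matching_lines, the choice of the first whitespace-separated field as the key, and the in-memory test data are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: index the smaller file by its first
# whitespace-separated field. The key choice is an assumption,
# not something taken from grandfather's code.
sub build_lookup {
    my ($fh) = @_;
    my %lookup;
    while (my $line = <$fh>) {
        chomp $line;
        my ($key) = split ' ', $line;
        next unless defined $key;    # skip blank lines
        $lookup{$key} = $line;
    }
    return \%lookup;
}

# Hypothetical helper: read the larger file one line at a time and
# keep only the lines whose key is in the lookup. The file itself
# is never slurped into an array.
sub matching_lines {
    my ($fh, $lookup) = @_;
    my @matches;
    while (my $line = <$fh>) {
        chomp $line;
        my ($key) = split ' ', $line;
        next unless defined $key;
        push @matches, "$line => $lookup->{$key}" if exists $lookup->{$key};
    }
    return @matches;
}

# Small in-memory stand-ins for the two input files (100 lines each).
my $small = join '', map { "id$_ small$_\n" } 1 .. 100;
my $large = join '', map { "id$_ large$_\n" } 51 .. 150;

open my $small_fh, '<', \$small or die "Cannot open small data: $!";
open my $large_fh, '<', \$large or die "Cannot open large data: $!";

my $lookup  = build_lookup($small_fh);
my @matches = matching_lines($large_fh, $lookup);

printf "%d keys indexed, %d matching lines\n",
    scalar(keys %$lookup), scalar @matches;
```

To try it on real data, replace the two in-memory opens with opens on your own file names. Memory use then grows with the number of lines in the smaller file only, which is why the size of that hash is the thing to watch when you scale up.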
In reply to Re^4: Read two files and print
by Marshall
in thread Read two files and print
by sandy1028