While I sit here waiting for my script to run, I wonder: out there in the rest of the Perl industry, what's considered a large data processing job?
My case today: I have a 650 MB flat text file with 15 million lines of 1s and 0s. I have another flat text file with 600 lines, each line holding 7 positions of 1s and 0s. Every character position of every line in the first file has to be checked against every character position of every line in the second file. That is, 15 million lines of text to parse into characters, then 15 million * 600 * 7 = 63 billion comparisons to make, writing the matches (normally around 100 million) to about 40 thousand "match position list" files on disk.
There are no patterns to look for. I have been unable to find a better method than brute force character-wise comparison. It takes about 8 hours to run.
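For concreteness, here is a minimal sketch of what such a brute-force character-wise pass might look like. This is my reconstruction from the description above, not the actual script: the assumption is that each line of the big file has (at least) 7 characters of 1s and 0s like the pattern lines, and the per-(pattern, position) output filenames are made up for illustration.

```perl
use strict;
use warnings;

# Brute-force inner loop: compare one line of the big file against every
# pattern line, character by character. Returns a list of
# [pattern_index, position] pairs that matched.
sub match_positions {
    my ( $line, $patterns ) = @_;   # $patterns: arrayref of arrayrefs of chars
    my @chars = split //, $line;
    my @matches;
    for my $p ( 0 .. $#$patterns ) {
        for my $pos ( 0 .. $#{ $patterns->[$p] } ) {
            push @matches, [ $p, $pos ]
                if defined $chars[$pos]
                && $chars[$pos] eq $patterns->[$p][$pos];
        }
    }
    return @matches;
}

# Driver: stream the big file line by line (never slurp 650 MB), and
# append each match's line number to its "match position list" file.
# In a real run you would cache filehandles rather than reopening one
# per match -- this is the slow-but-obvious version.
sub scan_file {
    my ( $big_file, $patterns ) = @_;
    open my $fh, '<', $big_file or die "open $big_file: $!";
    while ( my $line = <$fh> ) {
        chomp $line;
        for my $m ( match_positions( $line, $patterns ) ) {
            my ( $p, $pos ) = @$m;
            # Hypothetical naming scheme for the ~40,000 output files.
            open my $out, '>>', "match_p${p}_pos${pos}.txt"
                or die "append match_p${p}_pos${pos}.txt: $!";
            print {$out} "$.\n";    # $. is the current input line number
            close $out;
        }
    }
    close $fh;
}
```

At 63 billion comparisons, the per-character `eq` and the array indexing dominate; streaming the file and pre-splitting the 600 patterns once (outside the loop) keeps memory flat, but the work itself is irreducible without exploiting structure in the data.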
And I've sometimes wondered: is this outrageous, or just an everyday job for some?
Thanks
In reply to What is a "big job" in the industry? by punch_card_don