Re: Re: Re: Compare2Files LinebyLine

Replies are listed 'Best First'.
Re: Re: Re: Re: Compare2Files LinebyLine by zoot (Initiate) on Feb 17, 2003 at 20:25 UTC
Hi folks. Do you have any suggestions for comparing two files line by line without loading all the lines into memory? I'm trying to compare two files that are each over 300MB. My system doesn't have enough memory to load every line into a hash. I've tried the readline approach, but it takes forever to run. Unfortunately, I can't load the data into a database either, not even a Berkeley DB. Any ideas would be appreciated.
Re: Re: Re: Re: Re: Compare2Files LinebyLine by BrowserUk (Patriarch) on Feb 17, 2003 at 21:21 UTC
There are ways of approaching the problem, but you need to state what you are looking for in the comparison. Do you want to know which lines matched, or which ones didn't? Are the files in a similar sequence, with just additional, deleted, or changed lines? Or do you need to know whether any line in one file appears anywhere in the other? Depending on your answers, an appropriate algorithm may be forthcoming.
Examine what is said, not who speaks. The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.
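For the simplest of the cases above (both files in the same sequence, and you only want to know which line positions differ), a constant-memory comparison can be sketched roughly as follows. This is an illustration in Python rather than Perl, and the function name `differing_lines` and its arguments are made up for the example; the same loop translates directly to two filehandles read in lockstep.

```python
from itertools import zip_longest


def differing_lines(path_a, path_b):
    """Yield (line_number, line_a, line_b) for each position where the files differ.

    Hypothetical helper for illustration: it assumes the two files are meant
    to line up position by position.
    """
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        # File objects are lazy iterators, so only the current line of each
        # file is held in memory, regardless of file size.
        for number, (a, b) in enumerate(zip_longest(fa, fb), start=1):
            # zip_longest pads the shorter file with None, so extra trailing
            # lines in the longer file are reported as differences too.
            if a != b:
                yield number, a, b
```

Note that this only fits the "same sequence" case: a single inserted or deleted line shifts every later position and makes the rest of the file look different. The "does this line appear anywhere in the other file" case needs a different approach, which is why the answers to the questions above matter.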
Re: Re: Re: Re: Compare2Files LinebyLine by thesundayman (Novice) on Sep 27, 2001 at 16:06 UTC
Right as always :-)