in reply to Re: Re: Compare2Files LinebyLine
in thread Compare2Files LinebyLine

The sorting and the splicing add complexity that my version doesn't have (the sort alone is O(n log n)), so yours still costs a lot more than O(n).

-- Randal L. Schwartz, Perl hacker

Re: Re: Re: Re: Compare2Files LinebyLine
by zoot (Initiate) on Feb 17, 2003 at 20:25 UTC
    Hi Folks. Do you guys happen to have any suggestions for comparing 2 files line by line that don't involve loading all the lines into memory? I'm trying to compare two files that are each over 300MB in size. My system doesn't have enough memory to handle loading all the file lines into a hash. I've tried the readline approach but it takes forever to run. Unfortunately, I'm not able to load the data into a database either - even a Berkeley DB. Any ideas would be appreciated.

      There are ways of approaching the problem, but you need to state what it is that you are looking for in the comparison.

      Do you want to know which lines matched or which ones didn't?

      Are the files in a similar sequence with just additional lines or deleted or changed lines? Or do you need to know if any line in one file appears anywhere in the other?

      Depending on your answers, an appropriate algorithm may be forthcoming.
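
As one illustration of why those answers matter: if both files happen to be sorted in the same order (or can be sorted on disk first, for example with an external sort utility), the comparison can be done as a merge that holds only one line from each file in memory at a time. The following is a minimal sketch under that assumption, written in Python for brevity; the name `diff_sorted` and the `<` / `>` markers are made up for this example and do not come from any post in the thread. Duplicate lines are treated as separate entries, and the ordering is plain string comparison, so the files would need to be sorted the same way (byte order for ASCII data).

```python
import io
from typing import Iterator, Optional, TextIO, Tuple


def _next_line(fh: TextIO) -> Optional[str]:
    """Return the next line without its newline, or None at end of file."""
    line = fh.readline()
    return None if line == "" else line.rstrip("\n")


def diff_sorted(a: TextIO, b: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield ('<', line) for lines only in a and ('>', line) for lines only in b.

    ASSUMPTION: both inputs are already sorted in the same order.
    Only one line per file is held in memory at any time.
    """
    la, lb = _next_line(a), _next_line(b)
    while la is not None and lb is not None:
        if la == lb:
            # Present in both files: advance both.
            la, lb = _next_line(a), _next_line(b)
        elif la < lb:
            # la sorts first, so it cannot appear later in b.
            yield ("<", la)
            la = _next_line(a)
        else:
            yield (">", lb)
            lb = _next_line(b)
    # Whatever remains in either file has no counterpart in the other.
    while la is not None:
        yield ("<", la)
        la = _next_line(a)
    while lb is not None:
        yield (">", lb)
        lb = _next_line(b)


if __name__ == "__main__":
    # Small in-memory stand-ins for the two large files.
    left = io.StringIO("apple\nbanana\ncherry\n")
    right = io.StringIO("banana\ncherry\ndate\n")
    for marker, line in diff_sorted(left, right):
        print(marker, line)
```

With real files the two handles would come from `open(path)` instead of `io.StringIO`. If the files cannot be sorted, or if line position matters, this approach does not apply and a different algorithm is needed.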


      Examine what is said, not who speaks.

      The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Re: Re: Re: Re: Compare2Files LinebyLine
by thesundayman (Novice) on Sep 27, 2001 at 16:06 UTC
    Right as always :-)