wenD has asked for the wisdom of the Perl Monks concerning the following question:

I need to begin sending data to a business partner as delta files. Previously, I've been sending them all available data on a monthly basis. But now, they would like to receive data only for the items that have changed.

My plan is to output a full file and compare that to last month's file. Lines from $current_file not found in $previous_file would be written to $delta_file.
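
The plan above can be sketched roughly as follows. This is a minimal illustration in Python rather than Perl, and it assumes the previous file fits in memory and that one line equals one record. The names `delta_lines` and `write_delta` are hypothetical, not taken from any module.

```python
def delta_lines(previous_lines, current_lines):
    """Yield records from current_lines that do not appear in previous_lines."""
    # Strip the trailing newline so a missing newline on the last line
    # of either file does not produce a false difference.
    seen = {line.rstrip("\n") for line in previous_lines}
    for line in current_lines:
        record = line.rstrip("\n")
        if record not in seen:
            yield record


def write_delta(previous_file, current_file, delta_file):
    """Write new or changed records from current_file to delta_file."""
    with open(previous_file, encoding="utf-8") as prev, \
         open(current_file, encoding="utf-8") as cur, \
         open(delta_file, "w", encoding="utf-8") as out:
        for record in delta_lines(prev, cur):
            out.write(record + "\n")
```

Note that this only captures new and changed lines. Records that were deleted since last month would need a second pass in the other direction.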

I have some ideas on how to go about it, but since my initial searches for "delta file" on CPAN and Perl Monks didn't turn up anything obvious, I thought I would throw this out for discussion.

Replies are listed 'Best First'.
Re: What is the best way to produce a delta data file?
by gamache (Friar) on Nov 13, 2007 at 17:55 UTC
    I can't think of an easier solution than diff. It's a Unix utility, not Perl-based.
Re: What is the best way to produce a delta data file?
by Your Mother (Archbishop) on Nov 13, 2007 at 18:13 UTC

    I use Algorithm::Diff for revision history in web apps. There are related modules you'll find from the POD.

Re: What is the best way to produce a delta data file?
by moritz (Cardinal) on Nov 13, 2007 at 17:57 UTC
    You should really look at the Unix command diff. It produces deltas of text files, and you can apply them with patch. Searching for diff on CPAN will turn up some Perl implementations and interfaces.

    Alternatively, you could use some kind of version control system like Subversion (svn) or Git.

Re: What is the best way to produce a delta data file?
by KurtSchwind (Chaplain) on Nov 13, 2007 at 19:34 UTC

    I see others have mentioned 'diff', which I agree is the way to go. There is even a Windows version available via Cygwin.

    The major caveat is that you'll want to sort each file before diffing. diff compares lines in order, so on unsorted files a record that merely moved to a different position shows up as a change.
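
To illustrate the sorting caveat, here is a small sketch in Python using the standard `difflib` module as a stand-in for the diff utility. The helper name `added_lines` and the sample records are made up for the example.

```python
import difflib


def added_lines(previous, current):
    """Return the lines an order-sensitive, line-by-line diff reports as added."""
    return [
        line[2:]  # drop the two-character "+ " marker
        for line in difflib.ndiff(previous, current)
        if line.startswith("+ ")
    ]


# Hypothetical data: the same three records, exported in a different order.
previous = ["a,1", "b,2", "c,3"]
current = ["c,3", "a,1", "b,2"]

unsorted_delta = added_lines(previous, current)
sorted_delta = added_lines(sorted(previous), sorted(current))
```

Without sorting, the moved record is reported as an addition even though nothing changed. After sorting both inputs, the reported delta is empty.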

    --
    I used to drive a Heisenbergmobile, but every time I looked at the speedometer, I got lost.
Re: What is the best way to produce a delta data file?
by tilly (Archbishop) on Nov 13, 2007 at 23:16 UTC
    Where are you getting data from? A database?

    If so, then you might want to look at whether you can generate an incremental data set directly. Otherwise, you can use diff as others have suggested.
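
One way the database route might look, sketched in Python with SQLite. It assumes each row carries a last-modified timestamp, which the original post does not confirm. The table `items`, the column `updated_at`, and the function `changed_since` are all hypothetical.

```python
import sqlite3


def changed_since(conn, last_export):
    """Return rows modified after the previous export (ISO date string)."""
    cursor = conn.execute(
        "SELECT id, name, updated_at FROM items "
        "WHERE updated_at > ? ORDER BY id",
        (last_export,),
    )
    return cursor.fetchall()


# Hypothetical in-memory data standing in for the real source database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        (1, "widget", "2007-09-30"),
        (2, "gadget", "2007-10-20"),
        (3, "gizmo", "2007-11-02"),
    ],
)

delta = changed_since(conn, "2007-10-13")
```

This avoids producing and comparing two full files, but it only works if the source reliably records when each row last changed.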