in reply to Re: Re: RFC: proposed new module VCS::Lite
in thread RFC: proposed new module VCS::Lite

I don't tend to think of files as lists of lines. The line separator is totally arbitrary and I always slurp files into scalars. I think you should work with scalars and just split them into arrays when you need to feed them to Algorithm::Diff.
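
A minimal sketch of that approach, using only core Perl. The names `slurp` and `to_lines` are made up for illustration and are not part of any proposed VCS::Lite interface.

```perl
use strict;
use warnings;

# Slurp a whole file into one scalar by locally undefining $/.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    local $/;                 # no record separator: read everything at once
    my $text = <$fh>;
    close $fh;
    return defined $text ? $text : '';
}

# Split into lines only at the point where a diff needs a list.
# Splitting on /^/ keeps each line's trailing newline attached.
sub to_lines {
    my ($text) = @_;
    my @lines = split /^/, $text;
    return @lines;
}
```

The arrays returned by `to_lines` could then be handed to Algorithm::Diff by reference, for example `diff(\@old, \@new)`.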

I also don't think that working with arrays will make a difference to binary files. Binary diff utilities do exist, and I believe they just treat the file as a stream of bytes and use offsets.

Re^4: RFC: proposed new module VCS::Lite
by Aristotle (Chancellor) on Dec 18, 2002 at 20:28 UTC
    Binary diff is trivially a special case of line-oriented diff if you just change the definition of a line from "an arbitrarily long sequence of bytes terminated by the line separator" to "a single byte".
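
To make that concrete, here is a rough sketch in which only the splitting step changes between the two cases. The helper `units` and its mode names are hypothetical, and it assumes the data is a byte string (read with binmode), so that one character is one byte.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a scalar into the list of units a diff compares.
sub units {
    my ($data, $mode) = @_;
    my @units = $mode eq 'byte'
        ? split(//, $data)     # every byte is its own "line"
        : split(/^/, $data);   # newline-terminated lines, newline kept
    return @units;
}

my @lines = units("ab\ncd\n", 'line');
my @bytes = units("ab\ncd\n", 'byte');
```

Either list can be fed to the same sequence-diff routine. The cost is that a byte-level list gets very long for large files.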

    Makeshifts last the longest.

Re: Re: Re: Re: RFC: proposed new module VCS::Lite
by rinceWind (Monsignor) on Dec 19, 2002 at 10:11 UTC
    I guess my VMS background is showing here (the VMS concept of a file is a record stream, not a byte stream). Besides, the diamond <> operator works on lines unless you change $/.
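
For reference, a small sketch of how `$/` changes what a read returns. It uses an in-memory filehandle so it runs without any files, and the sample data is invented.

```perl
use strict;
use warnings;

my $data = "one\ntwo\nthree\n";

# Default: $/ is "\n", so each read returns one line.
open my $by_line, '<', \$data or die "open failed: $!";
my @lines = <$by_line>;
close $by_line;

# $/ set to a reference to an integer: fixed-size records of that many bytes.
open my $by_rec, '<', \$data or die "open failed: $!";
my @records = do { local $/ = \4; <$by_rec> };
close $by_rec;

# $/ undefined: the whole stream comes back as a single record.
open my $whole, '<', \$data or die "open failed: $!";
my $all = do { local $/; <$whole> };
close $whole;
```

The fixed-size form is probably the closest Perl gets to a record-stream view of a file.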

    In terms of binary files, I was thinking of RTF or Word documents - something for which insertion and deletion could be valid operations.

    You are right about binary files generally, as without insertion/deletion, Algorithm::Diff is not appropriate. However, there's no reason not to use the same API (objects, diff, patch, merge) to do in-situ comparison. This can be achieved by subclassing VCS::Lite and providing some new methods.
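
As a rough idea of what the in-situ comparison itself might look like, here is a sketch. `insitu_diff` is a hypothetical helper, not an existing VCS::Lite method, and it assumes both inputs are byte strings of equal length.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two byte strings position by position and
# return [offset, old_byte, new_byte] for every position that differs.
sub insitu_diff {
    my ($old, $new) = @_;
    die "in-situ comparison needs equal lengths\n"
        unless length($old) == length($new);

    my $xor = $old ^ $new;    # string XOR: "\0" wherever the bytes agree
    my @changes;
    while ($xor =~ /[^\0]/g) {
        my $off = pos($xor) - 1;
        push @changes, [ $off, substr($old, $off, 1), substr($new, $off, 1) ];
    }
    return @changes;
}
```

A subclass could presumably wrap something like this behind the same diff method name, with patch simply writing the new bytes back at the recorded offsets.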