in reply to Applying diff partially using perl

mhd:

I think that if I had to automate that task, I'd generate the diffs automatically, then write a chain of small perl scripts that implement your rules by editing the diff patches. So rather than trying to build one big, complicated program, generate your diff patches first. Then, as you suggest in your idea, figure out how to edit the diffs to accomplish your rules.

While Text::Diff appears to be a nice package, you might want to look at GNU diff, as it has a nice set of options. You can fine-tune its output, and even perform some filtering up-front, such as:

    -I RE  --ignore-matching-lines=RE
        Ignore changes whose lines all match RE.

For the rule to not edit the bytecodes: I'd scan the file containing the bytecodes for the beginning and ending lines of the bytecode array, and delete any diff edits that fall between those lines. (If you have control of the input file, you might add BEGIN and END comments to simplify finding the code blocks you don't want patch to touch; otherwise, you may have to create a set of regexes.)
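A rough sketch of that bytecode rule, assuming the diff is in unified format, covers a single file, and the array is wrapped in a pair of marker comments. The BEGIN BYTECODE / END BYTECODE markers and both function names are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical marker comments; adjust to whatever your source really uses.
my $BEGIN_RE = qr{/\*\s*BEGIN BYTECODE\s*\*/};
my $END_RE   = qr{/\*\s*END BYTECODE\s*\*/};

# Return the (first, last) 1-based line numbers of the protected block
# in the old file, or an empty list if no complete block is found.
sub protected_range {
    my @lines = @_;
    my ($first, $last);
    for my $i (0 .. $#lines) {
        if (!defined $first) {
            $first = $i + 1 if $lines[$i] =~ $BEGIN_RE;
        }
        elsif ($lines[$i] =~ $END_RE) {
            $last = $i + 1;
            last;
        }
    }
    return defined $last ? ($first, $last) : ();
}

# Drop every hunk of a single-file unified diff whose old-file range
# overlaps lines $first..$last. Header lines before the first hunk are kept.
sub drop_protected_hunks {
    my ($diff, $first, $last) = @_;
    my ($out, $keep) = ('', 1);
    for my $line (split /^/, $diff) {
        if ($line =~ /^@@ -(\d+)(?:,(\d+))? \+/) {
            my $start = $1;
            my $count = defined $2 ? $2 : 1;   # an omitted count means 1
            my $end   = $start + ($count ? $count - 1 : 0);
            $keep = ($end < $first || $start > $last) ? 1 : 0;
        }
        $out .= $line if $keep;
    }
    return $out;
}
```

A hunk is dropped whole, so an unrelated edit sharing a hunk with the array goes with it. Generating the diff with zero context lines (diff -U0) should keep the hunks small enough that this rarely matters.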

To add new prototypes from one file without deleting ones that are removed, simply remove any delete edits in the diff for those particular header files. So for this, you might have a list of files for which you remove all deletes.
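One wrinkle: in a unified diff you can't just throw the "-" lines away, because the hunk header counts would no longer match the body. Here is a minimal sketch that turns each deletion into a context line and rewrites the headers instead. It assumes unified format; keep_deleted_lines is a made-up name, and you'd call it only for the header files on your list:

```perl
use strict;
use warnings;

# Rewrite a unified diff so that it only adds lines: every "-" line becomes
# a context line, and each hunk header is adjusted to the new line counts.
sub keep_deleted_lines {
    my ($diff) = @_;
    my @out;
    my ($old_left, $new_left) = (0, 0);   # body lines the current hunk still owes
    my $shift = 0;                        # deletions kept so far in this file
    my $hunk;                             # header fields of the hunk being read

    for my $line (split /^/, $diff) {
        if ($old_left > 0 || $new_left > 0) {           # inside a hunk body
            if    ($line =~ s/^-/ /) { $old_left--; $hunk->{kept}++ }
            elsif ($line =~ /^\+/)   { $new_left-- }
            elsif ($line !~ /^\\/)   { $old_left--; $new_left-- }   # context
            push @out, $line;
            if ($old_left <= 0 && $new_left <= 0) {     # hunk complete: fix header
                $out[ $hunk->{at} ] = sprintf "@@ -%d,%d +%d,%d @@%s",
                    @{$hunk}{qw(os oc)}, $hunk->{ns} + $shift,
                    $hunk->{nc} + $hunk->{kept}, $hunk->{tail};
                $shift += $hunk->{kept};
            }
            next;
        }
        if ($line =~ /^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@(.*)/s) {
            $hunk = { os => $1, oc => $2 // 1, ns => $3, nc => $4 // 1,
                      tail => $5, kept => 0, at => scalar @out };
            $hunk->{ns}++ if $hunk->{nc} == 0;   # an empty range names the line before it
            ($old_left, $new_left) = ($hunk->{oc}, $hunk->{nc});
        }
        elsif ($line =~ /^--- /) {
            $shift = 0;                          # a new file starts here
        }
        push @out, $line;
    }
    return join '', @out;
}
```

Counting down the lines promised by each hunk header, rather than looking for the next "@@" or "---", avoids mistaking a deleted line that itself begins with "-- " for a file header.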

Finally, regarding your question "Or is this attempt-to-automatize thing impossible to be done by machine?": Many times, text is just too free-form to automate everything. But getting a 90% solution is usually quick and easy. Then you need only raise an alert when the program doesn't know what to do for a specific case. I tend to do jobs like this iteratively. As I do a job, I try to find a way to automate either (a) the most annoying case, or (b) the simplest win. Then on the next iteration, I again find something annoying to automate...and so on. In just a few iterations, you'll probably get enough cases covered that you'll find weeks (months...) between alerts.

Be sure to write your code and/or documentation very clearly! Projects like this tend to be the "pick it up and put it down" sort. Because you'll edit the program only infrequently, you need to make sure each change doesn't break any of the existing rules. I find unit tests (in the mode of Test Driven Development) to be very helpful here. That way, you can make sure you don't break edits you've already handled.
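To give a flavour of what I mean, here is a tiny Test::More sketch that pins down one rule. The keep_deletions function is a hypothetical stand-in for whichever rule you have already implemented; the point is only that each case you've handled gets a test that keeps passing:

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical rule under test: inside a hunk body, keep the lines the
# diff wanted to delete by turning them into context lines.
sub keep_deletions {
    my @body = @_;
    s/^-/ / for @body;
    return @body;
}

is_deeply(
    [ keep_deletions("-int old(void);\n") ],
    [ " int old(void);\n" ],
    'a deleted prototype is kept as context',
);

is_deeply(
    [ keep_deletions("+int new(void);\n", " int same(void);\n") ],
    [ "+int new(void);\n", " int same(void);\n" ],
    'additions and context pass through untouched',
);

done_testing();
```

Each time an alert turns up a new case, add a test for it before touching the code.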

I hope this helps...

...roboticus

Re^2: Applying diff partially using perl
by mhd (Novice) on Sep 06, 2008 at 12:44 UTC
    Roboticus, thanks for your reply...

    I thought GNU diff would be my saviour for 60-70% of the solution. Boy, was I wrong... the -I feature is almost useless. CMIIW, but I think GNU diff uses damn primitive POSIX basic regex (BRE). Maybe those GNU programmers thought "ah, no one needs this, we'll just put it in as a nice-to-have-but-crippled option". Well, guess what? Maybe 99.5% don't need it. But I'm part of the 0.5%.
    I can't get the regex pattern right for some quite simple text, let alone complex text.

    Here is my first pattern attempt.

    Full line:
    {(CONST method_info*)0x2f05/*comment*/,0x0}

    The part of the line that I actually want to match:
    {(CONST method_info*)0x2f05

    The most that I can do using POSIX BRE:

    -I "^[:space:]*{(CONST method_info\*)0x[:xdigit:]\{1,8\}"
    My RE above still does not match; the text is still not ignored in the diff.

    Any suggestions for improving my RE will be highly, highly, highly appreciated. It would also help if someone could explain this in ordinary English, because English isn't my native language and my brain is too slow to comprehend the documentation's meaning.