Aldebaran has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

Have need for a quick script, but it's just a bit too complicated for me. What I seek is a script that renumbers all these paragraphs correctly and makes the spacing uniform between paragraphs.

Here's the target text. I would expect it to come to the script by means of ARGV:

$ pwd
/home/bob/Desktop
$ ls 2.lease.txt
2.lease.txt
$ cat 2.lease.txt
...
21. ASSIGNMENT: RESIDENT agrees not to transfer, assign or sublet the premises or any part thereof.
22. PARTIAL INVALIDITY: Nothing contained in this Agreement shall be construed as waiving any of the OWNER'S or RESIDENT'S rights under the law. If any part of this Agreement shall be in conflict with the law, that part shall be void to the extent that it is in conflict, but shall not invalidate this Agreement nor shall it affect the validity or enforceability of any other provision of this Agreement.
32. RECEIPT OF AGREEMENT: The undersigned RESIDENTS have read and understand this Agreement and hereby acknowledge receipt of a copy of this Barter Agreement.
RESIDENT'S Signature ___________________________________________________ Date__________________
OWNER'S or Agent's Signature ____________________________________________ Date__________________
$

The match criteria are that it matches an integer followed by a period. I want them re-numbered, beginning with 1. Between paragraphs should be only one blank line.

Thanks for your comment,

Re: a simple match and replace script
by talexb (Chancellor) on Aug 08, 2017 at 19:43 UTC

    It sounds like what you want to do is specify that the input record separator be any number of blank lines. Then, for each record, replace the number present with a variable that gets incremented after each paragraph. Finally, output the record, followed by a blank line.
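    A minimal sketch of that idea, not a finished script. It relies on Perl's paragraph mode (setting $/ to the empty string), which makes each read return one paragraph and treats any run of empty lines as a single separator. The sub name renumber_paragraphs and the sample text are made up for illustration, and the sketch assumes that "blank" lines are truly empty; a line holding only spaces would not split paragraphs in this mode.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): renumber the clauses in $text
# and follow every paragraph with exactly one blank line.
sub renumber_paragraphs {
    my ($text) = @_;
    open my $in, '<', \$text or die "Cannot read in-memory text: $!";

    # Paragraph mode: one record per paragraph, runs of empty lines collapse.
    local $/ = "";

    my $clause_number = 1;
    my $result        = '';
    while (my $para = <$in>) {
        $para =~ s/\n+\z//;    # strip the newlines that ended the record
        # Renumber only paragraphs that start with an integer and a period.
        $clause_number++ if $para =~ s/\A\d+\./${clause_number}./;
        $result .= "$para\n\n";    # the record, then one blank line
    }
    close $in;
    return $result;
}

# Made-up sample with uneven spacing and out-of-sequence numbers.
my $sample = "Title\n\n\n21. First clause.\n\n22. Second clause.\n\n\n\n"
           . "32. Third clause.\nSignature line\n";
print renumber_paragraphs($sample);
```

    To work on a real file instead of the sample, the text could be slurped from the file named on the command line and passed to the same sub.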

    Alex / talexb / Toronto

    Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

      But not every newline implies a new paragraph, as in #32, where we have extra space for pen-and-paper signatures. It might look better on the page to have 2 newlines between numbered terms, which might give room for changes and initials.

        OK -- I'd start with something that works (puts one line between every paragraph), then improve that. For example, add two lines after each numbered paragraph -- otherwise, just add one line.
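        One way that refinement might look, as a rough sketch. It assumes the paragraphs have already been split out and carry no trailing newlines; the sub name join_with_spacing and the sample paragraphs are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: join already-split paragraphs, following a numbered
# clause with two blank lines and any other paragraph with one.
sub join_with_spacing {
    my @paragraphs = @_;
    my $out = '';
    for my $para (@paragraphs) {
        # A numbered clause starts with an integer followed by a period.
        my $blank_lines = $para =~ /\A\d+\./ ? 2 : 1;
        $out .= $para . "\n" . ("\n" x $blank_lines);
    }
    return $out;
}

print join_with_spacing("Title", "1. First clause.", "Signature line");
```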

        It's not a requirement (ever) to get right to the golden solution on the first try. Get something that works, even if there's room for improvement. Then .. improve what you have. Repeat as necessary.

        Alex / talexb / Toronto

Re: a simple match and replace script
by Athanasius (Archbishop) on Aug 09, 2017 at 03:48 UTC

    Hello Datz_cozee75,

    If your document isn’t too long, this approach should do what you want:

    use strict;
    use warnings;
    use Data::Dump;

    my $first         = 1;    # true until the first line has been stored
    my $clause_number = 1;
    my @paragraphs;           # each element is an array ref of lines

    while (my $line = <DATA>)
    {
        if ($line =~ / ^ \d+ \. /x)
        {
            # A numbered clause starts a new paragraph: renumber it
            $line =~ s/ ^ \d+ /$clause_number/x;
            push @paragraphs, [ $line ];
            ++$clause_number;
            $first = 0;
        }
        elsif ($first)
        {
            # Text before the first numbered clause (e.g. a title)
            push @paragraphs, [ $line ];
            $first = 0;
        }
        else
        {
            # Continuation of the current paragraph
            push @{ $paragraphs[-1] }, $line;
        }
    }

    # Remove trailing blank lines from each paragraph
    for my $p (@paragraphs)
    {
        pop @$p while @$p && $p->[-1] eq "\n";
    }

    dd \@paragraphs;

    __DATA__
    Title
    21. ASSIGNMENT: RESIDENT agrees not to transfer, assign or sublet the premises or any part thereof.
    22. PARTIAL INVALIDITY: Nothing contained in this Agreement shall be construed as waiving any of the OWNER'S or RESIDENT'S rights under the law. If any part of this Agreement shall be in conflict with the law, that part shall be void to the extent that it is in conflict, but shall not invalidate this Agreement nor shall it affect the validity or enforceability of any other provision of this Agreement.
    32. RECEIPT OF AGREEMENT: The undersigned RESIDENTS have read and understand this Agreement and hereby acknowledge receipt of a copy of this Barter Agreement.
    RESIDENT'S Signature ___________________________________________________ Date__________________
    OWNER'S or Agent's Signature ____________________________________________ Date__________________

    Output:

    Then you can output @paragraphs with as many blank lines between each paragraph as you choose.
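    For instance, a small output routine along these lines might do. This is only a sketch: render_paragraphs is a made-up name, the sample data is invented, and it assumes the same structure as @paragraphs, that is, an array of array refs whose lines still end in a newline.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an array of array refs of lines into one string,
# with $blank_lines empty lines between consecutive paragraphs.
sub render_paragraphs {
    my ($paragraphs, $blank_lines) = @_;
    my $separator = "\n" x $blank_lines;
    return join $separator, map { join '', @$_ } @$paragraphs;
}

# Invented sample data in the same shape as @paragraphs.
my @paragraphs = (
    [ "Title\n" ],
    [ "1. First clause,\n", "continued on a second line.\n" ],
    [ "2. Second clause.\n" ],
);
print render_paragraphs(\@paragraphs, 1);
```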

    Update: Changed $paragraph to $clause_number in line with talexb’s good advice below.

    Hope that helps,

    Athanasius <°(((>< contra mundum
    Iustus alius egestas vitae, eros Piratica,

      Quick comment: I would replace the scalar $paragraph with something a little less confusing like $clause_number, since you already have an array called @paragraphs.

      Alex / talexb / Toronto