Beginner question about search and replace

hilbert has asked for the wisdom of the Perl Monks concerning the following question:

Re: Beginner question about search and replace
by moritz (Cardinal) on Sep 16, 2011 at 10:10 UTC

First let me explain why your solution doesn't work as expected:

s///g starts each match where the previous match left off, or at the start of the string for the first match.

The first match consumes 1 2, so the next match starts after the 2, and the first \d it can find is the 3. Matches never overlap, so the space between 2 and 3 is skipped.
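
A minimal sketch of that behaviour. The question's code is not reproduced in this thread, so both the pattern (assumed to be something like s/(\d)\s+(\d)/$1;$2/g) and the sample string are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical sample input; the original question's data is not shown.
my $pairs = "1 2 3 4";

# Assumed form of the original attempt: each match consumes both digits.
$pairs =~ s/(\d)\s+(\d)/$1;$2/g;

# The first match consumes "1 2", so the next search starts after the 2
# and the space between 2 and 3 is never replaced.
print "$pairs\n";   # prints 1;2 3;4
```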

A possible fix is to not match the second digit:

s/(\d)\s+/$1;/g;

This produces the output you want. If it's important that no substitution happens after a number but before a letter, you can use

s/(\d)\s+(?=\d)/$1;/g;

The (?=\d) looks for a digit, but it doesn't consume it (search for look-ahead in perlre).
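
To see the difference between the two forms side by side, here is a small sketch. The input string and variable names are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical input mixing digits and a letter.
my $input = "1 2 3 x 4 5";

# Without the look-ahead: whitespace after any digit is replaced,
# including the whitespace between 3 and x.
(my $plain = $input) =~ s/(\d)\s+/$1;/g;

# With the look-ahead: only whitespace between two digits is replaced.
(my $guarded = $input) =~ s/(\d)\s+(?=\d)/$1;/g;

print "$plain\n";     # prints 1;2;3;x 4;5
print "$guarded\n";   # prints 1;2;3 x 4;5
```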

Update: Kudos for supplying your code, actual output and expected output. It makes answering your question easy, and can't be taken for granted. Welcome to perlmonks!

Perl 6 - second systems done right

Re^2: Beginner question about search and replace

by hilbert (Acolyte) on Sep 16, 2011 at 11:47 UTC

Thanks a lot for your absolutely clear explanation and solution.

Re^2: Beginner question about search and replace

by Anonymous Monk on Sep 16, 2011 at 17:36 UTC

The first form will result in a trailing ';' when the line ends with a digit, because newline is part of the whitespace character class (\s).
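
A quick sketch of that effect, assuming a line that was read with its newline still attached (the input is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical line as read from a file, newline still attached.
my $line = "1 2 3\n";

# The newline after the 3 matches \s+, so it is replaced as well.
(my $first_form = $line) =~ s/(\d)\s+/$1;/g;

# The look-ahead form leaves it alone: no digit follows the newline.
(my $lookahead_form = $line) =~ s/(\d)\s+(?=\d)/$1;/g;

print "[$first_form]\n";       # prints [1;2;3;]
print "[$lookahead_form]\n";   # the trailing newline is preserved
```

Calling chomp on the line before the substitution is another way to avoid the trailing ';' with the first form.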

-Greg

Re: Beginner question about search and replace
by choroba (Cardinal) on Sep 16, 2011 at 10:11 UTC

s/(?<=[0-9]) (?=[0-9])/;/g

See perlre for look-behind and look-ahead assertions.
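
A small sketch of how this pattern behaves. Note that it matches one literal space rather than \s+, so a run of two spaces between digits is left untouched. The input is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical input: single space, double space, and a letter.
my $spaced = "1 2  3 x 4";

# Only a single space with a digit directly on each side is replaced;
# neither digit is consumed, so no capture groups are needed.
(my $result = $spaced) =~ s/(?<=[0-9]) (?=[0-9])/;/g;

print "$result\n";   # prints 1;2  3 x 4
```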

Re^2: Beginner question about search and replace

by hilbert (Acolyte) on Sep 16, 2011 at 11:59 UTC

Thanks a lot!

Re: Beginner question about search and replace
by luis.roca (Deacon) on Sep 16, 2011 at 12:30 UTC

As moritz explained, you were matching every pair of numbers when you wanted every number with a space following it.

Another way to replace every space following a number with a semicolon could be (untested):

s/(\d)(?:\s+)/$1;/g

The \d matches any single digit and, for ASCII input, is the same as [0-9]. As you learned in your attempt, the parentheses 'capture' that match and store it in memory so we can use it later. The second part of our match looks for at least one whitespace character following that digit but, because of the (?: ), doesn't store it in memory, since we're not planning to use it to build our replacement.
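
A minimal sketch of the difference between a capturing and a non-capturing group, using a made-up input string:

```perl
use strict;
use warnings;

# Hypothetical input: a digit followed by two spaces.
my $text = "7  x";

# In list context a successful match returns the captured groups.
my @with_capture    = $text =~ /(\d)(\s+)/;     # digit and whitespace
my @without_capture = $text =~ /(\d)(?:\s+)/;   # digit only

print scalar(@with_capture), "\n";      # prints 2
print scalar(@without_capture), "\n";   # prints 1
```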

Aside from the perldocs on regexes, if you're interested, you might like Mastering Regular Expressions by Jeffrey E.F. Friedl and/or Data Munging with Perl by Dave Cross.

UPDATE

Same day, 16.Sep.2011 :: 02:35:24 PM :: Changed s/(\d)(?:\s)+/$1;/g to s/(\d)(?:\s+)/$1;/g, following AnomalousMonk's suggestion.

"...the adversities born of well-placed thoughts should be considered mercies rather than misfortunes." — Don Quixote

Re^2: Beginner question about search and replace

by Kc12349 (Monk) on Sep 16, 2011 at 15:32 UTC

Why the (?: ) around the space character? It runs just fine as below.

s/(\d)\s+/$1;/g;

Re^3: Beginner question about search and replace

by luis.roca (Deacon) on Sep 16, 2011 at 15:45 UTC

"Why the (?: ) around the space character? It runs just fine as below."

It does work well and is less complex to explain. I use the (?: ) to show the practice of not capturing matches that won't be used in the replacement. In this specific example memory isn't going to be a problem because we're only dealing with a single string. But I personally like using it anyway, as a way to mark what I do and don't want to work with in the replacement.

Re^4: Beginner question about search and replace

by Kc12349 (Monk) on Sep 16, 2011 at 16:46 UTC

Re^3: Beginner question about search and replace

by AnomalousMonk (Archbishop) on Sep 16, 2011 at 18:30 UTC

I agree with luis.roca's use of (?:\s)+ in a pedagogic or self-documentary context as already explained above.

I would be inclined to quibble with the use of (?:\s)+ versus (?:\s+), especially in a pedagogic example. While these two expressions behave in exactly the same way in all respects AFAIU, the corresponding capturing expressions (\s)+ and (\s+) behave very differently as to the characters captured, and in an explanatory example this might, by suggestion or implication, lead to great confusion.
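
The difference between the two capturing forms can be sketched like this (the whitespace string is a made-up example):

```perl
use strict;
use warnings;

# Hypothetical run of three different whitespace characters.
my $ws = " \t\n";

# (\s+): the group wraps the whole repetition, so it captures the full run.
my ($whole_run) = $ws =~ /(\s+)/;

# (\s)+: the group itself is repeated, and only the last repetition is kept.
my ($last_only) = $ws =~ /(\s)+/;

print length($whole_run), "\n";   # prints 3
print length($last_only), "\n";   # prints 1
```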

Re: Beginner question about search and replace
by Anonymous Monk on Sep 16, 2011 at 19:13 UTC

This is an alternative to a look-behind:

s/\d\K\s+(?=\d)/;/g;
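
A short sketch of the \K form on a made-up input. \K requires Perl 5.10 or later; it keeps everything matched before it (here the digit) out of the part that gets replaced, so no capture group or $1 is needed:

```perl
use strict;
use warnings;
use 5.010;   # \K is available from Perl 5.10 onward

# Hypothetical input mixing digits and a letter.
my $input = "1 2 3 x 4 5";

# The digit must match, but \K excludes it from the replaced text;
# only the whitespace between two digits is turned into ';'.
(my $kept = $input) =~ s/\d\K\s+(?=\d)/;/g;

print "$kept\n";   # prints 1;2;3 x 4;5
```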
