perl -0pe 's/pattern1|pattern2/$1/gs'; apparently the "or" operator doesn't work as expected here.
The | alternation operator has pretty low precedence, so it kind of depends on what your expectations are :-) A common trap is to write something like /^foo|bar$/ and expect that to match only "foo" or "bar", when in fact it matches either ^foo (anything starting with "foo") or bar$ (anything ending in "bar") - the correct way to express that would have been /^(foo|bar)$/ or /^(?:foo|bar)$/.
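To make the precedence trap concrete, here is a small sketch that runs both patterns over a handful of made-up words. The word list and the helper name `matching` are only there for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the candidates that match the given pattern.
sub matching {
    my ($re, @candidates) = @_;
    return grep { $_ =~ $re } @candidates;
}

# Made-up sample words.
my @words = qw(foo bar foobar barfoo food rebar);

# ^ binds only to "foo" and $ only to "bar".
my @loose = matching(qr/^foo|bar$/, @words);

# The group makes both anchors apply to the whole alternation.
my @strict = matching(qr/^(?:foo|bar)$/, @words);

print "loose:  @loose\n";    # loose:  foo bar foobar food rebar
print "strict: @strict\n";   # strict: foo bar
```

Only "barfoo" is rejected by the ungrouped pattern, since it neither starts with "foo" nor ends with "bar".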
Based on your $1 in the replacement, I suspect you were doing something like s/f(o)o|b(a)r/$1/g and expecting the string "bar" to be turned into "a"? In that case, you need the "branch reset" pattern (?|...) (perlre): s/(?|f(o)o|b(a)r)/$1/g will replace "foo" with "o" and "bar" with "a".
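A quick way to see the difference is to run both substitutions on "foo" and "bar" side by side. This is only a sketch: the hash names are made up, and the plain version uses `$1 // ''` under /e purely to avoid an "uninitialized value" warning when $1 is undef. Branch reset needs Perl 5.10 or later.

```perl
use strict;
use warnings;

my (%plain, %reset);    # hypothetical result tables, keyed by input word

for my $word (qw(foo bar)) {
    # Without branch reset: (o) is group 1 and (a) is group 2,
    # so $1 is undef whenever the second alternative matched.
    (my $p = $word) =~ s{f(o)o|b(a)r}{ $1 // '' }ge;

    # With branch reset: each alternative numbers its groups from 1 again.
    (my $r = $word) =~ s{(?|f(o)o|b(a)r)}{$1}g;

    $plain{$word} = $p;
    $reset{$word} = $r;
}

printf "%-3s plain=[%s] reset=[%s]\n", $_, $plain{$_}, $reset{$_} for qw(foo bar);
# foo plain=[o] reset=[o]
# bar plain=[] reset=[a]
```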
Thank you, haukex, for the information about the correct use of the branch reset, and sorry for making you guess my incorrect code. It was (and is) a bit of a moot point, since I've been given multiple Perl solutions to my problem; however, for the record, my aforementioned failed attempt was to combine the ideas I had for what in my original post I described as the former and latter cases:
perl -0pe 's/(.*?\n*?([^ \n].*?)\n\n.*|.*?\n([^ \n].*?)\n\n.*)/$1/gs' file.txt
This simply returned the whole file, and I thought it had something to do with the nested parentheses altering what $1 refers to. When I rewrite it correctly, as per haukex's helpful advice:
perl -0pe 's/(?|.*?\n*?([^ \n].*?)\n\n.*|.*?\n([^ \n].*?)\n\n.*)/$1/gs' file.txt
it does return something, but that something is the desired output only in the case where the first line is non-indented. The reason, obviously, is that I was making a stupid mistake: the first alternative returns a match whether or not the first line of file.txt is indented, so that code is never going to return the output of:
perl -0pe 's/.*?\n([^ \n].*?)\n\n.*/$1/gs' file.txt
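For anyone who wants to reproduce this without a file, the sketch below applies both patterns to an in-memory string. The sample text is invented and merely stands in for a file.txt whose first line is indented.

```perl
use strict;
use warnings;

# Made-up input: indented first line, then the wanted line, then a blank line.
my $indented = "  indented first line\nheading\n\nbody text\n";

# Combined pattern: alternatives are tried left to right, and the first one
# can skip the leading spaces with .*? and so matches here as well.
(my $combined = $indented)
    =~ s/(?|.*?\n*?([^ \n].*?)\n\n.*|.*?\n([^ \n].*?)\n\n.*)/$1/gs;

# Second alternative on its own: it has to cross a newline before capturing.
(my $second_only = $indented) =~ s/.*?\n([^ \n].*?)\n\n.*/$1/gs;

print "combined:    [$combined]\n";
print "second only: [$second_only]\n";
```

With this input the combined pattern keeps "indented first line" plus "heading", while the second alternative alone keeps just "heading", which is why the second branch never gets a chance in the combined version.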
As I said, it's a moot point, but I thought this follow-up was worth mentioning in case someone who lands on this page learns from my stupid mistake.
Regards,
Maneesh
s/(?:foo|bar)/baz/;
(non-capturing group)
The way forward always starts with a minimal test.
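One possible shape for such a minimal test, using the core Test::More module. The sub name `replace_word` and the sample inputs are hypothetical; the substitution itself is the one shown above.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Hypothetical wrapper around the substitution under test.
sub replace_word {
    my ($text) = @_;
    $text =~ s/(?:foo|bar)/baz/;    # no /g, so only the first match is replaced
    return $text;
}

is(replace_word('foo'), 'baz', 'foo is replaced');
is(replace_word('bar'), 'baz', 'bar is replaced');
is(replace_word('qux'), 'qux', 'other text is left alone');
```

Starting from cases this small makes it easier to tell whether a surprise comes from the alternation itself or from the rest of a larger pattern.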