Regex to strip, keep, or add

bradcathey has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monasterians;

I'm trying to bring old code into conformity, in particular the way I call subroutines. I'm trying to convert any instances of &somecall to somecall(), preserving any passed values in parens.

while (<DATA>) {
   chomp;
   $_ =~ s/^&(\w+)([\(\)]*)/$1()/;
   print "$_\n";
}

__DATA__
&foobar
&foobar()
foobar()
&foobar("asdf")

which prints:

foobar()
foobar()
foobar()
foobar()"asdf")

Close, but no cigar. So, tried:

while (<DATA>) {
   chomp;
   $_ =~ s/^(&)(\w+(\([\"\'\$\w]*\))?)(\(\))$/$2()/;
   print "$_\n";
}

which printed:

&foobar
foobar()
foobar()
&foobar("asdf")

Again, close, but my &'s are back. What am I missing? Thanks!

UPDATE: Pasted in correct #2 expression (thanks davidrw)

—Brad
"The important work of moving the world forward does not wait to be done by perfect men." George Eliot

Re: Regex to strip, keep, or add—depending by davidrw (Prior) on Aug 06, 2005 at 15:08 UTC
Your first and second regexes are identical ... what was the actual code for the second attempt?

Update: This works:

$_ =~ s/^&(\w+)([()]*)/ $1 . ($2 || '()') /e;

Though I'm curious to see someone post a solution that doesn't require the /e modifier...
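One way to avoid /e, offered only as a sketch for the four sample lines in the question: split the job into two plain substitutions, one for calls with no argument list and one for calls that already have one. The helper name fix_call is made up for illustration and does not appear in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): rewrite one line without /e.
sub fix_call {
    my ($line) = @_;
    # Case 1: no "(" follows the name, so drop the "&" and append "()".
    # The \b keeps \w+ from backtracking to a shorter name.
    $line =~ s/^&(\w+)\b(?!\()/$1()/
        # Case 2: an argument list follows, so only drop the "&".
        or $line =~ s/^&(\w+)(?=\()/$1/;
    return $line;
}

print fix_call($_), "\n"
    for '&foobar', '&foobar()', 'foobar()', '&foobar("asdf")';
```

Because the argument list is only looked at, never consumed, this version does not need to match the closing paren at all. It still assumes the call starts the line, as the original attempts do.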
Re: Regex to strip, keep, or add—depending by JediWizard (Deacon) on Aug 06, 2005 at 15:41 UTC
$_ =~ s/^&(\w+)([\(\)]*)/$1()/;

The above matches an ampersand (at the beginning of the string/line), followed by one or more word characters (in group 1), followed by any number of parentheses (in group 2).

&foobar("asdf")   # will only match &foobar(

It appears you are trying to use Perl code to modify Perl code. This is not usually a safe idea (unless you really, really know what you are doing). The Perl parser is very complex, and will account for more things than you are going to want to deal with in a regex.

That having been said, I believe the following is as close as I can get to what you mean:

s/^(&\w+)$?((?:(?!;).)*)$?;/$1($2);/

But keep in mind that this will fail in a situation like:

&foobar("a", 5, qw(dave jen anne matt), map({tr/[a-z]/[A-Z]/; $_;} @mixedCase));

The Perl parser will understand that, but the supplied regex will not.

They say that time changes things, but you actually have to change them yourself. -- Andy Warhol
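To see why the first pattern from the question mangles &foobar("asdf"), it can help to print what each group captures. This is a small illustrative sketch; the helper name captures is an assumption, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return ($1, $2) for the question's first pattern,
# or an empty list when the line does not match.
sub captures {
    my ($line) = @_;
    return $line =~ /^&(\w+)([\(\)]*)/;
}

for my $line ('&foobar', '&foobar()', '&foobar("asdf")') {
    my ($name, $parens) = captures($line);
    print "$line => name='$name' parens='$parens'\n";
}
```

For the last line, group 2 holds only the opening paren, because the character class stops at the double quote. The substitution then replaces &foobar( with foobar(), which leaves "asdf") dangling after it.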
Re^2: Regex to strip, keep, or add—depending by bradcathey (Prior) on Aug 06, 2005 at 15:53 UTC
"It appears you are trying to use perl code to modify perl code. This is not usually a safe idea (unless you really really know what you are doing). The Perl parser is very complex, and will account for more things than you are going to want to deal with in a regex."

Actually, I'm grepping it in BBEdit's Find and Replace function. I will not be using it in Perl.
Re^3: Regex to strip, keep, or add—depending by JediWizard (Deacon) on Aug 06, 2005 at 16:08 UTC
I am not familiar with BBEdit, but if you are not using Perl... why are you asking the question at PerlMonks? Regexes in Perl may not be the same as regexes in other languages or other programs.
Re: Regex to strip, keep, or add—depending by chester (Hermit) on Aug 06, 2005 at 16:32 UTC
This really seems like a job for Perl rather than a text editor's internal find/replace, especially since you're playing with parentheses; nested parens are notoriously difficult to regex properly.

The following monstrosity accounts for nested parens, and also isn't limited to the start of line; it tries to avoid messing with things that don't look like function calls. This DOESN'T work on the last example (nested function calls which need to be fixed). I'm sure there are some other quirks to Perl syntax which this doesn't catch.

use Regexp::Common qw/balanced/;

while (<DATA>) {
   chomp;
   my $bp = qr/$RE{balanced}{-parens=>'()'}{-keep}/;
   $_ =~ s{
      &?                  # optional &
      (                   # capture either:
         (?<=&) \w+       #    bareword preceded by &
         |                #    or
         \w+ (?=\s\()     #    bareword followed by (
      )                   #
      (?: \s (?=\())?     # optional white space, if followed by (
      (?:                 # either:
         $bp              #    balanced parens
         |                #    or
                          #    nothing!
      )
   }
   # if no second capture, supply an empty ()
   { $1 . ($2 || '()') }exg;
   print "$_\n";
}

__DATA__
&foobar
&foobar()
foobar()
&foobar("asdf")
&foobar( test(), qw(etc.))
&foobar( test(test() . test()) )
&foobar("a", 5, qw(dave jen anne matt), map({tr/[a-z]/[A-Z]/; $_;} @mixedCase));
&dont_change if not &function()
&foobar ('<- white space?')
&foobar( &test_nested )
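For comparison, nested parens can also be matched without Regexp::Common by using a recursive group, which Perl supports from version 5.10 onward. The sketch below only extracts the balanced argument list of the first &call on a line; the helper name first_call is hypothetical, and the pattern knows nothing about parens inside strings or comments.

```perl
use strict;
use warnings;

# Balanced parentheses: "(", then any mix of non-paren runs or nested
# groups, then ")". (?-1) recurses into the enclosing capture group.
# Requires Perl 5.10 or later.
my $parens = qr/(\((?:[^()]++|(?-1))*+\))/;

# Hypothetical helper: return (name, argument list) for the first &call
# on a line, or an empty list if there is none.
sub first_call {
    my ($line) = @_;
    return $line =~ /&(\w+)\s*$parens/;
}

for my $line ('&foobar("asdf")', '&foobar( test(test() . test()) )') {
    my ($name, $args) = first_call($line);
    print "$name => $args\n";
}
```

A quoted string containing an unbalanced paren, such as ")", would still confuse this pattern, which is one more instance of the point made in this thread about the limits of regexes on Perl source.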
Re^2: Regex to strip, keep, or add—depending by JediWizard (Deacon) on Aug 06, 2005 at 16:46 UTC
FYI... The example I gave will not have issues with nested parens. The * quantifier (being greedy) would match the last paren followed by a semicolon (thus nested parens would not be an issue, unless there were nested parens within the nested function call). This further proves my point that the Perl parser will handle more than you would ever want to try with a regex.

Update: I am wrong. The regex I posted will actually match the first ) followed by a ;... which will break in a nested function call. However, nested parens aside, my original point remains valid: you (probably) do not want to attempt to account for everything the Perl parser will account for.
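The behaviour described in the update can be checked in isolation. The sketch below uses only the (?:(?!;).)* construct from the posted regex, not the full expression, and shows that it stops at the first semicolon even when that semicolon sits inside a nested block. The helper name up_to_semicolon and the shortened sample line are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text before the first semicolon, using
# the same (?:(?!;).)* construct as the posted regex.
sub up_to_semicolon {
    my ($code) = @_;
    my ($head) = $code =~ /^((?:(?!;).)*);/;
    return $head;
}

print up_to_semicolon('&foobar(map({tr/a-z/A-Z/; $_;} @mixedCase));'), "\n";
# prints: &foobar(map({tr/a-z/A-Z/
```

The match ends inside the map block, well before the closing paren of the outer call, so any replacement built on it would cut the statement in the wrong place.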
Re^3: Regex to strip, keep, or add—depending by chester (Hermit) on Aug 06, 2005 at 17:27 UTC
I'm in full agreement; it's unlikely that any regex will work in all cases. Mine certainly doesn't. Catching nested parens would be as far as I'd want to take it; if I was doing this for my own purposes, probably not even that far. But I wanted to try it for educational purposes (educating myself, that is).
Re^4: Regex to strip, keep, or add—depending by BUU (Prior) on Aug 06, 2005 at 22:50 UTC