Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

If I have the following data:
(1.3.56.84
56.38.m.26)
(56.2.3.59)
How would I remove only the first '(' or last ')' if a pair isn't seen: i.e.
(1.3.56.84 => 1.3.56.84
56.38.m.26) => 56.38.m.26
(56.2.3.59) => (56.2.3.59)
Any ideas?

Replies are listed 'Best First'.
Re: Removing only the first instance with RegEx
by Abigail-II (Bishop) on Nov 06, 2003 at 14:14 UTC
    #!/usr/bin/perl

    use strict;
    use warnings;

    my $bal;
    $bal = qr /[^()]* (?: [(] (??{ $bal }) [)] )* [^()]*/x;

    while (<DATA>) {
        chomp;
        print $& while /$bal/g;
        print "\n";
    }

    __DATA__
    (1.3.56.84
    56.38.m.26)
    (56.2.3.59)
    ((one)(two))
    ((one(two))
    ((one)(two)
    (one)(two))
    ((one)two))

    Running this gives:

    1.3.56.84
    56.38.m.26
    (56.2.3.59)
    ((one)(two))
    (one(two))
    (one)(two)
    (one)(two)
    ((one)two)

    Abigail

      For the most part I understand what it's doing, but could you explain the '*' in the RegEx?

      Thanks

        For the most part I understand what it's doing, but could you explain the '*' in the RegEx?

        Fascinating. You understand compiled regular expressions, delayed regular expressions, $&, but not '*', one of the most used special tokens in a regular expression?

        Anyway, as 'man perlre' could have told you, '*' means that the thing in front of it can be matched zero or more times.

        Abigail
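
As a small illustration of the "zero or more" behaviour described above, here is a minimal sketch using the same `[^()]*` piece that appears in the regex. The helper name `leading_non_parens` is made up for this example and is not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: return whatever /^[^()]*/ matches at the start of $s.
# '[^()]*' means "zero or more characters that are not parentheses",
# so it still matches (with an empty result) when the string starts with '('.
sub leading_non_parens {
    my ($s) = @_;
    my ($run) = $s =~ /^([^()]*)/;
    return $run;
}

for my $s ('abc(def)', '(def)', 'abc') {
    print "'$s' => '", leading_non_parens($s), "'\n";
}
```

The first string yields 'abc', the second yields the empty string (zero repetitions still counts as a match), and the third yields 'abc'.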

Re: Removing only the first instance with RegEx
by Roger (Parson) on Nov 06, 2003 at 14:25 UTC
    You could use the plain old substr and length functions instead of Regular Expressions.

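    use strict;

    while (<DATA>) {
        chomp;
        # First and last characters of the line
        my ($c1, $c2) = (substr($_,0,1), substr($_,length($_)-1,1));
        # Compare strings with eq/ne; == and != compare numerically
        if ($c1 eq '(' && $c2 ne ')') {
            substr($_,0,1) = '';              # unmatched leading '('
        }
        elsif ($c1 ne '(' && $c2 eq ')') {
            substr($_,length($_)-1,1) = '';   # unmatched trailing ')'
        }
        print "$_\n";
    }

    __DATA__
    (1.3.56.84
    56.38.m.26)
    (56.2.3.59)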
    And the output is
    1.3.56.84
    56.38.m.26
    (56.2.3.59)
Re: Removing only the first instance with RegEx
by Roy Johnson (Monsignor) on Nov 06, 2003 at 18:49 UTC
    s/\(// unless /\)/;
    s/\)// unless /\(/;
    or
    if (! /\)/) { s/\(// } elsif (! /\(/) { s/\)// }
    since at most one of the substitutions is needed.

    These don't ensure balanced parens, as Abigail's answer does, but that wasn't your question. In each case, the first occurrence of a parenthesis will be removed, if there is no opposite parenthesis found. That means, for example, that
    )foo(
    would be left alone, as would
    (a b) c) d)
    or any other line that includes both types of parentheses.
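
To make the behaviour described above concrete, here is a minimal sketch that wraps the two substitutions in a sub and runs them over the sample data plus the two edge cases. The sub name `strip_lone_paren` is an assumption made for this example; it does not appear in the thread.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the two substitutions:
# drop the first '(' only if the line contains no ')',
# and the first ')' only if the line contains no '('.
sub strip_lone_paren {
    my ($line) = @_;
    for ($line) {               # alias $_ to our copy of the line
        s/\(// unless /\)/;
        s/\)// unless /\(/;
    }
    return $line;
}

for my $in ('(1.3.56.84', '56.38.m.26)', '(56.2.3.59)', ')foo(', '(a b) c) d)') {
    print "$in => ", strip_lone_paren($in), "\n";
}
```

Only the first two inputs change; the last three contain both kinds of parenthesis and come back untouched.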

      Roy,

      How would I combine the following expressions to remove the word unless:

      #!/usr/bin/perl

      use strict;
      use warnings;

      while (<DATA>) {
          $_ =~ s/\(// unless /(?=\))/;
          $_ =~ s/\)// unless /(?<=\()/;
          print $_, "\n";
      }

      __DATA__
      (1.3.56.84)
      56.38.m.26)
      (56.2.3.59
      56.2.3.(59)
      Thanks!
Re: Removing only the first instance with RegEx
by sweetblood (Prior) on Nov 06, 2003 at 14:14 UTC
    You could just do something like this:

    if (!/^\(.*\)$/) {s/^\(//;s/\)$//};

    Enjoy

Re: Removing only the first instance with RegEx
by Anonymous Monk on Nov 06, 2003 at 14:05 UTC
    I'm not sure what you mean. Can you elaborate on the question and maybe post a code snippet?