carolw has asked for the wisdom of the Perl Monks concerning the following question:

It might be a primitive question, but how do I replace all parentheses and spaces with _?

$str =~ s/\s|\(|\)/_/g;

And to replace \n with a space, this doesn't work:

$str =~ s/\n/\s/;

Re: Reg exp questions
by toolic (Bishop) on Nov 07, 2014 at 18:45 UTC
    UPDATE: I originally misread the question... I thought the OP wanted to replace |, but the OP really wants to replace ().

    \s should not be used in the REPLACEMENT part of s///

    use warnings;
    use strict;

    my $str = 'a|b|c d    e  ';
    $str =~ s/[\s|]/_/g;
    print "$str\n";

    $str = "w\nx\ny\nz";
    $str =~ s/\n/ /g;
    print "$str\n";

    __END__
    a_b_c_d____e__
    w x y z

    Here is an explanation of your 1st regex (Tip #9 from the Basic debugging checklist):

    The regular expression:

    (?-imsx:\s|\(|\))

    matches as follows:

    NODE                     EXPLANATION
    ----------------------------------------------------------------------
    (?-imsx:                 group, but do not capture (case-sensitive)
                             (with ^ and $ matching normally) (with . not
                             matching \n) (matching whitespace and #
                             normally):
    ----------------------------------------------------------------------
      \s                       whitespace (\n, \r, \t, \f, and " ")
    ----------------------------------------------------------------------
     |                        OR
    ----------------------------------------------------------------------
      \(                       '('
    ----------------------------------------------------------------------
     |                        OR
    ----------------------------------------------------------------------
      \)                       ')'
    ----------------------------------------------------------------------
    )                        end of grouping
    ----------------------------------------------------------------------

      And if I have one space before the end of $str, plus one ( and one ), this doesn't work, with or without / for the ():

      $str =~ s/()[\s]/_/;
        Show your exact input strings, and your expected output strings. http://sscce.org

      How do I combine the expressions? For example, if I want to replace ( ) [ ] with the same character, how can I combine them?

      $str =~ s/()[]/"/g;

        Quoting is in M$Win style:

        C:\>perl -E "my $str='abc()[]xyz'; $str =~ s/[()\[\]]{1}/\"/g; say $str;
        abc""""xyz

        IOW, use a character class and escaping; both of which can be found in perldoc's perlre* pages. Note that the quantifier is redundant here, but could be useful were you trying to substitute two-at-a-time. OTOH, the /g modifier is NOT optional for this use case.
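For readers not on a Windows shell, the same two substitutions can be sketched as a small script, which avoids the command-line quoting. The variable names ($one, $two, $each, $pairs) are illustrative only; the input strings are the ones used in this reply.

```perl
use strict;
use warnings;

my $one = 'abc()[]xyz';
# One bracketed class covers all four characters; [ and ] are escaped inside it.
(my $each = $one) =~ s/[()\[\]]/"/g;

my $two = 'a)bc([]xyz';
# {2} only matches two class members in a row, so the lone ")" and "]" stay.
(my $pairs = $two) =~ s/[()\[\]]{2}/"/g;

print "$each\n";     # abc""""xyz
print "$pairs\n";    # a)bc"]xyz
```

Dropping the {1} quantifier changes nothing, which is why it is described as redundant above.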

        C:\>perl -E "my $str='a)bc([]xyz'; $str =~ s/[()\[\]]{2}/\"/g; say $str;
        a)bc"]xyz

        ++$anecdote ne $data


Re: Reg exp questions
by davido (Cardinal) on Nov 07, 2014 at 20:36 UTC

    I'd like to clarify: You want to replace all space characters except for newline with underscore. You want to replace ( or ) with underscore. And you want to replace newline with space (0x20). Is that correct?

    $str =~ s/[\h()]/_/g;
    $str =~ s/\n/ /g;

    One issue you are probably having right now is that \s includes \n, so your newlines are being replaced with _ (underscore) before the second substitution operator is invoked. By the time you call the second one, there are no more newlines to replace. The \h metacharacter class includes horizontal whitespace, but not vertical (i.e., not \n). \h is mentioned in perluniprops.
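A minimal sketch of the ordering problem, assuming a made-up two-line input (the string and the names $with_s and $with_h are hypothetical, not from the original question):

```perl
use strict;
use warnings;

# Hypothetical sample input: two lines, with spaces and parentheses.
my $input = "f(x) = a b\ng(y)";

# With \s, the newline counts as whitespace and becomes "_" in the first pass.
my $with_s = $input;
$with_s =~ s/[\s()]/_/g;
$with_s =~ s/\n/ /g;    # nothing left to match

# With \h, only horizontal whitespace is replaced, so "\n" survives
# until the second substitution turns it into a space.
my $with_h = $input;
$with_h =~ s/[\h()]/_/g;
$with_h =~ s/\n/ /g;

print "$with_s\n";    # f_x__=_a_b_g_y_
print "$with_h\n";    # f_x__=_a_b g_y_
```

Note that \h needs Perl 5.10 or later.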

    Another problem is that \s on the righthand side of a s/// operator is just a plain old string with an escaped s. The \ gets dropped by the "quote-like-operator" interpolation, so even if \n had matched, it would have left you with s instead of space.
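The effect of \s in the replacement can be seen with a short sketch (the input string and variable names are illustrative):

```perl
use strict;

my $str = "one\ntwo";

# In the replacement, \s is not a character class: the backslash is dropped
# and a literal "s" is inserted. (With warnings enabled, perl should flag
# the unrecognized escape.)
(my $wrong = $str) =~ s/\n/\s/;

# A literal space in the replacement is what was intended.
(my $right = $str) =~ s/\n/ /;

print "$wrong\n";    # onestwo
print "$right\n";    # one two
```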


    Dave

      If $str =~ s/[\h]/_/g; replaces every whitespace character with _, and there is more than one adjacent whitespace character, how do I replace each run of adjacent whitespace with a single _ instead of one _ per character?

        First thing: You don't need [...] if there is only one element in your character class. \h alone is already a metasymbol that specifies a character class, so if all you want is horizontal whitespace, you could say:

        s/\h/_/g;

        You asked how to permit it to accept more than one adjacent whitespace, and how to replace any number of adjacent whitespaces with a single underscore:

        s/\h+/_/g

        If by \h you really just mean 0x20 (chr 32), you could use transliteration with "squash" instead (see perlop):

        tr/ /_/s;

        This will be a little more efficient than substitution, but tr/// is far less flexible, as it doesn't have regex semantics; there is no pattern, only a list of characters to be transliterated.
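The difference between the two approaches shows up as soon as the input contains tabs. A small sketch, assuming a made-up input with both spaces and tabs (the names $text, $sub, and $tr are illustrative):

```perl
use strict;
use warnings;

# Hypothetical input: runs of spaces and a run of tabs.
my $text = "a  b\t\tc   d";

# \h+ matches any run of horizontal whitespace, tabs included.
(my $sub = $text) =~ s/\h+/_/g;

# tr/ /_/s only knows about the space character; /s squashes each run
# of translated characters into one. The tabs are left alone.
(my $tr = $text) =~ tr/ /_/s;

print "$sub\n";    # a_b_c_d
print "$tr\n";     # a_b, two tabs, then c_d
```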


        Dave

Re: Reg exp questions
by Anonymous Monk on Nov 08, 2014 at 07:59 UTC
    When you want to replace just characters, use the tr operator (simpler and faster than s///):
    my $str = "(foo + bar)\n(baz * quux)\n";
    # opening and closing parens and space => underscore
    $str =~ tr/() /_/;
    # newline => space
    $str =~ tr/\n/ /;
    print "<<$str>>\n";
    Output:
    <<_foo_+_bar_ _baz_*_quux_ >>
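One detail worth spelling out: the replacement list in tr/() /_/ is shorter than the search list, and tr repeats the last replacement character to fill the gap, so it behaves like tr/() /___/. A sketch of that equivalence, reusing the input above (the names $short, $long, and $count are illustrative):

```perl
use strict;
use warnings;

my $str = "(foo + bar)\n(baz * quux)\n";

# Short replacement list: "_" is reused for ")" and " " as well.
(my $short = $str) =~ tr/() /_/;

# Fully spelled-out replacement list: same result.
(my $long = $str) =~ tr/() /___/;

# tr returns the number of characters it translated.
my $count = ($str =~ tr/() /_/);

print $short eq $long ? "same\n" : "different\n";    # same
print "$count\n";                                    # 8
```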