in reply to Reg exp questions

I'd like to clarify: You want to replace all space characters except for newline with underscore. You want to replace ( or ) with underscore. And you want to replace newline with space (0x20). Is that correct?

$str =~ s/[\h()]/_/g;
$str =~ s/\n/ /g;

One issue you are probably hitting right now is that \s includes \n, so your newlines are replaced with _ (underscore) before the second substitution operator is invoked. By the time the second one runs, there are no newlines left to replace. The \h character class matches horizontal whitespace but not vertical whitespace (i.e., not \n). \h requires Perl 5.10 or later; it is documented in perlrecharclass and also listed in perluniprops.
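To make the difference concrete, here is a minimal sketch comparing the two character classes. The sample string and the variable names are made up for illustration; they are not from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample input: two lines with spaces, a tab, and parentheses.
my $input = "foo (bar)\tbaz\nqux quux\n";

# With \s, newlines become underscores in the first pass,
# so the second substitution has nothing left to replace.
my $with_s = $input;
$with_s =~ s/[\s()]/_/g;
my $newlines_left = ($with_s =~ s/\n/ /g) || 0;   # count of replacements made

# With \h (Perl 5.10+), only horizontal whitespace and parentheses
# become underscores; the newlines survive for the second pass.
my $with_h = $input;
$with_h =~ s/[\h()]/_/g;
$with_h =~ s/\n/ /g;

print "with \\s: $with_s\n";   # foo__bar__baz_qux_quux_
print "with \\h: $with_h\n";   # foo__bar__baz qux_quux (plus a trailing space)
```

In the \s version the second substitution reports zero replacements, which is the symptom described above.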

Another problem is that \s on the right-hand side of an s/// operator is not a character class. The replacement is an interpolated string, so \s there is just an escaped s. The \ gets dropped by the "quote-like operator" interpolation, so even if \n had matched, it would have left you with s instead of a space.
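A short sketch of that second problem, using a made-up two-line string. Under "use warnings" Perl normally reports the unrecognized escape, so the warning is switched off in a small block here purely to keep the output clean.

```perl
use strict;
use warnings;

my $bad  = "one\ntwo";
my $good = "one\ntwo";

{
    # \s is not a recognized escape in a replacement string, so Perl
    # passes a literal "s" through. Warnings are disabled only for
    # this statement so the demo runs quietly.
    no warnings;
    $bad =~ s/\n/\s/g;
}

# A literal space in the replacement is what was intended.
$good =~ s/\n/ /g;

print "$bad\n";    # onestwo
print "$good\n";   # one two
```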


Dave

Re^2: Reg exp questions
by carolw (Sexton) on Nov 09, 2014 at 12:36 UTC

    If $str =~ s/[\h]/_/g; replaces every whitespace character with _, how can I replace a run of adjacent whitespace characters with a single _ instead of one _ per whitespace character?

      First thing: You don't need [...] if there is only one element in your character class. \h alone is already a metasymbol that specifies a character class, so if all you want is horizontal whitespace, you could say:

      s/\h/_/g;

      You asked how to make it accept more than one adjacent whitespace character, and how to replace any run of adjacent whitespace with a single underscore. Add the + quantifier:

      s/\h+/_/g;
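As a quick illustration of what the quantifier changes, the sketch below runs both forms on a made-up string containing a run of two spaces and a run of tab, space, tab. The variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical input: "a", two spaces, "b", tab-space-tab, "c".
my $text = "a  b\t \tc";

# Without a quantifier, each whitespace character gets its own underscore.
(my $each = $text) =~ s/\h/_/g;

# With +, each run of whitespace collapses to a single underscore.
(my $runs = $text) =~ s/\h+/_/g;

print "$each\n";   # a__b___c
print "$runs\n";   # a_b_c
```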

      If by \h you really just mean 0x20 (chr 32), you could use transliteration with the "squash" modifier (/s) instead (see perlop):

      tr/ /_/s;

      This will typically be a little more efficient than substitution, but tr/// is far less flexible because it doesn't have regex semantics: there is no pattern, only a list of characters to be transliterated.
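The trade-off can be seen side by side in a small sketch. The input string is invented for the example; note that the tr/// version touches only the literal space character, so the tab is left alone.

```perl
use strict;
use warnings;

# Hypothetical input with runs of spaces and one tab.
my $line = "a   b  c\td";

# tr/// with /s: only 0x20 is translated, and each run of
# translated characters is squashed to one underscore.
(my $via_tr = $line) =~ tr/ /_/s;

# s/// with \h+: every kind of horizontal whitespace is covered.
(my $via_re = $line) =~ s/\h+/_/g;

print "$via_tr\n";   # a_b_c<TAB>d
print "$via_re\n";   # a_b_c_d
```

If tabs or other horizontal whitespace may appear in the data, the regex form is the safer choice; tr/// only fits when the set of characters is known in advance.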


      Dave