Re: Reg exp questions
by toolic (Bishop) on Nov 07, 2014 at 18:45 UTC
use warnings;
use strict;
my $str = 'a|b|c d    e  ';
$str =~ s/[\s|]/_/g;
print "$str\n";
$str = "w\nx\ny\nz";
$str =~ s/\n/ /g;
print "$str\n";
__END__
a_b_c_d____e__
w x y z
Here is an explanation of your 1st regex (Tip #9 from the Basic debugging checklist):
The regular expression:
(?-imsx:\s|\(|\))
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
 |                        OR
----------------------------------------------------------------------
  \(                       '('
----------------------------------------------------------------------
 |                        OR
----------------------------------------------------------------------
  \)                       ')'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
$str =~ s/()[\s]/_/;
Show your exact input strings, and your expected output strings. http://sscce.org
C:\>perl -E "my $str='abc()[]xyz'; $str =~ s/[()\[\]]{1}/\"/g; say $str;
abc""""xyz
IOW, use a character class and escaping; both of which can be found in perldoc's perlre* pages. Note that the quantifier is redundant here, but could be useful were you trying to substitute two-at-a-time. OTOH, the /g modifier is NOT optional for this use case.
C:\>perl -E "my $str='a)bc([]xyz'; $str =~ s/[()\[\]]{2}/\"/g; say $str;
a)bc"]xyz
Re: Reg exp questions
by davido (Cardinal) on Nov 07, 2014 at 20:36 UTC
I'd like to clarify: You want to replace all space characters except for newline with underscore. You want to replace ( or ) with underscore. And you want to replace newline with space (0x20). Is that correct?
$str =~ s/[\h()]/_/g;
$str =~ s/\n/ /g;
One issue you are probably having right now is that \s includes \n, so your newlines are being replaced with _ (underscore) before the second substitution operator is invoked. By the time you call the second one, there are no more newlines to replace. The \h character class matches horizontal whitespace, but not vertical (i.e., not \n). \h is documented in perlrecharclass and is also mentioned in perluniprops.
Another problem is that \s on the right-hand side of an s/// operator is just a plain old string with an escaped s. The \ gets dropped by the "quote-like-operator" interpolation, so even if \n had matched, it would have left you with a literal s instead of a space.
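To make the ordering issue concrete, here is a minimal sketch that runs both variants on the same string. The sample input and the variable names are made up for illustration; they are not from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample input: spaces, a tab, parentheses, and a newline.
my $input = "f(x) y\tz\nnext line";

# With \s the newline is already replaced in the first pass,
# so the second substitution finds nothing left to do.
(my $with_s = $input) =~ s/[\s()]/_/g;
my $nl_count_s = ($with_s =~ s/\n/ /g) || 0;

# With \h only horizontal whitespace is replaced,
# so the newline survives until the second substitution.
(my $with_h = $input) =~ s/[\h()]/_/g;
my $nl_count_h = ($with_h =~ s/\n/ /g) || 0;

print "\\s: $with_s ($nl_count_s newline(s) replaced)\n";
print "\\h: $with_h ($nl_count_h newline(s) replaced)\n";
```

In the \s variant the second substitution reports zero replacements, which is the symptom described above. In the \h variant it replaces the one newline with a space.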
First thing: You don't need [...] if there is only one element in your character class. \h alone is already a metasymbol that specifies a character class, so if all you want is horizontal whitespace, you could say:
s/\h/_/g;
You asked how to permit it to accept more than one adjacent whitespace, and how to replace any number of adjacent whitespaces with a single underscore:
s/\h+/_/g
If by \h you really just mean 0x20 (chr 32), you could use transliteration with "squash" instead (see perlop):
tr/ /_/s;
This will be a little more efficient than substitution, but tr/// is far less flexible, as it doesn't have regex semantics; there is no pattern, only a list of characters to be transliterated.
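The difference between the two approaches shows up as soon as the input contains something other than a plain space. A small sketch, using a made-up sample string with runs of spaces and one tab:

```perl
use strict;
use warnings;

# Hypothetical sample input with runs of spaces and one tab.
my $input = "a  b   c\td";

# Regex version: any run of horizontal whitespace becomes one underscore.
(my $by_regex = $input) =~ s/\h+/_/g;

# tr version: only the literal space is mapped to an underscore, and /s
# squashes each run of replaced characters. The tab is left untouched.
(my $by_tr = $input) =~ tr/ /_/s;

print "s///: $by_regex\n";
print "tr  : $by_tr\n";
```

Both collapse the runs of spaces, but only the s/\h+/_/g version also converts the tab.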
Re: Reg exp questions
by Anonymous Monk on Nov 08, 2014 at 07:59 UTC
When you want to replace just characters, use the tr operator (simpler and faster than s///):
my $str = "(foo + bar)\n(baz * quux)\n";
# opening and closing parens and space => underscore
$str =~ tr/() /_/;
# newline => space
$str =~ tr/\n/ /;
print "<<$str>>\n";
Output:
<<_foo_+_bar_ _baz_*_quux_ >>
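Because tr maps characters position by position, the two steps can also be folded into a single pass. This is only a sketch of that variant, reusing the same sample string:

```perl
use strict;
use warnings;

my $str = "(foo + bar)\n(baz * quux)\n";

# '(', ')' and ' ' each map to '_'; "\n" maps to ' '.
# Every character is looked up once in the original string, so the
# spaces produced from newlines are not turned into underscores.
$str =~ tr/() \n/___ /;

print "<<$str>>\n";
```

This should give the same result as the two separate tr calls, since a single tr never re-examines characters it has already replaced.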