If you specifically want to replace just the hyphen with a space, only where it occurs between G and Y, and leave those letters unchanged, read the section of perlre that describes "look-ahead" and "look-behind" assertions.
(I use them rather a lot, and it has taken a while for me to remember the syntax for them -- I've done "perldoc perlre" many times to check on it.)
$_ = "...blah-blah*G-Y*blah-blah... here is another G-Y and-so-on";
print "BEFORE: $_\n";
s/(?<=G)-(?=Y)/ /g;
print " AFTER: $_\n";
$second_str =~ s/G-Y/G Y/;
CountZero "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
As well as knowing other substitutions that work it may help you to understand why yours fails.
Your pattern '-[^\s-~]' matches two characters: the literal '-' matches one and the character class matches another. In your string, the first pair of characters that matches the expression is '-Y'. Thus '-Y' is removed from your string and replaced with ' '.
update: rewrote to clarify.
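To see the two-character match described above, here is a small sketch. It assumes the sample string from the thread, written single-quoted so the backslashes survive, and it reorders the class to [^-\s~] (which matches the same characters) only to keep warnings quiet. The variable names are illustrative.

```perl
use strict;
use warnings;

# Sample string from the thread, single-quoted so "\s" stays literal.
my $str = q{591[\s-~]*G-Y[\s-~]*6};

# Capture what the pattern actually matches: the hyphen plus one more character.
my ($matched) = $str =~ /(-[^-\s~])/;

# Apply the same substitution to a copy of the string.
(my $result = $str) =~ s/-[^-\s~]/ /;

print "matched: '$matched'\n";   # matched: '-Y'
print "result:  $result\n";      # result:  591[\s-~]*G [\s-~]*6
```

The hyphens inside the brackets are skipped because each is followed by '~', which the class excludes.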
Or you could do something like this?
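#!/usr/bin/perl
use strict;
use warnings;
# Single-quoted (q{}) so the backslashes are kept literally.
my $second_str = q{591[\s-~]*G-Y[\s-~]*6};
# Replace a hyphen only when it sits between two word characters.
$second_str =~ s/(\w)-(\w)/$1 $2/g;
print "\nSecond Strip===========$second_str======\n";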
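One caveat with the capturing approach, sketched here on a made-up input: each match consumes the word characters on both sides of the hyphen, so in a chain like a-b-c the second hyphen is missed. A lookaround version consumes only the hyphen. The test string and variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $capture    = 'a-b-c G-Y';
my $lookaround = $capture;

# Capturing version: the "b" is consumed by the first match,
# so it cannot also begin the next one.
$capture =~ s/(\w)-(\w)/$1 $2/g;

# Lookaround version: only the hyphen itself is part of the match.
$lookaround =~ s/(?<=\w)-(?=\w)/ /g;

print "capture:    $capture\n";     # capture:    a b-c G Y
print "lookaround: $lookaround\n";  # lookaround: a b c G Y
```

If the data never has consecutive single-character fields, the two forms behave the same.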
Thanks a lot, everyone. G and Y in the example could be any chars. I have a question for ig. In your post you said that the literal '-' in -[^\s-~] matches one char and the character set matches another, which is why two chars were replaced. But the set contains a ^ symbol, which tells the regex engine to exclude the chars given in the set. So could you let me know exactly what happens in the above statement?
Thanks.
The ^ regex metacharacter at the very beginning of a character set or class complements the characters defined for the set.
The character class regex [^\s-~] matches any character that is not a whitespace character, a literal '-' (dash or hyphen) or a literal '~' (tilde). A 'Y' is not any of those characters, so it matches.
BTW — the character set [^\s-~] is better written as [^-\s~] so that the '-' character is not misunderstood (by the programmer or maintainer, not by Perl) as a character set range metacharacter. I'm guessing you are not running with warnings (and strictures), otherwise Perl would have complained about this:
>perl -wMstrict -le "my $rx = qr{ [^\s-~] }xms; print $rx;"
False [] range "\s-" in regex; marked by <-- HERE
in m/ [^\s- <-- HERE ~] / at -e line 1.
(?msx-i: [^\s-~] )
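A quick way to check which characters the negated class accepts is to test a handful directly. This is only a sketch; the list of probe characters is arbitrary, and the class is written in the reordered form [^-\s~].

```perl
use strict;
use warnings;

# Anything except a hyphen, a whitespace character, or a tilde.
my $class = qr/[^-\s~]/;

my @chars   = ('Y', '6', '[', '-', '~', ' ', "\t");
my @matched = grep { $_ =~ $class } @chars;

for my $char (@chars) {
    my $label = $char eq "\t" ? '\t' : $char;   # make the tab visible
    printf "'%s' %s\n", $label, ($char =~ $class ? 'matches' : 'does not match');
}
```

Only 'Y', '6' and '[' match; the hyphen, tilde, space and tab are all excluded by the leading ^.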
You said:
G and Y in the example could be any chars.
Well, in case your code is still not doing exactly what you want, and you're still wondering how to fix it, maybe it's time to provide some of the real examples that you are actually dealing with.
I have to say that the example string in the OP is probably not what you meant it to be. In this statement:
$second_str = "591[\s-~]*G-Y[\s-~]*6";
the value assigned to "$second_str" is actually:
591[s-~]*G-Y[s-~]*6
Do you really have literal strings in your data that look like that? If you really do, is there some special rule about hyphens that are inside square brackets (don't replace with space) vs. hyphens that are not inside square brackets (replace these with space)?
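The quoting difference can be checked directly. In this sketch the warning that Perl would normally issue for the unknown \s escape in a double-quoted string is silenced on purpose, on the assumption that it belongs to the 'misc' warning category; the variable names are illustrative.

```perl
use strict;
use warnings;

my ($double, $single);
{
    # In double quotes, \s is not a recognized escape, so the backslash
    # is dropped. Silence the resulting warning for this demonstration.
    no warnings 'misc';
    $double = "591[\s-~]*G-Y[\s-~]*6";
}
# Single quotes keep the backslashes as written.
$single = '591[\s-~]*G-Y[\s-~]*6';

print "double-quoted: $double\n";   # double-quoted: 591[s-~]*G-Y[s-~]*6
print "single-quoted: $single\n";   # single-quoted: 591[\s-~]*G-Y[\s-~]*6
```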
If it's just a matter of replacing all hyphens with spaces, then this will do it more efficiently:
tr/-/ /;
But if you have contextual constraints (e.g. must be adjacent to letters/alphanumerics, and/or must not be within bracketed strings, or whatever), you haven't really given us any clear idea about that yet.
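For the simple case of replacing every hyphen, a short sketch of tr next to the equivalent s///g, using a made-up input string. It also shows that tr returns the number of characters it handled, which makes it usable for counting.

```perl
use strict;
use warnings;

my $str = 'one-two three-four-five';

# Transliterate every hyphen to a space on a copy of the string.
(my $translated = $str) =~ tr/-/ /;

# The regex substitution gives the same result.
(my $substituted = $str) =~ s/-/ /g;

# With an empty replacement list and no /d, tr only counts; $str is unchanged.
my $count = ($str =~ tr/-//);

print "tr:    $translated\n";    # tr:    one two three four five
print "s///g: $substituted\n";   # s///g: one two three four five
print "count: $count\n";         # count: 3
```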