regex capture case

rmflow has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: regex capture case by moritz (Cardinal) on Jun 23, 2009 at 07:09 UTC
In Perl 6 there's the `:samecase` regex modifier for a similar purpose (though it applies to substitutions and not captures). I implemented that functionality in Perl6::Str. If you don't want to use the whole module, you can still get some inspiration on how to implement such a function.
Re: regex capture case by grizzley (Chaplain) on Jun 23, 2009 at 07:15 UTC
No, it isn't possible. `$1` is a part of the text from `$_`, so whatever is in `$_` is what you will get in `$1`.
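A quick way to see this for yourself. The input string and the pattern here are made up for illustration, borrowing the "aAbc" example used elsewhere in this thread:

```perl
use strict;
use warnings;

# Hypothetical input and pattern, chosen only to illustrate the point.
my $text      = "xx aAbc yy";
my $someRegex = "Aa.";

my $captured;
if ( $text =~ /($someRegex)/i ) {
    # $1 holds the matched slice of $text, in the case it has in $text,
    # not in the case the pattern was written in.
    $captured = $1;
}
print "captured: $captured\n";    # prints "captured: aAb"
```

The /i flag only affects whether the text matches; it never rewrites what ends up in the capture.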
Re: regex capture case by chomzee (Beadle) on Jun 23, 2009 at 07:22 UTC
Can't you just modify "aAbc" into "Aabc" afterwards? You could use substr to cut off the first two characters and then concatenate "Aa" with the result of substr.
Re^2: regex capture case by rmflow (Beadle) on Jun 23, 2009 at 08:53 UTC
The nature of the regex is not known; this is just an example. To do what you say I would need to perform a full parse of the regex, and that would not be a "simple solution". I mean, I would need to distinguish regexes like "Dd" and "D\d", etc.
Re: regex capture case by Crian (Curate) on Jun 23, 2009 at 10:26 UTC
I don't know if it is the right solution for your problem, but have you thought about using ucfirst on the result?
Re^2: regex capture case by rmflow (Beadle) on Jun 23, 2009 at 10:32 UTC
What if `$_ = "Aabc"; $someRegex = "aA.*";`? Then I'll need to uc the second char. `$_` and `$someRegex` are not known.
Re^3: regex capture case by Porculus (Hermit) on Jun 23, 2009 at 15:04 UTC
What about `$someRegex = "(aA|Aa).";` ...what effect should that have? Or what about `$someRegex = "(?!aa)[Aa]+.";`?

Basically, what you're asking for is not feasible in the general case. If you know that `$someRegex` will always be very simple, it becomes tractable; you can use something like YAPE::Regex to parse the regex and match elements thereof to the extracted string to determine which characters' case needs to change. But for arbitrary regexes, it gets hard quickly.

Depending on what you actually need this for, you may find it simpler to request both a regex and a case template, or to have your input take the form of a restricted pattern that you yourself can then translate into a regex and a case template.
Re: regex capture case by Marshall (Canon) on Jun 23, 2009 at 12:32 UTC
Your match statement, `/($regex)/i`, means: match any of Aa, aA, aa, AA, followed by zero or more characters. The /i means case-insensitive. Also, "Aa." does not anchor the expression to the front, so "Baababoo" would match too. Anchor the regex with the ^ character: "^Aa.". But it appears that to make this work you should just delete the "i". I am assuming that you misspoke and ($regex) should be ($someRegex). Of course it is possible that I've misunderstood your intent. Aa.* means 'A', then 'a', then anything, and "anything" is by definition case-insensitive.
Re: regex capture case by dsheroh (Monsignor) on Jun 23, 2009 at 12:20 UTC
As already noted by grizzley, `$1` will contain the text from `$_` which matched the regex and the text will be in exactly the same form as it appeared in `$_`. You will need to first extract the matching text with the regex and then carry out any necessary alterations of the match. If you tell us the rules used to determine what alterations need to be made to `$1`, then we may be able to suggest the most efficient ways of accomplishing that, but the regex itself will not be able to do that for you.
Re^2: regex capture case by rmflow (Beadle) on Jun 23, 2009 at 15:14 UTC
"If you tell us the rules used to determine what alterations need to be made to $1, then we may be able to suggest the most efficient ways of accomplishing that"

The objective is to make some rules for the formatting of certain texts, for example:

`rule: PowerGenerator\d+`
`text: pOwerGeNERator53`

The text should be transformed to PowerGenerator53. Another example:

`rule: Data\d+Bus_[ABC]\d+`
`text: DATA5Bus_b3`

This should be converted to Data5Bus_B3. If the text does not match the rule, then no changes should be made.
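For rules this simple, the restricted-pattern idea suggested by Porculus can be sketched roughly as follows. This is only a sketch under a loud assumption: a rule contains nothing but literal characters, `\d+`, and plain character classes like `[ABC]`. The `apply_rule` helper and its tokenising approach are hypothetical, not something anyone in the thread posted, and anything outside that restricted syntax (alternation, lookaheads, other quantifiers) is not handled.

```perl
use strict;
use warnings;

# Hypothetical helper: return $text rewritten in the case dictated by $rule,
# or $text unchanged if it does not match the rule at all.
# ASSUMPTION: $rule uses only literal chars, \d+ and classes such as [ABC].
sub apply_rule {
    my ( $rule, $text ) = @_;

    # Split the rule into tokens: \d+, a bracketed class, or one literal char.
    my @tokens = $rule =~ / \\d\+ | \[ [^\]]+ \] | . /gxs;

    # Build a pattern with one capture group per token.
    my $pattern = '';
    for my $tok (@tokens) {
        my $is_meta = $tok eq '\d+' || $tok =~ /\A\[/;
        $pattern .= '(' . ( $is_meta ? $tok : quotemeta $tok ) . ')';
    }

    # Match the whole text case-insensitively; leave non-matching text alone.
    my @parts = $text =~ /\A$pattern\z/i or return $text;

    my $out = '';
    for my $i ( 0 .. $#tokens ) {
        my ( $tok, $got ) = ( $tokens[$i], $parts[$i] );
        if ( $tok eq '\d+' ) {
            $out .= $got;    # digits have no case
        }
        elsif ( $tok =~ /\A\[/ ) {
            # Keep whichever case variant the class accepts case-sensitively.
            my ($fixed) = grep { /\A$tok\z/ } $got, uc $got, lc $got;
            $out .= defined $fixed ? $fixed : $got;
        }
        else {
            $out .= $tok;    # literal: use the spelling from the rule
        }
    }
    return $out;
}

print apply_rule( 'PowerGenerator\d+',    'pOwerGeNERator53' ), "\n";    # PowerGenerator53
print apply_rule( 'Data\d+Bus_[ABC]\d+', 'DATA5Bus_b3' ),      "\n";    # Data5Bus_B3
print apply_rule( 'PowerGenerator\d+',    'SomethingElse' ),    "\n";    # SomethingElse
```

The rule itself doubles as the case template here: literals are copied from the rule, and only the variable parts are taken from the text.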
Re^3: regex capture case by johngg (Canon) on Jun 23, 2009 at 17:09 UTC
Given your rules, this code seems to do what you want but it uses a string eval, the use of which should be treated with caution.

```perl
use strict;
use warnings;

my @phrases = (
    q{Supply from pOwerGeNERator53 today.},
    q{DATA5Bus_C3 routed via PoweRgeNerator71 to data17buS_a3},
    q{The newPowerGenErATor6 will not change},
);

my %rules = (
    q{(?i)\bpowergenerator(\d+)\b} =>
        q{qq{PowerGenerator$1}},
    q{(?i)\bdata(\d+)bus_([ABC])(\d+)\b} =>
        q{qq{@{ [ qq{Data$1Bus_} . uc $2 . $3 ] }}},
);

foreach my $phrase ( @phrases )
{
    print qq{Original: $phrase\n};
    my $newPhrase = $phrase;
    foreach my $rule ( keys %rules )
    {
        $newPhrase =~ s{$rule}{ eval $rules{ $rule } }eg;
    }
    print qq{ Amended: $newPhrase\n\n};
}
```

The output.

```
Original: Supply from pOwerGeNERator53 today.
 Amended: Supply from PowerGenerator53 today.

Original: DATA5Bus_C3 routed via PoweRgeNerator71 to data17buS_a3
 Amended: Data5Bus_C3 routed via PowerGenerator71 to Data17Bus_A3

Original: The newPowerGenErATor6 will not change
 Amended: The newPowerGenErATor6 will not change
```

I hope this is of interest.

Cheers,

JohnGG
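If the string eval is a concern, one possible variation is to pair each regex with a code reference that builds the replacement. This is a sketch of that swapped-in technique, not JohnGG's code: the `@rules` layout and the `amend` helper are made-up names, and it assumes no rule needs more than three captures.

```perl
use strict;
use warnings;

# Each rule is [ compiled regex, sub building the replacement from captures ].
# Same two rules as above; the array-of-pairs layout is illustrative.
my @rules = (
    [
        qr/\bpowergenerator(\d+)\b/i,
        sub { "PowerGenerator$_[0]" },
    ],
    [
        qr/\bdata(\d+)bus_([ABC])(\d+)\b/i,
        sub { "Data$_[0]Bus_" . uc( $_[1] ) . $_[2] },
    ],
);

# Hypothetical helper: apply every rule to a phrase and return the result.
sub amend {
    my ($phrase) = @_;
    for my $rule (@rules) {
        my ( $re, $build ) = @$rule;

        # ASSUMPTION: at most three capture groups per rule; unused
        # captures are passed as undef and ignored by the sub.
        $phrase =~ s{$re}{ $build->( $1, $2, $3 ) }eg;
    }
    return $phrase;
}

print amend('DATA5Bus_C3 routed via PoweRgeNerator71 to data17buS_a3'), "\n";
# prints "Data5Bus_C3 routed via PowerGenerator71 to Data17Bus_A3"
```

Because the replacements are ordinary subs rather than strings handed to eval, nothing from the rule table is compiled at run time, and keeping the rules in an array also makes the order in which they are applied predictable.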
Re^4: regex capture case by rmflow (Beadle) on Jun 24, 2009 at 09:37 UTC