Polyglot has asked for the wisdom of the Perl Monks concerning the following question:
Here are a couple such pieces of regex that I have attempted to assign to a variable:

    #NO FORWARD O-ANG WITHOUT A THAI MUTE/CONSONANT ENDING
    my $nfw_oang = eval(qr%
        (?!\p{IsOang}(?!
            (?:\p{InThaiFinCons}){1,2}
            (?![\p{InThaiCompVowel}\p{InThaiPostVowel}\p{InThaiTone}])
            (?:\p{InThaiMute})
        ))%);

    #INITIAL CONSONANT(S)
    my $initialconsonant = eval(qr%
        (?:
            (?:
                (?:\p{InThaiDualC1})
                (?:\p{InThaiDualC2})
            )
            |
            (?:\p{InThaiCons})
        )%);

And here's how those might look nested into the full regular expression (shortened for demo only, and based on my Thai module):

    my $space = q':ThIsWiLlBeAsPaCe:';

    my $syllables = $text =~ s/
    (
    #SORT SYLLABLES BY VOWEL LENGTH CATEGORY
    #MATCH LONGEST FIRST
    #------------------------------------------------
    (?:  #Compound four-character vowels (three of them)
        (?:\p{IsSarae})                     #SARA-E PRE-VOWEL
        (?:\p{InThaiCons}){1,2}             #CONSONANT(S)
        (?:\p{InThaiTone})?                 #OPTIONAL TONE MARK (DEP. ON TYPING ORDER)
        (?:[\p{IsSaraii}\p{IsSarauee}])     #ONE OF THESE COMP. VOWELS
        (?:\p{InThaiTone})?                 #OPTIONAL TONE MARK (DEP. ON TYPING ORDER)
        (?:[\p{IsOang}\p{IsYoyak}]).        #ONE OF THESE
        (?:[\p{IsSaraa}\p{IsWowaen}]).      #THE SHORTENING VOWEL -or- WO-WAEN
    )                                       #NOTE: The wo-waen version not on standard vowel charts
    |
    #------------------------------------------------
    (?:  #Compound three-character vowels (six of them)
        (?:  #With pre-vowel & comp. vowel, no shortening post-vowel (2)
            (?:\p{IsSarae})                 #SARA-E PRE-VOWEL
            (?:\p{InThaiCons}){1,2}         #CONSONANT(S)
            (?:\p{InThaiTone})?             #OPTIONAL TONE MARK (DEP. ON TYPING ORDER)
            (?:[\p{IsSaraii}\p{IsSarauee}]) #ONE OF THESE COMP. VOWELS
            (?:\p{InThaiTone})?             #OPTIONAL TONE MARK (DEP. ON TYPING ORDER)
            (?:[\p{IsOang}\p{IsYoyak}])     #ONE OF THESE
            (?:\p{InThaiFinCons}){0,3}      #OPTIONAL SYLLABLE-ENDING CONSONANT(S)
            (?![\p{InThaiCompVowel}\p{InThaiPostVowel}\p{InThaiTone}]) #NOT ONE OF THESE!!!
            ${nfw_oang}                     # <---- VARIABLE HERE !!!
            (?:\p{InThaiMute})?             #OPTIONAL THAI MUTE CHARACTER (GARAN)
        )
    )

    # [ SNIP ]

    |
    #------------------------------------------------
    (?:  #Single-character vowels (eighteen of them)
        (?:  #Pre-consonant "I" vowels (2)
            (?:[\p{IsSaraaimaimuan}\p{IsSaraaimaimalai}]) #"I" PRE-VOWEL
            ${initialconsonant}             # <---- VARIABLE HERE !!!
            (?![\p{InThaiCompVowel}\p{InThaiPostVowel}]) #NOT ONE OF THESE!!!
            (?:\p{InThaiTone})?             #OPTIONAL TONE MARK
        )
    )

    )
    /$space.$1.$space/egx;

    $text =~ s!(?:$space)+! !g;

I've tried it with and without the "eval", and I have removed the in-line comments for the variable assignments "just in case." Still, the match seems not to work as expected. Either it does not succeed (no matches at all), as it did before I had replaced the code blocks with the variables for them, or it matches more than it was expected to--depending on whether I have added the "eval" or not.
I had not expected this to be difficult. Now I'm puzzled as to what I might be doing wrong. Googling for answers did not enlighten me--the answers led me to believe it should be working as-is (but it isn't). I had the code working before attempting to replace the blocks with variables, so it seems this is the only variable (no pun intended) here.
Ideas are welcome, and thank you!
Blessings,
~Polyglot~
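
The two symptoms described above (no matches without "eval", too many matches with it) can be reproduced in a small, self-contained sketch. The fragment pattern, the variable names and the test string below are hypothetical stand-ins, not the Thai module's code. The sketch relies on two things: a compiled qr object carries its own flags, so the outer /x does not apply inside it, and eval given a qr object treats the stringified pattern as Perl source. Whether this is the whole story for the full regex is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a reusable fragment.
# No /x on the qr: the spaces inside it are literal characters.
my $frag_no_x = qr% (?: ab | c ) %;

# Same fragment with /x on the qr itself: whitespace is ignored.
my $frag_x = qr% (?: ab | c ) %x;

# eval() with a qr object stringifies it and tries to compile the text
# as Perl code. That fails, so eval returns undef and sets $@.
my $frag_eval  = eval(qr% (?: ab | c ) %);
my $eval_error = $@;

my $text = 'zabz';

# The outer /x does not reach into the interpolated qr objects.
my $hit_no_x = ($text =~ / z ${frag_no_x} z /x) ? 1 : 0;
my $hit_x    = ($text =~ / z ${frag_x} z /x)    ? 1 : 0;

printf "qr without /x: %s\n", $hit_no_x ? 'match' : 'no match';
printf "qr with /x:    %s\n", $hit_x    ? 'match' : 'no match';
printf "eval(qr...):   %s\n",
    defined $frag_eval ? 'defined' : 'undef (fragment would interpolate as empty)';
```

If this reading applies, an undefined fragment interpolates as an empty string, which removes that constraint from the big regex and would let it match more than intended. Dropping "eval" and adding the x flag to each qr is the first change worth trying.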

Replies are listed 'Best First'.

Re: Repeated code blocks in long and hairy regex
by Corion (Patriarch) on Nov 05, 2023 at 10:25 UTC
by Polyglot (Chaplain) on Nov 05, 2023 at 11:12 UTC
by Corion (Patriarch) on Nov 05, 2023 at 11:23 UTC
by Polyglot (Chaplain) on Nov 05, 2023 at 12:04 UTC
by Corion (Patriarch) on Nov 05, 2023 at 13:58 UTC

Re: Repeated code blocks in long and hairy regex
by ikegami (Patriarch) on Nov 06, 2023 at 14:51 UTC

Re: Repeated code blocks in long and hairy regex
by NERDVANA (Priest) on Nov 06, 2023 at 19:00 UTC