in reply to Defining Characters in Word Boundary?

 

Is it possible to define the characters that '\b' matches? I am processing LaTeX code, and its macro character space is \ a-zA-Z \. I would like to write

  

\\$keyword\b

Unless I'm misunderstanding your intentions, that's the purpose of \b.

  Example: m/\bChi_2\b/

I don't think the underscore will cause you problems between the two \b anchors, but I'm sure I'll be corrected shortly if I'm wrong. :)

Re^2: Defining Characters in Word Boundary?
by ikegami (Patriarch) on Jan 19, 2011 at 23:17 UTC
    I believe he's saying that "_2" isn't part of the macro, so $keyword = 'Chi'; '...\\Chi_2...' =~ /\\$keyword\b/ should match.

      Ugh! — Apologies to the OP. I misread that.

      "...the adversities born of well-placed thoughts should be considered mercies rather than misfortunes." — Don Quixote
Re^2: Defining Characters in Word Boundary?
by ikegami (Patriarch) on Jan 21, 2011 at 01:42 UTC

    I'm not sure why you added the update. I don't see what it adds, and it's not true. /\bChi_2\b/ will match plenty of strings.

    'Chi_2'   =~ /\bChi_2\b/   # Match
    '!Chi_2!' =~ /\bChi_2\b/   # Match

    Maybe you had a specific string in mind, but I don't see how this relates to the OP. He would not use Chi_2 in the regex pattern.

    In a world where an identifier matches /^\w+\z/, you might do something like

    ($_ = '\\Chi+3'  ) =~ s/\\Ch\b/$ch/g;    # Won't replace
    ($_ = '\\Chi+3'  ) =~ s/\\Chi\b/$chi/g;  # Will replace
    ($_ = '\\Chi_2+3') =~ s/\\Chi\b/$chi/g;  # Won't replace

    But what if identifiers match /^[a-zA-Z]+\z/? You'd want the following behaviour:

    ($_ = '\\Chi+3'  ) =~ s/\\Ch???/$ch/g;    # Won't replace
    ($_ = '\\Chi+3'  ) =~ s/\\Chi???/$chi/g;  # Will replace
    ($_ = '\\Chi_2+3') =~ s/\\Chi???/$chi/g;  # Will replace

    That's the OP's question.
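
    One common way to get that behaviour (a sketch only, not necessarily what anyone in this thread had in mind) is to put a negative lookahead where \b would go, so the "boundary" is defined by the macro alphabet a-zA-Z rather than by \w. The value of $chi and the helper name replace_chi are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical replacement text; the thread never defines $chi.
my $chi = 'X';

# Replace \Chi only when the next character is not a letter,
# i.e. when "Chi" is the complete macro name.
sub replace_chi {
    my ($text) = @_;
    $text =~ s/\\Chi(?![a-zA-Z])/$chi/g;
    return $text;
}

print replace_chi('\\Chi+3'),   "\n";   # X+3
print replace_chi('\\Chi_2+3'), "\n";   # X_2+3
print replace_chi('\\Chic+3'),  "\n";   # \Chic+3 (left alone)
```

    The lookahead consumes nothing, so whatever follows the macro name ("+3", "_2") is left in place.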

    As I've already mentioned, I recommend extracting the identifier, then checking if it's one of interest. This can be as simple as the following:

    s/\\([a-zA-Z]+)/ exists($vars{$1}) ? $vars{$1} : "\\$1" /eg

    The technique scales well, and it avoids the problem of matching something you've previously replaced.
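
    As a rough, self-contained sketch of the extract-then-look-up approach: the contents of %vars and the helper name expand_macros below are assumptions for illustration; only the substitution itself comes from the post.

```perl
use strict;
use warnings;

# Hypothetical macro table; the post only names the hash %vars.
my %vars = (
    Chi => 'X',
    Ch  => 'C',
);

# Capture each whole macro name ([a-zA-Z]+ is greedy), then replace it
# only if it is a key of %vars; otherwise put the macro back unchanged.
sub expand_macros {
    my ($text) = @_;
    $text =~ s/\\([a-zA-Z]+)/ exists($vars{$1}) ? $vars{$1} : "\\$1" /eg;
    return $text;
}

print expand_macros('\\Chi_2 + \\Ch + \\Chic'), "\n";   # X_2 + C + \Chic
```

    Because the whole name is captured before the lookup, \Chic never matches the Chi or Ch entries, and since s///g makes a single pass, replacement text is not rescanned for further macros.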

         

      "Maybe you had a specific string in mind, but I don't see how this relates to the OP. He would not use Chi_2 in the regex pattern."

      I did have a very similar string in mind.

      "In a world where an identifier matches /^\w+\z/, you might do something like"
         ($_ = '\\Chi_2+3') =~ s/\\Chi\b/$chi/g; # Won't replace

      I understand my update isn't contributing to the OP's original question. I'm not trying to distract from his post or the thread, simply attempting to correct what I said about the underscore having no effect on whether the regex matches (again, the regex I had in mind).

      In my original reply I was referring to matching 'Chi' within 'Chi_2' using \b. I previously said that I didn't think the underscore would be a problem. However, after some help in the CB from erix and Tanktalus, it was shown that an underscore would interfere with this particular match:

       say (("Chi_2" =~ /\bChi\b/) ? "match" : "no match");
       returns: "no match"

        * Thanks again to Tanktalus for this control structure.

      Again, apologies for any confusion caused.



        In my original reply I was referring to matching 'Chi' within 'Chi_2' using \b

        So you meant /\bChi\b/ wouldn't match. That makes more sense.
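
        For reference, a small sketch of why the two patterns behave differently: \b only falls between a \w character and a non-\w character (or the edge of the string), and "_" and digits both count as \w. The helper name matches is made up for this example.

```perl
use strict;
use warnings;

# Report whether a string matches a precompiled pattern.
sub matches {
    my ($string, $re) = @_;
    return $string =~ $re ? 'match' : 'no match';
}

# "i" and "_" are both \w, so there is no boundary after "Chi" here.
print matches('Chi_2', qr/\bChi\b/),   "\n";   # no match

# The boundaries sit at the start and end of the string.
print matches('Chi_2', qr/\bChi_2\b/), "\n";   # match

# "+" is not \w, so a boundary does follow "Chi".
print matches('Chi+2', qr/\bChi\b/),   "\n";   # match
```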