You are correct that \b looks at both sides. I think merlyn provided the perfect clarification: \b has alternation built into it. It behaves like either (?<=\W)(?=\w) or (?<=\w)(?=\W), depending on whether it's being used at the beginning or the end of a word (with the start and end of the string counting as if they were \W).
So \b is not a simple lookahead or lookbehind assertion; it is a compound lookbehind/lookahead in alternation with the opposing lookbehind/lookahead.
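To convince myself, here is a minimal sketch that compares where \b matches with where the two lookaround forms match. The helper name positions and the sample string are made up for illustration. I've also included a variant written with negative lookarounds, which is my own assumption about how to cover the string edges, not something merlyn posted.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the offsets at which a zero-width
# pattern matches in a string.
sub positions {
    my ($re, $str) = @_;
    my @pos;
    while ($str =~ /$re/g) {
        push @pos, pos($str);
    }
    return @pos;
}

my $str = "foo bar, baz";

# The built-in word boundary.
my @builtin = positions(qr/\b/, $str);

# The alternation as written in this post. It needs a real character
# on both sides, so it cannot match at the very start or end of the string.
my @literal = positions(qr/(?<=\W)(?=\w)|(?<=\w)(?=\W)/, $str);

# Assumed variant using negative lookarounds, which also succeed when
# there is no character at all on one side.
my @negated = positions(qr/(?<!\w)(?=\w)|(?<=\w)(?!\w)/, $str);

print "builtin: @builtin\n";   # builtin: 0 3 4 7 9 12
print "literal: @literal\n";   # literal: 3 4 7 9
print "negated: @negated\n";   # negated: 0 3 4 7 9 12
```

The only disagreement is at offsets 0 and 12, the two ends of the string, which is exactly where the "imaginary \W" comes into play.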
Similar assertions can be custom-written too. Say, for example, I wanted to create \x (my new metacharacter meaning "boundary between space and nonspace"). Well, I can't name it \x, but I suppose I could name it $x. Whatever I call it, the definition would be: (?:(?<=\s)(?=\S))|(?:(?<=\S)(?=\s))
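A quick sketch of how that might be used, assuming the definition is stored in $x with qr//. The sample text is made up, and using split is just one way to show where the assertion matches.

```perl
use strict;
use warnings;

# $x: zero-width boundary between whitespace and non-whitespace.
my $x = qr/(?:(?<=\s)(?=\S))|(?:(?<=\S)(?=\s))/;

my $text = "foo  bar baz";

# Splitting on a zero-width pattern cuts the string at each boundary
# without consuming any characters.
my @chunks = split /$x/, $text;

print join("|", @chunks), "\n";   # foo|  |bar| |baz
```

Because qr// wraps the pattern in its own group when it is interpolated, the top-level alternation in $x doesn't leak into whatever larger regex it gets embedded in.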
As for which direction (?=...) looks, I didn't really want it to look both ways at once. I just found it a little confusing that it could only look ahead. Without the benefit of merlyn's reply, and thus without fully understanding the \b example, it seemed odd that (?=...) couldn't be used for lookbehind just as easily as lookahead. I understand that distinction now.
As for my other comment about (?<=...) having to be fixed-width: I understand that, as liz stated, it would be a backtracking nightmare if that were not the case, but I still don't fully understand why. I'll have to re-read the section in the Owl book about DFA engines and backtracking. Eventually it will sink in. ;)
Dave