Now you've lost me. Earlier you said that you wanted your regexp to match the literal string '*test*'. And you provided the regexp /\b\*test\*\b/, thus spelling out the absolute need for an asterisk to precede and follow the word test in order for the match to occur.

But now you've said that you want to match both '*test*' and '--*test--'. What made you think that '--*test--' would match against a regexp that specifies '\*test\*'?

Also, \b is a zero-width assertion that specifies that there must be a word boundary at that particular position. A word boundary is the point where word characters and non-word characters meet. In ' *test* ', there is no word character on either side of the positions where your original regexp places its boundary assertions, and that's why your regexp fails. In your original question you went looking for a word boundary at the junction between a space character and an asterisk. That's not a word boundary; both are non-word characters. A word boundary is, again, a zero-width assertion. '*' is not part of a word boundary. '*' is, if next to a word character, the non-word character that creates a word boundary in the zero-width space between the word character and the asterisk. But a word boundary has no width of its own; it doesn't consume a character. \b doesn't suck anything in.
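To make that concrete, here is a small sketch that runs the regexp from the original question against three strings. The sample strings are my own, picked only to show where \b can and cannot succeed; they are not from the original question.

```perl
use strict;
use warnings;

# The regexp from the original question.
my $re = qr/\b\*test\*\b/;

# Hypothetical sample strings, chosen only to illustrate where \b can succeed.
my @samples = ( ' *test* ', '*test*', 'a*test*b' );

for my $s (@samples) {
    # \b needs a word character on exactly one side of the position.
    my $verdict = $s =~ $re ? 'matches' : 'does not match';
    print "'$s' $verdict\n";
}
```

Only the last string should match, because it is the only one that puts a word character directly next to each asterisk.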

Perhaps what you are saying is that you want 'test' to match as long as it is surrounded by word boundaries. That's easy. Note, however, that the following regexp will also match when 'test' sits at the very beginning of the string with nothing before it, because the beginning of the string can be a word boundary too:

my $string = '--*test--';
if ( $string =~ /\btest\b/ ) {
    print "$string matched.\n";
}

If you want to match both the word test and the actual non-word characters that precede and follow it, which are themselves required to be there, that's also easy:

my $string = "--*test--";
if ( $string =~ /\W+test\W+/ ) {
    print "$string matched.\n";
}

Here there's really no need for the \b, because a word boundary is implicit in the fact that you've said that one or more non-word characters must precede and follow the word 'test'.
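A rough side-by-side of the two approaches, assuming the same '--*test--' string as above; the variable names and the bare 'test' string are just for illustration. The captures show what each regexp actually consumes:

```perl
use strict;
use warnings;

my $string = '--*test--';

# \b is zero-width: only 'test' itself ends up in the match.
my ($with_b) = $string =~ /(\btest\b)/;

# \W+ consumes the surrounding non-word characters.
my ($with_W) = $string =~ /(\W+test\W+)/;

print "\\b version matched: $with_b\n";
print "\\W+ version matched: $with_W\n";

# A bare 'test' has boundaries at both ends but no non-word characters.
my $bare = 'test';
print "bare \\b:  ", ( $bare =~ /\btest\b/     ? 'match' : 'no match' ), "\n";
print "bare \\W+: ", ( $bare =~ /\W+test\W+/ ? 'match' : 'no match' ), "\n";
```

The \b version captures only 'test', while the \W+ version captures the whole '--*test--'. On a bare 'test', only the \b version matches, since \W+ insists on at least one real character on each side.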

I'm still a little foggy on what you're saying in your followup question; it redefines the problem to a degree, and actually has unresolvable conflicts within its own assertions.

I really think that you would benefit from having a look at the appropriate perldocs: perlrequick, perlretut, perlre, and the FAQ on regular expressions, perlfaq6. If you have Perl, you have those documents. I know it looks like a lot of reading, but the time I've taken in trying to compose a conscientious answer to your question is about equal to the time it would take you to read a couple of those documents in their entirety yourself. You can appreciate my frustration when, after I put together a thorough and complete answer yesterday, your follow-up question today changes everything and is still ambiguous, conflicted, and vague. Why did I bother in the first place if you're not going to do a little homework yourself?

Dave

"If I had my life to do over again, I'd be a plumber." -- Albert Einstein


In reply to Re: Re: Re: regex question by davido
in thread regex question by Anonymous Monk
