Negative lookahead did not work as expected. Let's focus on:
("gugus" =~ /gu(?!gu)/) ==> <gugus> found ("gu", "gu", "s")
and
("gugis" =~ /gu(?!gi)/) ==> <gugis> nomatch (undef, undef, undef)
do not behave the same way.

Look at it this way:

  1. "gugus" =~ /gu(?!gu)/

    1. Where does /gu/ match in "gugus"?
      "gugus"           "gugus"
       ^^ here   and       ^^ here
    2. What is the string following the match?
      "gugus"           "gugus"
       ^^ "gus"  and       ^^ "s"
    3. Does that string match /^gu/ (negative lookahead)?
      "gugus"                           "gugus"
       ^^ "gus" yes!                       ^^ "s" no!
       => match fails, keep looking        => match succeeds
    4. Overall, the match succeeds at the second "gu" in "gugus" (prematch "gu", match "gu", postmatch "s").

      <update2> I realized the above might be a little misleading: The regex engine always works from left to right, so it does not execute the steps in the order I've shown here. It first fully inspects the leftmost match before continuing on to find the second "gu" in the string. </update2>

  2. "gugis" =~ /gu(?!gi)/

    1. Where does /gu/ match in "gugis"?
      "gugis"
       ^^ here
    2. What is the string following the match?
      "gugis"
       ^^ "gis"
    3. Does that string match /^gi/ (negative lookahead)?
      "gugis"
       ^^ "gis" yes! => match fails
    4. Overall, the match fails.
  3. "start" =~ /(?!start)/

    1. Where does // match in "start"? (whitespace added to show the "zero-length strings" around each character, as in ""."s".""."t".""."a".""."r".""."t"."")
       v v v v v v  (everywhere)
      " s t a r t "
    2. What is the string following the match?
       "start"
       | "tart"
       | | "art"
       | | | "rt"
       | | | | "t"
       | | | | | ""
       v v v v v v
      " s t a r t "
    3. Does that string match /^start/ (negative lookahead)?
       "start" yes => fails
       | "tart" no => succeeds
       | | "art" no => succeeds
       | | | "rt" no => succeeds
       | | | | "t" no => succeeds
       | | | | | "" no => succeeds
       v v v v v v
      " s t a r t "
    4. Since the regex engine goes from left to right, and stops at the first place where it succeeds, overall, the match succeeds at the zero-length string between "s" and "t" (prematch "s", match "", postmatch "tart").
    5. Side note: If you make the regex engine continue matching where it last left off with /g, you can see the whole thing in action:

      $ perl -wMstrict -MData::Dump
      while ( "start" =~ /(?!start)/pg ) {
          dd ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH};
      }
      __END__
      ("s", "", "tart")
      ("st", "", "art")
      ("sta", "", "rt")
      ("star", "", "t")
      ("start", "", "")
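
The three walkthroughs above can also be sketched in code. This is only a rough illustration of the steps, not how the regex engine is actually implemented: first_match_by_hand is a helper name made up for this example, and it assumes the part of the regex before the lookahead needs no backtracking (true for /gu/ and //, not true in general). It tries each offset from left to right, checks whether the "head" matches there, takes the string following the match, and rejects the offset if that string matches the lookahead pattern anchored with ^. The result is then compared against what the real regex reports in $-[0].

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): walk the steps by hand.
#   $str  - string to search
#   $head - compiled regex for the part before the lookahead
#   $neg  - compiled regex for the inside of the negative lookahead
# Returns the offset of the leftmost position where $head matches and the
# rest of the string does NOT start with $neg; returns undef if none.
sub first_match_by_hand {
    my ($str, $head, $neg) = @_;
    for my $start (0 .. length $str) {
        # Step 1: does $head match at exactly this offset?
        pos($str) = $start;
        next unless $str =~ /\G$head/gc;
        # Step 2: the string following the match
        my $rest = substr $str, pos($str);
        # Step 3: the negative lookahead fails if $rest matches /^$neg/
        next if $rest =~ /^$neg/;
        # Step 4: leftmost offset that survives wins
        return $start;
    }
    return;
}

for my $case ( [ "gugus", qr/gu/, qr/gu/ ],
               [ "gugis", qr/gu/, qr/gi/ ],
               [ "start", qr//,   qr/start/ ] ) {
    my ($str, $head, $neg) = @$case;
    my $by_hand = first_match_by_hand($str, $head, $neg);
    # $-[0] holds the start offset of the last successful match
    my $engine  = $str =~ /$head(?!$neg)/ ? $-[0] : undef;
    printf "%-7s by hand: %-5s engine: %s\n",
        $str, $by_hand // 'none', $engine // 'none';
}
```

Both columns should agree: offset 2 for "gugus", no match for "gugis", and offset 1 for "start".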

If you could give some more complete examples of actual things you're trying to match and the actual regexes, that would probably help.

Minor edits for clarification.


In reply to Re^3: regex negativ lookahead by haukex
in thread regex negativ lookahead by SwissGeorge
