in reply to Regex: Ignore \n in \S

You don't show the code you use, so it's hard to say how your regular expression "returns" something.

Maybe you want to use ([^\n]+) instead of the (\S+)? But \n would be in the \s class, so your \S shouldn't capture a \n. Please show the actual code you use.

Re^2: Regex: Ignore \n in \S
by manav_gupta (Acolyte) on Sep 12, 2010 at 20:45 UTC
    Apologies. I meant to convey that $1 would "return" "\n" as well, instead of the non-spaces before it.
    The regex I'd be using is DIP\s+\S+\s+(\S+).*?\\n

      Ah. Now I understand. Your data contains a literal "\n" (a backslash followed by "n"), not a newline, and you don't want to capture it. The easiest way is to exclude the backslash from the character class:

      for (<DATA>) {
          print "Matched: $1\n"
              if /DIP\s+\S+\s+([^\\\s]+).*?\\n/;
      }
      __DATA__
      *** ALARM 009 A2/APT \"KEN5-132/019/00\" 100831 1511 \nSWITCHING NETWORK TERMINAL FAULT\n\nSNT TCASE STATE FCODE SUBSNT INFO DIP amad theoneiwant\n RTDMA-63 4 BLOC 38\n\nEXTERNAL EQUIPMENT FAILURE\n\nEXTP MG\n2-2-010109 MEAPH\nEND
      *** ALARM 009 A2/APT \"KEN5-132/019/00\" 100831 1511 \nSWITCHING NETWORK TERMINAL FAULT\n\nSNT TCASE STATE FCODE SUBSNT INFO DIP amad theoneiwant \n RTDMA-63 4 BLOC 38\n\nEXTERNAL EQUIPMENT FAILURE\n\nEXTP MG\n2-2-010109 MEAPH\nEND
      *** ALARM 009 A2/APT \"KEN5-132/019/00\" 100831 1511 \nSWITCHING NETWORK TERMINAL FAULT\n\nSNT TCASE STATE FCODE SUBSNT INFO DIP amad theoneiwant hello world \n RTDMA-63 4 BLOC 38\n\nEXTERNAL EQUIPMENT FAILURE\n\nEXTP MG\n2-2-010109 MEAPH\nEND
      *** ALARM 009 A2/APT \"KEN5-132/019/00\" 100831 1511 \nSWITCHING NETWORK TERMINAL FAULT\n\nSNT TCASE STATE FCODE SUBSNT INFO DIP amad theoneiwant\n RTDMA-63 4 BLOC 38\n\nEXTERNAL EQUIPMENT FAILURE\n\nEXTP MG\n2-2-010109 MEAPH\nEND

      The character class \S will not match any whitespace, but the character sequence "\n" (that is, a backslash followed by "n") is not whitespace, so \S+ matches it and it ends up in $1.
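To see the difference between the two-character sequence and a real newline, here is a minimal sketch. The helper name capture_after_dip and the shortened sample strings are made up for illustration; only the pattern comes from the thread. It relies on Perl's single-quoted strings leaving \n as two ordinary characters, while double-quoted strings turn it into a newline.

```perl
use strict;
use warnings;

# Hypothetical helper: return what (\S+) captures after "DIP <word> ".
sub capture_after_dip {
    my ($text) = @_;
    return $text =~ /DIP\s+\S+\s+(\S+)/ ? $1 : undef;
}

# Single quotes: backslash and "n" stay two ordinary, non-whitespace characters.
my $literal = 'DIP amad theoneiwant\n RTDMA-63';
# Double quotes: "\n" becomes one real newline, which is whitespace.
my $newline = "DIP amad theoneiwant\n RTDMA-63";

for my $text ($literal, $newline) {
    my $got = capture_after_dip($text);
    printf "captured %d characters: [%s]\n", length $got, $got;
}
# captured 13 characters: [theoneiwant\n]
# captured 11 characters: [theoneiwant]
```

In the first case \S+ runs on through the backslash and the "n" and only stops at the following space; in the second it stops at the newline.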

        A better way would be to decode the lines, i.e. convert the "\n" into newlines.
        while (<DATA>) {
            s/\\n/\n/g;
            print "Matched: $1\n" if /DIP\s+\S+\s+(\S+).*?\n/;
        }
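A self-contained sketch of the decode-first approach, for anyone who wants to run it without a __DATA__ section. The helper name second_field_after_dip and the two abbreviated sample lines are assumptions for illustration; the substitution and the pattern are the ones from the snippet above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn each literal backslash-n into a real newline,
# then capture the second field after DIP with plain \S+.
sub second_field_after_dip {
    my ($line) = @_;
    (my $decoded = $line) =~ s/\\n/\n/g;   # work on a copy, leave $line alone
    return $decoded =~ /DIP\s+\S+\s+(\S+).*?\n/ ? $1 : undef;
}

# Abbreviated versions of the alarm records; single quotes keep \n literal.
my @samples = (
    'INFO DIP amad theoneiwant\n RTDMA-63 4 BLOC 38\nEND',
    'INFO DIP amad theoneiwant hello world \n RTDMA-63 4 BLOC 38\nEND',
);

print "Matched: ", second_field_after_dip($_), "\n" for @samples;
```

Once the data holds real newlines, the original (\S+) needs no special character class, because a newline is whitespace and ends the capture by itself.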
        Holy cow. Thank you. I've spent over 3 hours trying to tackle it! Thank you!