in reply to Regex Question

Hi,

Could this be what you want?

# cat re1
use warnings;
use strict;

# Slurp mode: with $/ undefined, <DATA> returns everything as one string.
undef $/;
my $tout = <DATA>;
my $z;

# One record: from DES or TN up to a DATE or ZONE line that is followed by a blank line.
while ($tout =~ /((?:DES|TN).+?(DATE|ZONE)[^\n]+\n\s*\n)/sg) {
    print "\n\n", ++$z, ":\n", $1;
}
__DATA__
DES MAIL
TN 001 0 02 00
TYPE SL1
CDEN DD
CUST 0
KLS 1
FDN
TGAR 0
LDN NO
NCOS 0
09
DATE 9 MAR 2000

TN 001 0 02 01
05 RLS
06 TRN
07 AO3
08
09
ZONE 002

TN 001 0 02 01
05 RLS
ZONE 001
07 AO3
08
09
DATE 9 MAR 2000

Output:

# perl -w ./re1

1:
DES MAIL
TN 001 0 02 00
TYPE SL1
CDEN DD
CUST 0
KLS 1
FDN
TGAR 0
LDN NO
NCOS 0
09
DATE 9 MAR 2000

2:
TN 001 0 02 01
05 RLS
06 TRN
07 AO3
08
09
ZONE 002

3:
TN 001 0 02 01
05 RLS
ZONE 001
07 AO3
08
09
DATE 9 MAR 2000

Re: Re: Regex Question
by set_uk (Pilgrim) on Jun 03, 2003 at 12:14 UTC
    That is exactly what I wanted. Thanks. I'm not sure I understand why it works, though. Why does [^\n] match all the data after DATE and ZONE up to the end of the line? Doesn't it just mean "match a beginning of line and a newline"?
      Hi,

      Glad I could help.

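      The [^\n] is a negated character class. Inside square brackets a leading ^ negates the class; it is not the start-of-line anchor. So here it matches any single character that isn't a newline. Undefining $/ only makes <DATA> return the whole input as one string; the newline characters are still in it.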
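
      To see the difference between ^ inside and outside of brackets, here is a minimal sketch. The sample string and the variable names are made up for illustration; they are not taken from your data file.

```perl
use strict;
use warnings;

# Two lines of sample text (made up for this example).
my $text = "DATE 9 MAR 2000\nTN 001 0 02 01\n";

# [^\n]+ : one or more characters that are NOT a newline,
# so the capture stops at the end of the DATE line.
my ($rest_of_line) = $text =~ /DATE([^\n]+)/;
printf "negated class: <%s>\n", $rest_of_line;   # negated class: < 9 MAR 2000>

# Outside of brackets, ^ is an anchor; with /m it matches at the start of every line.
my @starts = $text =~ /^(\w+)/mg;
print "anchored words: @starts\n";               # anchored words: DATE TN
```

      The capture in the first match never crosses into the TN line, because the newline is the one character the class refuses to match.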

      The regex [^\n]+\n matches one or more non-newline characters followed by a newline, i.e. the rest of the DATE or ZONE line including its line ending.

      After this there can be zero or more whitespace characters (\s*) followed by another newline. This is your blank line.
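
      Putting the pieces together, the same pattern can be written with the /x modifier so that each part carries its own comment. This is only a sketch: the two shortened records and the names $record and @records are assumptions for the example, not your real data.

```perl
use strict;
use warnings;

# Two shortened sample records, each followed by a blank line.
my $text = "TN 001\n05 RLS\nZONE 002\n\n"
         . "TN 002\nZONE 001\n07 AO3\nDATE 9 MAR 2000\n\n";

# Same pattern as in the script above, spread out with /x.
my $record = qr{
    (                 # capture group 1: the whole record
      (?:DES|TN)      # a record starts with DES or TN
      .+?             # as little as possible; with /s the dot also matches newlines
      (DATE|ZONE)     # keyword on the closing line
      [^\n]+ \n       # rest of that line: non-newlines, then the newline
      \s* \n          # only optional whitespace, then a newline: the blank line
    )
}xs;

my @records;
while ($text =~ /$record/g) {
    push @records, $1;
}
printf "%d records\n", scalar @records;   # 2 records
```

      The second record shows why the blank line matters: ZONE 001 is followed directly by another data line, so the match cannot end there and carries on until the DATE line that has a blank line after it.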

      greetings, tos