TrekNoid has asked for the wisdom of the Perl Monks concerning the following question:

I have a feeling that I'm overlooking something obvious, but I've been away from Perl for about a year, and I'm having trouble getting the code below to work.

I'm trying to read through a system log file, and pull out the users that have entered orders on a patient. The way to tell that, however, is a two-line process...

I need to find the line that has a segment beginning with 'UB' (user identifier) directly following a pipe... but only when it's following a line (and not necessarily the next line) that has 'MSG-HDR' in it.

Here's what I've got so far:

$orderfound = 0; # Flag used to keep track of Orders being found
while (<>) {
    $line = $_;
    if ($line=~/MSG-HDR/) {
        $orderfound = 1;
    }
    if (($orderfound) && ($line=~/|UB/)) {
        print "$line \n";
        $orderfound = 0;
    }
}

And here's a file snippet... (names changed, obviously)

*-X1AA1 ENTERED* C 6 18 374,032906 P 090003 090003 23-027 D,0
+ 043800 VORHEES, JASON
*-X1AA1 ENTERED* C 6 18 374,032906 P 090003 090003 43-044 D,0
+ 045800 MYERS, MICHAEL
*-X1AA1 ENTERED* C 6 18 373,032906 P 090003 090003 21-032 D,0
+ 047000 KREUGER, FREDDY
0* * * * * * * * * * * * * * * * * * * * * * * * * * 36-060 090003
+ *MSG-HDR D8020DE9 00000056 143CA421 007C0000 0329A008 84090003
+ <01 |U E9C160 ||UBZ 0000 VORHEES, JASON RPH SJP ||UD 000 +0. M8||QA# 0000 ||QA= 11FF ||QB= 1130 Q0V| <02 |QB= 00C8 QA3|
+ <03 * * * * * * * * * * * * * * * * * * * * * * * * * *
+ *-Q1AA1 ENTERED* C 6 18 371,032906 P 090003 090003 36-060 D,0
+ 046800 VORHEES, JASON

What's happening right now is that I'm printing out the line that has 'MSG-HDR' in it, but I need it to print the line with 'UBZ 0000 VORHEES, JASON' in it.

I'm sure I'm missing something obvious... I just don't see it...

Re: Matching over two lines
by ikegami (Patriarch) on Mar 30, 2006 at 17:00 UTC

    Three problems:

    • "|" is a special character in regexps. It needs to be escaped.

    • You will look for "MSG-HDR" and "|UB" on the same line. The following would match:

      MSG-HDR |UB

      The next statement in the code below fixes this.

    • $orderfound is not always cleared when it should be. The following would match:

      MSG-HDR bla |UB

      Or maybe that's ok? If it's ok, keep the $orderfound = 0; where it was.
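To illustrate the first point: an unescaped "|" is the alternation operator, so /|UB/ means "the empty pattern OR UB", and the empty branch matches any string at all. That would explain why the MSG-HDR line itself gets printed. The snippet below is only a rough sketch; the sub name classify and the two sample lines are made up for the demonstration (the lines are shortened from the log snippet in the question).

```perl
use strict;
use warnings;

# Hypothetical helper: reports whether a line matches the unescaped
# and the escaped form of the pattern.
sub classify {
    my ($line) = @_;
    # /|UB/ is an alternation between an empty branch and "UB";
    # the empty branch succeeds on every string.
    my $unescaped = ($line =~ /|UB/)  ? 1 : 0;
    # /\|UB/ requires a literal pipe immediately followed by "UB".
    my $escaped   = ($line =~ /\|UB/) ? 1 : 0;
    return ($unescaped, $escaped);
}

# Sample lines shortened from the log snippet in the question.
for my $line ('*MSG-HDR D8020DE9 00000056',
              '<01 |U E9C160 ||UBZ 0000 VORHEES, JASON') {
    my ($unescaped, $escaped) = classify($line);
    print "unescaped=$unescaped escaped=$escaped  $line\n";
}
```

With the unescaped pattern both lines match; with the escaped one only the line containing "||UBZ" does.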

    Fix:

    my $orderfound = 0;
    while (my $line = <>) {
        if ($line =~ /MSG-HDR/) {
            $orderfound = 1;
            next;
        }
        if ($orderfound && $line =~ /\|UB/) {
            print "$line\n";
        }
        $orderfound = 0;
    }
      Escaping the pipe did the trick

      The reason I'm clearing $orderfound where I am is because the UB line could be 1 or more lines later than the MSG-HDR line, so I don't want $orderfound to clear until I've matched that line.

      Thanks much...
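
For reference, the variant described in the follow-up above (escaped pipe, with the flag cleared only once the UB line has been found) might look roughly like the sketch below. It is an assumption pieced together from the thread, not code that was posted: the sub name ub_lines_after_header and the sample data are hypothetical, and the data is passed in as a list rather than read with <> so the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first |UB line that follows each
# MSG-HDR line, however many lines later it appears.
sub ub_lines_after_header {
    my @lines = @_;
    my @found;
    my $orderfound = 0;    # set when a MSG-HDR line has been seen
    for my $line (@lines) {
        if ($line =~ /MSG-HDR/) {
            $orderfound = 1;
            next;          # never test the header line itself for |UB
        }
        if ($orderfound && $line =~ /\|UB/) {
            push @found, $line;
            $orderfound = 0;    # cleared only after the UB line is found
        }
    }
    return @found;
}

# Made-up sample data, loosely following the log snippet in the question.
my @sample = (
    "*MSG-HDR D8020DE9 00000056\n",
    "<00 filler line\n",
    "<01 |U E9C160 ||UBZ 0000 VORHEES, JASON\n",
    "<02 ||UBZ 0000 MYERS, MICHAEL\n",    # skipped: flag already cleared
);
print ub_lines_after_header(@sample);
```

Since each line still carries its own newline, printing it without an extra "\n" avoids the blank line after every match.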