Thank you, hbm, for your great tips. Here is the code, along with the output that I get for a sample record. My question was: how can I eliminate the second phrase ("at the time") from the output, since there is no mapping available for that phrase?

#!/usr/bin/perl
use strict;
use warnings;

my $file = "regular.txt.out";
open my $fh, "<", $file or die "Unable to open $file: $!";

my ($u_value, $p_value, $mc_value) = (undef) x 3;

while (my $line = <$fh>) {
    chomp $line;
    if ($line =~ /\n\n\n/) {
        ($u_value, $p_value, $mc_value) = (undef) x 3;
        print "\n";
    }
    elsif ($line =~ /\bProcessing\s/) {
        $line =~ s/\bProcessing\s\d+\.tx\.\d+: //;
        $u_value = $line;
        print "\n$u_value\n";
        undef $p_value;
    }
    elsif ($line =~ /\bPhrase/) {
        $line =~ s/\bPhrase: //;
        $line =~ s/\"//g;
        if ($p_value) { print "\n" . ' ' x length $u_value; }
        $p_value = $line;
        print "\t$p_value";
        undef $mc_value;
    }
    elsif ($line =~ /\s\s/) {
        if ($mc_value) { print "\n\t" . ' ' x length $p_value; }
        $mc_value = $line;
        print "\t$mc_value";
    }
    else {
    }
}
close $fh;

Sample text to process:

Processing 00000000.tx.3: Pulmonary embolism at the time of hip replacement

Phrase: "Pulmonary embolism"
Meta Mapping (1000)
(\s\s)1000 D0076131:PULMONARY EMBOLISM Disease or Syndrome

Phrase: "at the time"
Meta Candidates (0): <none>
Meta Mappings: <none>

Phrase: "of hip replacement"
Meta Mapping (1000)
(\s\s)1000 D0554893:HIP REPLACEMENT (STATUS POST HIP REPLACEMENT) Finding

The output for the processed block is:

Pulmonary embolism at the time of hip replacement
<tab>Pulmonary embolism<tab>1000 D0076131:PULMONARY EMBOLISM Disease or Syndrome
<tab>at the time
<tab>of hip replacement<tab>1000 D0554893:HIP REPLACEMENT (STATUS POST HIP REPLACEMENT) Finding
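
One possible way to drop phrases like "at the time" is to hold each phrase back and only print it once a mapping line for it actually shows up. The following is just a rough sketch of that idea, not a tested replacement for the script above. It assumes that the "(\s\s)" in the sample stands for two literal leading spaces on each mapping line, and the sub name format_mapped_phrases is made up for this illustration. It also ends each row with a newline instead of starting the next row with one, which differs slightly from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): takes the raw lines
# of the tool output and returns the formatted text, leaving out every phrase
# that has no mapping line.
sub format_mapped_phrases {
    my @lines  = @_;
    my $out    = '';
    my $phrase;        # most recent phrase, held back until a mapping appears
    my $mapped = 0;    # number of mapping lines already emitted for $phrase

    for my $line (@lines) {
        chomp $line;
        if ($line =~ s/^Processing\s\d+\.tx\.\d+: //) {
            # New utterance: print it and forget any pending phrase.
            $out .= "\n$line\n";
            undef $phrase;
        }
        elsif ($line =~ s/^Phrase: //) {
            # Remember the phrase, but do not print anything yet.
            ($phrase = $line) =~ tr/"//d;
            $mapped = 0;
        }
        elsif (defined $phrase && $line =~ /^\s{2,}(\S.*)$/) {
            # Assumption: a mapping line starts with at least two spaces.
            # The phrase is printed with its first mapping only; further
            # mappings for the same phrase are padded to line up under it.
            my $label = $mapped++ ? ' ' x length($phrase) : $phrase;
            $out .= "\t$label\t$1\n";
        }
        # Everything else ("Meta Mapping (1000)", "<none>" lines, blank
        # lines) is ignored.
    }
    return $out;
}

# The sample record from this post, with two real spaces before each mapping.
my @sample = split /^/, <<'END';
Processing 00000000.tx.3: Pulmonary embolism at the time of hip replacement

Phrase: "Pulmonary embolism"
Meta Mapping (1000)
  1000 D0076131:PULMONARY EMBOLISM Disease or Syndrome

Phrase: "at the time"
Meta Candidates (0): <none>
Meta Mappings: <none>

Phrase: "of hip replacement"
Meta Mapping (1000)
  1000 D0554893:HIP REPLACEMENT (STATUS POST HIP REPLACEMENT) Finding
END

print format_mapped_phrases(@sample);
```

Because the phrase is only written together with a mapping, a "Phrase:" line followed by "<none>" never reaches the output. To run it against the real file, the lines read from regular.txt.out could be passed in instead of @sample.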

In reply to Re^2: tab delimited extraction, formatting the output by zzgulu
in thread tab delimited extraction, formatting the output by zzgulu
