srsahu75 has asked for the wisdom of the Perl Monks concerning the following question:

Dear Friends, I am a beginner in Perl and am trying to find the problem in a script. Kindly help me modify it. The script does not output the last field and the text that follows it (LA: Language English). The input file and script are as follows:
Input file:

Thu Mar 19 2:34:14 EDT 2009
STC Data query:
Record 1 of 1
DN: Data Name
    Information Sciences
TI: Title
    Distribution and abundance of bound volumes of the file in institute shelves from 1978
AU: Author
    Sahu, SR; Gawas, AKG
AF: Affiliation
    Inst. of document scanning and documentation profiling at library, toward the prograssive society and establishment
SO: Source
    Test document file. Vol. 5, no. 1, pp. 1-14. Jul 1990.
DE: Descriptors
    Article Subject Terms: Abundance; Ecological distribution; Geographical distribution; Life cycle; Zooplankton; Article Taxonomic Terms: Euphausia; Article Geographic Terms: ISW,
LA: Language
    English

Perl Script:

#!/usr/bin/perl -w
my $found_f = 0;
my ($lastline, $line);
open (INPUTFILE, "test.txt");
while (<INPUTFILE>) {
chomp;
if (/^[A-Z]{2}:\s/) {
$lastline = $line;
$line = $_;
$found_f = 1;
} elsif (! /^[A-Z]{2}:\s/ && $found_f) {
s/^ {4}/ /;
$line .= $_;
next;
} elsif (/^$/) {
$lastline = ' ';
$found_f =0;
}
print "$lastline\n";
Kindly help me modify the script so that I can proceed further.

Replies are listed 'Best First'.
Re: Need help to make correction in a perl script
by ig (Vicar) on Mar 20, 2009 at 10:25 UTC

    I have used perltidy to improve the layout of your script, added a missing curly bracket at the end and added strict and warnings, which you should always do yourself, unless you have a very good reason not to. You can read more about these options in Use strict and warnings.

    Here is your modified code:

    #!/usr/bin/perl -w
    use strict;
    use warnings;

    my $found_f = 0;
    my ( $lastline, $line );

    open( INPUTFILE, "test.txt" );

    while (<INPUTFILE>) {
        chomp;
        if (/^[A-Z]{2}:\s/) {
            $lastline = $line;
            $line     = $_;
            $found_f  = 1;
        }
        elsif ( !/^[A-Z]{2}:\s/ && $found_f ) {
            s/^ {4}/ /;
            $line .= $_;
            next;
        }
        elsif (/^$/) {
            $lastline = ' ';
            $found_f  = 0;
        }
        print "$lastline\n";
    }

    To understand why the LA field is not being printed, you need to think about what happens in the last iteration through your loop. When the last line of your data is read, which branch of your if statement is taken? What happens in that branch? After reading the last line of the file, is your loop block executed again? If not, how can the last value of $lastline be printed?
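
    One way to work through those questions is to trace which branch each line takes. The sketch below is illustrative only: trace_branches is a hypothetical helper, not part of your script, and the two sample lines assume that the file ends with "LA: Language" followed by an indented "English" line.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: mirrors the if/elsif chain of the script above and
    # records which branch each input line would take.
    sub trace_branches {
        my @lines   = @_;
        my $found_f = 0;
        my @taken;
        for my $row (@lines) {
            chomp $row;
            if ( $row =~ /^[A-Z]{2}:\s/ ) {
                $found_f = 1;
                push @taken, 'tag';             # falls through to the print
            }
            elsif ($found_f) {
                push @taken, 'continuation';    # appended to $line, then "next"
            }
            elsif ( $row eq '' ) {
                $found_f = 0;
                push @taken, 'blank';           # falls through to the print
            }
            else {
                push @taken, 'other';           # falls through to the print
            }
        }
        return @taken;
    }

    # Assumed last two lines of the input file.
    my @branches = trace_branches( "LA: Language\n", "    English\n" );
    print "$_\n" for @branches;
    ```

    In this trace the final line takes the 'continuation' branch, which ends with next, so the print at the bottom of the loop is skipped for it.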

      Thanks for the attempt and the suggestions. I understand that because the last input line in $_ does not match ^[A-Z]{2}:\s, the last field is not picked up, but I cannot see how to fix that. I need the output like this:
      DN: Data Name Aquatic Sciences
      TI: Title Distribution and abundance of bound volumes of the file in institute 1978
      AU: Author Sahu, SR; Gawas, AKG
      AF: Affiliation Inst. of document scanning and documentation profiling at library, toward the prograssive society and establishment
      SO: Source Test document file. Vol. 5, no. 1, pp. 1-14. Jul 1990.
      DE: Descriptors Article Subject Terms: Abundance; Ecological distribution; Geographical distribution; Life cycle; Zooplankton; Article Taxonomic Terms: Euphausia; Article Geographic Terms: ISW,
      LA: Language English

        Like so many things, the solution is easy to understand once you know it but difficult to see when you don't know where to look.

        Your program reads the last two lines and copies them into your variable $line but it exits without ever printing the contents of this variable. All you have to do is print the contents of this variable after reading the last line of the file.

        Your loop terminates after reading the last line of the file. Therefore, to print the contents of $line after reading the last line of the file, you need to add a print statement after your loop.
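
        As a minimal sketch of that pattern (not your script itself): join_fields below is a hypothetical helper that collects each field into a list instead of printing it, and the sample lines assume the 4-space continuation indent that your substitution implies. The part that matters is the push after the loop, which plays the role of the extra print.

        ```perl
        use strict;
        use warnings;

        # Hypothetical helper: joins each tagged line (e.g. "LA: Language") with
        # the continuation lines that follow it and returns one string per field.
        sub join_fields {
            my @lines = @_;
            my @records;
            my $current;
            for my $row (@lines) {
                chomp $row;
                if ( $row =~ /^[A-Z]{2}:\s/ ) {
                    # A new field starts: save the one collected so far.
                    push @records, $current if defined $current;
                    $current = $row;
                }
                elsif ( defined $current ) {
                    $row =~ s/^\s+/ /;    # collapse the leading indent to one space
                    $current .= $row;
                }
            }
            # The loop has ended, but the last field is still pending.
            push @records, $current if defined $current;
            return @records;
        }

        my @sample = (
            "DN: Data Name\n", "    Information Sciences\n",
            "LA: Language\n",  "    English\n",
        );
        print "$_\n" for join_fields(@sample);
        ```

        Without the push after the loop, the sub would return only the DN field, which is the same symptom you are seeing with LA.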

        Your program is otherwise a bit more complex than it needs to be. Here is a simplified version, with a final print added, to print that troublesome last line.

        #!/usr/bin/perl -w
        use strict;
        use warnings;

        my $line;

        while (<DATA>) {
            chomp;
            if (/^$/) {
                print "\n";
                $line = "";
            }
            elsif (/^[A-Z]{2}:\s/) {
                print "$line\n" if($line);
                $line = $_;
            }
            else {
                $line .= " $_";
            }
        }
        print "$line\n" if($line);

        __DATA__
        DN: Data Name
            Information Sciences
        TI: Title
            Distribution and abundance of bound volumes of the file in institute shelves from 1978
        AU: Author
            Sahu, SR; Gawas, AKG
        AF: Affiliation
            Inst. of document scanning and documentation profiling at library, toward the prograssive society and establishment
        SO: Source
            Test document file. Vol. 5, no. 1, pp. 1-14. Jul 1990.
        DE: Descriptors
            Article Subject Terms: Abundance; Ecological distribution; Geographical distribution; Life cycle; Zooplankton; Article Taxonomic Terms: Euphausia; Article Geographic Terms: ISW,
        LA: Language
            English
        What is not coming? Is there a blank line at the end of the file that is not printing or are you expecting a value of some kind to print at the end when the file has been output?
Re: Need help to make correction in a perl script
by cdarke (Prior) on Mar 20, 2009 at 10:09 UTC
    Here is mine:
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $found_f = 0;
    my ($lastline, $line);

    open (INPUTFILE, "test.txt") or die "Unable to open test.txt: $!";   # 'or die' Added

    while (<INPUTFILE>) {
        chomp;
        if (/^[A-Z]{2}:\s/) {
            $lastline = $line;
            $line = $_;
            $found_f = 1;
        }
        elsif ($found_f) {
            s/^ {4}/ /;
            $line .= $_;
        }
        elsif (/^$/) {
            $lastline = ' ';
            $found_f =0;
        }
    }   # Added

    $lastline .= $line;   # Added - this was your main problem
    print "$lastline\n";

    close (INPUTFILE);   # Added
    Notice how much easier it is to read when you indent if and while statements? No wonder you missed the final }.
    Update: Simplified second test (unnecessary RE) and removed next, which is also unnecessary.
      Thanks for the attempt, but I require the output in this format. In my present script the last line is not printed.
      DN: Data Name Aquatic Sciences
      TI: Title Distribution and abundance of bound volumes of the file in institute 1978
      AU: Author Sahu, SR; Gawas, AKG
      AF: Affiliation Inst. of document scanning and documentation profiling at library, toward the prograssive society and establishment
      SO: Source Test document file. Vol. 5, no. 1, pp. 1-14. Jul 1990.
      DE: Descriptors Article Subject Terms: Abundance; Ecological distribution; Geographical distribution; Life cycle; Zooplankton; Article Taxonomic Terms: Euphausia; Article Geographic Terms: ISW,
      LA: Language English
Re: Need help to make correction in a perl script
by missingthepoint (Friar) on Mar 20, 2009 at 10:16 UTC

    How about this? (works for me with test data)

    use strict;
    use warnings;

    my %code_to_full;
    my %field_by_code;

    open my $f, '<', 'test.txt' or die "open: $!";

    my $last_code;
    while (<$f>) {
        my ($code, $full) = /^([A-Z]{2}):\s+(.+)/i;
        if ($code && $full) {
            $code_to_full{$code} = $full;
            $field_by_code{$code} = '';
            $last_code = $code;
        }
        else {
            unless ($last_code) {
                warn "No code and last_code blank; this shouldn't happen";
            }
            $field_by_code{$last_code} .= $_;
        }
    }

    for (sort keys %code_to_full) {
        print "Code: $_\n";
        print "Full: $code_to_full{$_}\n";
        print "Field data: $field_by_code{$_}\n\n";
    }

    Things I'd like to point out:

    • use of strict
    • lexical filehandles
    • error checking on system call (open)
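
    Those three points can be seen in isolation in the short sketch below. It is only an illustration: read_lines is a hypothetical helper, and the temporary file (created with the core File::Temp module) stands in for test.txt so the example runs on its own.

    ```perl
    use strict;      # catches undeclared variables at compile time
    use warnings;
    use File::Temp qw(tempfile);

    # Hypothetical helper: returns the chomped lines of the file at $path.
    sub read_lines {
        my ($path) = @_;
        # Lexical filehandle ($fh) and three-argument open, checked with "or die".
        open my $fh, '<', $path or die "Cannot open $path: $!";
        chomp( my @lines = <$fh> );
        close $fh;
        return @lines;
    }

    # Demonstration on a temporary file so the sketch does not depend on test.txt.
    my ( $tmp_fh, $tmp_name ) = tempfile( UNLINK => 1 );
    print {$tmp_fh} "LA: Language\n    English\n";
    close $tmp_fh or die "Cannot close $tmp_name: $!";

    print "$_\n" for read_lines($tmp_name);
    ```

    Because $fh is a lexical variable, it cannot clash with a filehandle of the same name elsewhere in the program, and a failed open stops the script with the system error message instead of silently reading nothing.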

    "Half of all adults in the United States say they have registered as an organ donor, although only some have purchased a motorcycle to show that they're really serious about it."
Re: Need help to make correction in a perl script
by bichonfrise74 (Vicar) on Mar 20, 2009 at 16:16 UTC
    Is this what you want?
    #!/usr/bin/perl
    use strict;

    while( <DATA> ) {
        if ( /^[A-Z]{2}:\s/ ... /\n\n/ ) {
            print;
        }
    }

    __DATA__
    Thu Mar 19 2:34:14 EDT 2009
    STC Data query:
    Record 1 of 1
    DN: Data Name
        Information Sciences
    TI: Title
        Distribution and abundance of bound volumes of the file in institute shelves from 1978
    AU: Author
        Sahu, SR; Gawas, AKG
    AF: Affiliation
        Inst. of document scanning and documentation profiling at library, toward the prograssive society and establishment
    SO: Source
        Test document file. Vol. 5, no. 1, pp. 1-14. Jul 1990.
    DE: Descriptors
        Article Subject Terms: Abundance; Ecological distribution; Geographical distribution; Life cycle; Zooplankton; Article Taxonomic Terms: Euphausia; Article Geographic Terms: ISW,
    LA: Language
        English