in reply to tab delimited extraction, formatting the output

First and foremost, you'll probably make your life easier in the long run if you open your code with use strict; use warnings; - it'll catch a host of accidental mistakes. If you are working with tab-delimited fields, it's probably easier to use an existing parser like Text::CSV than to roll your own. Since the end-of-record marker is much rarer than tabs and newlines, you can test for it first. And if you are concerned with formatting, it's probably easier to use sprintf to get things looking nice. Here's some basic code that (mostly) replicates what you've done, to help you with the CSV prototype:

#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;

my $file = "fielded.txt";
my $csv = Text::CSV->new({sep_char => "\t"}); # create a new object
open my $fh, "<", $file or die "Unable to open $file: $!";

while (my $data_ref = $csv->getline($fh)) {
    my @data = @{$data_ref};
    if ($data[0] eq "'EOU'.") {
        # End of record code
    }
    elsif ($data[2] eq "u") {
        print "\n$data[3]"
    }
    elsif ($data[2] eq "p") {
        print "$data[3]\n"
    }
    else {
        #die "Unexpected line format encountered, $file, @data";
    }
}
close $fh;
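As for the sprintf suggestion, here is a minimal sketch of the idea, not something taken from your data: pad the first column to a fixed width so that the second column lines up. The helper name format_row, the width of 12 and the sample strings are all made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): left-justify
# $label in a field of $width characters, then append $value.
sub format_row {
    my ($label, $value, $width) = @_;
    # In "%-*s" the "*" takes the field width from the argument list,
    # and the "-" left-justifies the string inside that field.
    return sprintf("%-*s %s", $width, $label, $value);
}

# Sample strings are invented; an empty label keeps the column aligned.
print format_row("utterance", "first phrase",  12), "\n";
print format_row("",          "second phrase", 12), "\n";
```

An empty label still takes up the full width, which is one way to keep continuation lines under the right column without counting spaces by hand.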

Update: I should point out you've made great strides since block extraction - congratulations.

Re^2: tab delimited extraction, formatting the output
by zzgulu (Novice) on Feb 09, 2009 at 21:21 UTC
    Thank you, Kenneth, for your great comment and direction. Apparently I didn't have the CSV package, so I learned how to download and install packages too. I am still looking into your script to understand the logic. In the meantime, I added this line at the end, but it messed up the output that I was quite happy with.
    elsif ($data[2] eq "mc") { print join("\t",@data[7,8,9,10,11]) }

    How can I make the output of join appear exactly in front of its related phrase (p)? I also looked at the sprintf link you sent, but couldn't find anything relevant to what I want to do. Thanks again.

      Note that to get that formatting, you need to cache the previous string so that you can determine the indentation.

      #!/usr/bin/perl
      use strict;
      use warnings;
      use Text::CSV;

      # $file must be declared, or the script will not compile under use strict
      my $file = "fielded.txt";
      my $csv = Text::CSV->new({sep_char => "\t"}); # create a new object
      open my $fh, "<", $file or die "Unable to open $file: $!";

      # cached values from the most recent u, p and mc lines
      my($u_value, $p_value, $mc_value) = (undef) x 3;

      while (my $data_ref = $csv->getline($fh)) {
          my @data = @{$data_ref};
          if ($data[0] eq "'EOU'.") {
              # end of record: forget everything cached so far
              ($u_value, $p_value, $mc_value) = (undef) x 3;
              print "\n";
          }
          elsif ($data[2] eq "u") {
              $u_value = $data[3];
              print "\n$u_value";
              undef $p_value;
          }
          elsif ($data[2] eq "p") {
              # not the first p for this u: start a new line, indented past the u text
              if ($p_value) {
                  print "\n" . ' ' x length $u_value;
              }
              $p_value = $data[3];
              print "\t$p_value";
              undef $mc_value;
          }
          elsif ($data[2] eq "mc") {
              # not the first mc for this p: start a new line, indented past the p text
              if ($mc_value) {
                  print "\n" . ' ' x length $p_value;
              }
              $mc_value = join("\t",@data[7 .. 11]);
              print "\t$mc_value";
          }
          else {
              #die "Unexpected line format encountered, $file, @data";
          }
      }
      close $fh;

        Thank you very much, Kenneth. I was looking for that word in the Perl language (cache) too. I read that "undef" tells Perl to remove all formatting indicators (like \n) from the text. So, what does this line do?

        my($u_value, $p_value, $mc_value) = (undef) x 3

        Thanks again for your time and great, great help.
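For reference, a small sketch of what that line does, as far as the Perl documentation goes: the x operator applied to a parenthesised list repeats the list, so (undef) x 3 is the list (undef, undef, undef), and the assignment leaves all three variables without a value. undef does not strip \n or any other formatting; it discards the whole value. The helper state_of and the sample string below are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# (undef) x 3 repeats a one-element list three times,
# so each of the three variables is assigned undef.
my ($u_value, $p_value, $mc_value) = (undef) x 3;

# Hypothetical helper for illustration: say whether a value is defined.
sub state_of {
    return defined $_[0] ? "defined" : "undefined";
}

print "at start:     ", state_of($u_value), "\n";

$u_value = "some text\n";    # sample string, trailing newline included
print "after assign: ", state_of($u_value), "\n";

undef $u_value;              # throws the value away; it does not edit the string
print "after undef:  ", state_of($u_value), "\n";
```

A plain my($u_value, $p_value, $mc_value); would leave the variables undefined as well; writing out (undef) x 3 just makes the reset explicit, and the same expression is reused at the end-of-record marker to clear the cached values.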