BioNrd has asked for the wisdom of the Perl Monks concerning the following question:

Monks,

I would like to loop through a file that is setup like this.

1\t300\t1_a\t30\t1_b\t20 1\t300\t2_a\t22\t2_b\t12 1\t320\t1_a\t10\t1_b\t30 1\t320\t2_a\t32\t2_b\t22 2\t312\t1_a\t30\t1_b\t20 2\t312\t2_a\t22\t2_b\t12 2\t310\t1_a\t31\t1_b\t20 2\t310\t2_a\t21\t2_b\t12
And have it output based on the following bit of information:
while (<FILE>) { #containing the above information chomp; my @temp_array = split(/\t/,$_); for (1 .. 2) { #arbitary number based on outside factor my $pop_tracker = $_; if ($temp_array[0] == $pop_tracker) { print $temp_array[0]; #I want this to only print once. #more print statements here to reflect the below output. } } # I want: 1 - 300 1_a 30 1_b 20 2_a 22 2_b 12 320 1_a 10 1_b 30 2_a 32 2_b 22 # 2 - you get the idea... #Not: 1 - 300 1_a 30 1_b 20 1 - 300 2_a 22 2_b 12 1 - you get the idea... 1 - some info 2 - some info 2 - some info 2 - some info 2 - some info
So the output I want is in the code above, I am just really struggling to get things to work as I would like.

Any ideas are a big help.

Bio.

---- Even a blind squirrel finds a nut sometimes.

Replies are listed 'Best First'.
Re: Loop, but print once.
by toolic (Bishop) on Mar 04, 2008 at 18:39 UTC
    One way to do it is to create state variables to remember previous lines. This gets you half the way there ... but, you get the idea.
    #!/usr/bin/env perl use warnings; use strict; my $prev_first = 0; my $prev_second = 0; # you need to make use of this too while (<DATA>) { chomp; my ($first, $second, @data) = split /\t/; if ($first == $prev_first) { print " $second @data\n"; } else { print "$first - $second @data\n"; $prev_first = $first; } } __DATA__ 1 300 1_a 30 1_b 20 1 300 2_a 22 2_b 12 1 320 1_a 10 1_b 30 1 320 2_a 32 2_b 22 2 312 1_a 30 1_b 20 2 312 2_a 22 2_b 12 2 310 1_a 31 1_b 20 2 310 2_a 21 2_b 12

    This prints:

    1 - 300 1_a 30 1_b 20 300 2_a 22 2_b 12 320 1_a 10 1_b 30 320 2_a 32 2_b 22 2 - 312 1_a 30 1_b 20 312 2_a 22 2_b 12 310 1_a 31 1_b 20 310 2_a 21 2_b 12

    I used actual tab characters in the __DATA__, not "\" followed by "t".

      Can we get the rest of the way there? Of course we can.

      Can we do it without array subscripts? Why, yes. We can always do it without array subscripts.

      my $SPACE = q{ }; my $PRINTF = '%3s %3s %s %s %s %s'; # remember cells from previous row my @prev_cells = ( $SPACE ) x 6; while (<DATA>) { # append a dash to the first column substr($_, 1, 0, ' -'); my @cells = split /\t/, $_; # suppress display if same as above my @display_cells = map { $_ eq shift @prev_cells ? $SPACE : $_ } @cells; printf $PRINTF, @display_cells; @prev_cells = @cells; } __DATA__ 1 300 1_a 30 1_b 20 1 300 2_a 22 2_b 12 1 320 1_a 10 1_b 30 1 320 2_a 32 2_b 22 2 312 1_a 30 1_b 20 2 312 2_a 22 2_b 12 2 310 1_a 31 1_b 20 2 310 2_a 21 2_b 12

      prints

      1 - 300 1_a 30 1_b 20 2_a 22 2_b 12 320 1_a 10 1_b 30 2_a 32 2_b 22 2 - 312 1_a 30 1_b 20 2_a 22 2_b 12 310 1_a 31 1_b 20 2_a 21 2_b 12
        I don't think you get the intended result for
        1 300 1_a 30 1_b 20 1 300 2_a 22 2_b 12 1 320 1_a 10 1_b 30 1 320 2_a 32 2_b 22 2 320 1_a 31 1_b 20 2 320 2_a 21 2_b 12

        It gives
        2 -     1_a 31 1_b 20
        where I'd expect
        2 - 320 1_a 31 1_b 20

        Also, there's usually a desire to only omit repeated header fields, yet your solution can omit any number of fields, even all of them.

        Fix:

        use strict; use warnings; my $PRINTF = '%3s %3s %s %s %s %s'; my $HEADERS = 2; my @prev_cells; while (<DATA>) { # Input data. my @cells = split /\t/, $_; # Format the data for printing. $cells[0] .= ' -'; # Remove redundant headers. my @display_cells = @cells; if (@prev_cells) { for (@display_cells[ 0 .. $HEADERS-1 ]) { last if $_ ne shift @prev_cells; $_ = ''; } } @prev_cells = @cells; # Output data. printf $PRINTF, @display_cells; } __DATA__ 1 300 1_a 30 1_b 20 1 300 2_a 22 2_b 12 1 320 1_a 10 1_b 30 1 320 2_a 32 2_b 22 2 320 1_a 31 1_b 20 2 320 2_a 21 2_b 12 2 320 2_a 33 2_b 44
        1 - 300 1_a 30 1_b 20 2_a 22 2_b 12 320 1_a 10 1_b 30 2_a 32 2_b 22 2 - 320 1_a 31 1_b 20 2_a 21 2_b 12 2_a 33 2_b 44
Re: Loop, but print once.
by jwkrahn (Abbot) on Mar 04, 2008 at 19:03 UTC

    This should get you started:

    $ perl -e' $x = <<STUFF; 1\t300\t1_a\t30\t1_b\t20 1\t300\t2_a\t22\t2_b\t12 1\t320\t1_a\t10\t1_b\t30 1\t320\t2_a\t32\t2_b\t22 2\t312\t1_a\t30\t1_b\t20 2\t312\t2_a\t22\t2_b\t12 2\t310\t1_a\t31\t1_b\t20 2\t310\t2_a\t21\t2_b\t12 STUFF open my $fh, "<", \$x or die $!; my @last; while ( <$fh> ) { my @fields = split /\t/; for my $i ( 0 .. $#fields ) { if ( $fields[ $i ] eq $last[ $i ] ) { $fields[ $i ] = ""; } else { last; } } print join "\t", @fields; @last = split /\t/; } ' 1 300 1_a 30 1_b 20 2_a 22 2_b 12 320 1_a 10 1_b 30 2_a 32 2_b 22 2 312 1_a 30 1_b 20 2_a 22 2_b 12 310 1_a 31 1_b 20 2_a 21 2_b 12
Re: Loop, but print once.
by Sinistral (Monsignor) on Mar 04, 2008 at 18:46 UTC

    What you're wanting to do is report generation, and this is old school COBOL type processing at its best. Fortunately, it's a common idiom that can be put into any procedural language, including (especially?) Perl.

    You need two variables to keep up with the 1st column (the 1, 2, ...) and the 2nd column (the 300, 320, 312, ...).

    $old_col_0 = ''; $old_col_1 = ''; while (<FILE>) { #containing the above information chomp; my @temp_array = split(/\t/,$_); for (1 .. 2) { #arbitary number based on outside factor my $out_line = ''; if ($temp_array[0] ne $old_col_0) { $out_line = $temp_array[0]; # Space after data $out_line .= " "; $old_col_0 = $temp_array[0]; } else { # 1 space where data would have been, plus 1 more for # column spacing $out_line = " "; } if ($temp_array[1] ne $old_col_1) { # Make sure all subsequent uses of out_line are # append, not assignment $out_line .= $temp_array[1]; $out_line .= " "; $old_col_1 = $temp_array[1]; } else { # Four spaces, 3 for data, 1 for column spacing $out_line .= " "; } # A handy join to get the rest of the data # We don't want the 0th or 1st elements, they're # handled specially. $out_line .= join(" ", @temp_array[2..5]); print $out_line;