Oh Wise Monks,

I have inherited a pipeline process and have found cause to alter it. I am attempting to append to a file that is the output of one step in the pipeline before passing it on to the next step. Sounds easy enough.

However, this component is written in C++ (so I cannot alter it), and each time it dumps its output it adds a header line. So I am getting a header line halfway through my files, like so:

    % cat output.txt
    Chr Coord chip_id subarray_id gc         NA15510
    1   475   5730    5730_1      0.6        0.266
    1   505   5730    5730_1      0.63333333 0.422
    ... ...   ...     ...         ...        ...
    Chr Coord chip_id subarray_id gc         NA15510
    ... ...   ...     ...         ...        ...
    1   925   5730    5730_1      0.70666666 0.071
    1   960   5730    5730_1      0.70333333 0.036

Here is a summary of the pipeline code, with the offending line (methinks) highlighted:

    % cat t.pl
    #!/usr/bin/perl -w

    use strict;

    my $filename = 'output.txt';

    my %chr_arms = (
        1 => [ 1, 1000, 2000, 5000 ],
        2 => [ 1, 1000 ],
    );

    for my $chromosome ( sort { $a <=> $b } keys %chr_arms ) {

        if ( scalar @{ $chr_arms{$chromosome} } >= 2 ) {
            my $arm_start = shift @{ $chr_arms{$chromosome} };
            my $arm_end   = shift @{ $chr_arms{$chromosome} };

            print qq{Running Pipeline for $chromosome:$arm_start-$arm_end\n};

            # a long running process (written in C++)
            print qq{ask_bigdb [options] >> $filename\n};    ### OFFENDING LINE!!!

            redo;    # do the next arm of chromosome ...
        }

        # check output file is not empty
        if ( is_too_short($filename) ) {
            print qq{$filename is empty\n};
            next;
        }

        # a few more loooooooong running processes (written in C++ and Java)
        print qq{GC Normalize ...\n};
        print qq{Table merge ...\n};
        print qq{Median Normalize ...\n};
        print qq{Table merge ...\n};
        print qq{Wave Normalize ...\n};
        print qq{Table merge ...\n\n};
    }

    exit 0;

    # Name    : is_too_short
    # Purpose : check if the file has more than one line in it
    sub is_too_short {
        my $file = shift;

        # ensure the file exists
        if ( !-e $file ) {
            die qq{$file does not exist};
        }

        # now check its head-line-count
        if ( my $head = qx{head $file | wc -l} <= 1 ) {
            return 1;
        }
    }

I would like to know if there is a quick and easy way to skip the header line of the output stream of ask_bigdb when appending to an existing file. Note that most of the print statements are actually system calls to other pipeline components ... they just weren't installed locally =)
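To make the question concrete, here is a minimal sketch of the kind of thing I have in mind: read the command's output through a pipe and drop its first line whenever the target file already has content. The helper name append_skipping_header is hypothetical (it is not part of the existing pipeline), and the sketch assumes ask_bigdb always emits exactly one header line as the first line of its output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of the original pipeline: run $command,
# append its output to $file, and drop the first output line (assumed to
# be the header) whenever $file already contains data.
sub append_skipping_header {
    my ( $command, $file ) = @_;

    # -s is true only if the file exists and has a non-zero size
    my $skip_header = ( -s $file ) ? 1 : 0;

    open my $in,  '-|', $command or die qq{Cannot run '$command': $!};
    open my $out, '>>', $file    or die qq{Cannot append to $file: $!};

    while ( my $line = <$in> ) {
        # $. counts lines read from $in; line 1 is assumed to be the header
        next if $skip_header && $. == 1;
        print {$out} $line;
    }

    # closing a pipe handle fails if the command exited non-zero
    close $in  or die qq{'$command' failed (status $?)};
    close $out or die qq{Cannot close $file: $!};

    return;
}
```

In t.pl this would stand in for the offending line, called as append_skipping_header( 'ask_bigdb [options]', $filename ). The first call writes the header because the file is still empty; every later call skips it.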

Any help is much appreciated.


Smoothie, smoothie, one hundred percent natural!

In reply to Skiping the first line of data/output stream by j1n3l0
