jpoldot has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I have a file starting with:

fixedStep chrom=chrX start=1 step=1
0.930
0.955
0.972
0.985
0.993
0.995
0.994
0.990
0.984
0.971
0.942
0.944
0.971
fixedStep chrom=chrX start=200 step=1
0.987

The problem is that, besides the numbers, the file contains several of these fixedStep header lines. I want to write a program that stores each position together with its value, that is:

1 0.930

2 0.955

etc

After each header, the numbering restarts at the number given in its start field (for example, start=200). So this means:

200 0.987

etc

I have constructed the following code:

#!/usr/bin/perl -w

print "Please enter the filename of your file: ";
$dnafilename = <STDIN>;
chomp $dnafilename;

unless ( -e $dnafilename) {
    print "File \"$dnafilename\" doesn't seem to exist!!\n";
    exit;
}

unless ( open(DNAFILE, $dnafilename) ) {
    print "Cannot open file \"$dnafilename\"\n\n";
    exit;
}

while ( <DNAFILE> ) {
    next if (/fixedStep chrom=chrX start=(.+) step=1/);
    $counter = $1;
}
print "$counter\t$_\n";

close DNAFILE;
exit;

With awk, we could do it easily with the command:

nawk -F'[ =]' 'NF>1{start=$5;next}{print start++,$0}' myFile

but I need a program. Any help would be greatly appreciated :)


Replies are listed 'Best First'.
Re: How to continue numbering after a regular expression has been found
by jethro (Monsignor) on Feb 11, 2011 at 12:30 UTC

    Your assignment to $counter never runs when it should: on a header line the next skips it, and on every other line it runs even though the match failed. When the 'if' is false, $counter should be incremented, not assigned. You probably want this:

    while ( <DNAFILE> ) {
        if (/fixedStep chrom=chrX start=(.+) step=1/) {
            $counter = $1;    # header line: restart numbering at its start value
            next;
        }
        print "$counter\t$_\n";
        $counter++;           # data line: advance to the next position
    }
    close DNAFILE;

    Code is untested.

      Thank you very much...This works for me...Is there any way that I can store all these values into 2 arrays, one having the counter and the other the corresponding actual values???

        There is even an easy way :-)

        push $counter onto one array and $_ onto another, instead of (or after) the print

        Rata
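
A minimal sketch of that push approach, for illustration only. The sub name collect_positions_and_values and the sample lines are assumptions, not code from the thread, and the header regex is loosened slightly to accept any chrom name. It also chomps each line, since $_ would otherwise carry its trailing newline into the values array.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes the lines of a fixedStep file and returns
# two array references, one holding positions and one holding values.
sub collect_positions_and_values {
    my @lines = @_;
    my ( @positions, @values );
    my $counter;

    for my $line (@lines) {
        chomp $line;
        next if $line =~ /^\s*$/;           # skip blank lines
        if ( $line =~ /^fixedStep\s+chrom=\S+\s+start=(\d+)\s+step=1/ ) {
            $counter = $1;                  # header: restart numbering here
            next;
        }
        next unless defined $counter;       # ignore data before any header
        push @positions, $counter++;        # store position, then advance
        push @values,    $line;             # store the matching value
    }
    return ( \@positions, \@values );
}

# Sample input modelled on the data in the question.
my @sample = (
    "fixedStep chrom=chrX start=1 step=1\n",
    "0.930\n",
    "0.955\n",
    "fixedStep chrom=chrX start=200 step=1\n",
    "0.987\n",
);

my ( $positions, $values ) = collect_positions_and_values(@sample);
print "$positions->[$_]\t$values->[$_]\n" for 0 .. $#{$positions};
```

The two arrays stay aligned by index, so element N of one always belongs to element N of the other.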
Re: How to continue numbering after a regular expression has been found
by Ratazong (Monsignor) on Feb 11, 2011 at 12:21 UTC
    while ( <DNAFILE> ) {
        next if (/fixedStep chrom=chrX start=(.+) step=1/);
        $counter = $1;
    }
    print "$counter\t$_\n";

    There are some issues with your code:

    • if you find a heading-line, the next skips the rest of the loop body - the assignment $counter = $1 will not be executed in this case
    • according to your spec you want to print all non-heading-lines => the print needs to be inside the loop
    • you want to increase the counter with your lines => you need a statement like $counter++; somewhere
    The following code might help you (untested!!)
    while ( <DNAFILE> ) {
        if (/fixedStep chrom=chrX start=(.+) step=1/) {
            $counter = $1;
        }
        else {
            print "$counter\t$_\n";
            $counter++;
        }
    }
    as you see, it is quite similar to your awk-code.

    HTH, Rata
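
For reference, the three points above could be combined into a complete script along these lines. This is only a sketch: the sub name number_lines is hypothetical, and the sample data sits in an in-memory filehandle so the script runs without a file. It uses strict, warnings and a lexical filehandle, and chomps each line so the output is not double-spaced.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reads fixedStep data from an open filehandle and
# returns one "position<TAB>value" string per data line.
sub number_lines {
    my ($fh) = @_;
    my @out;
    my $counter;

    while ( my $line = <$fh> ) {
        chomp $line;
        next unless length $line;             # skip empty lines
        if ( $line =~ /^fixedStep chrom=chrX start=(\d+) step=1/ ) {
            $counter = $1;                    # header: reset the counter
        }
        elsif ( defined $counter ) {
            push @out, "$counter\t$line";     # data: emit, then advance
            $counter++;
        }
    }
    return @out;
}

# Sample data in an in-memory filehandle; a real script would instead do
#   open my $fh, '<', $filename or die "Cannot open $filename: $!";
my $data = "fixedStep chrom=chrX start=1 step=1\n0.930\n0.955\n"
         . "fixedStep chrom=chrX start=200 step=1\n0.987\n";
open my $fh, '<', \$data or die "Cannot open in-memory data: $!";
print "$_\n" for number_lines($fh);
close $fh;
```

Checking defined $counter guards against data lines that appear before the first header, a case the original loop would print with an undefined position.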