Hi Perlmonks,

This post is a follow-up to the one I posted earlier. My original aim was to match a four-letter word ('ABCD') in each row of the input file and report the 10 characters following the match in the output file. I was able to do that. Then I found that 'ABCD' occurs twice in each row, with 22 characters between the two occurrences. These 22 characters should be split into two tags of 11 characters each and reported in the same column. I was able to do that as well. My problem now is getting a heading for the file. The code below prints the heading, but it is repeated after every 2 lines (that is, after every 22 characters). I also need to reformat the file so that the frequency of each unique row appears in the next column, which in effect reduces the number of rows.


#!/usr/bin/perl -w
use strict;
use warnings;

my @input_files = <*.seq>;
my $local_count = 0;

my %hash;
foreach my $input_file (@input_files)
{
    unless (open(INPUT, $input_file))
    {
        print "Cannot open file \"$input_file\"\n\n";
        exit;
    }

    my $sequence = 'ABCD';
    my @headings = ('Tags', 'Frequency');
    my $headings = join("\t", @headings);
    while (my $line = <INPUT>)
    {
        # Open the output file before the first match is written
        if ($local_count == 0) {
            my $outfile = $input_file;
            # Replace the .seq extension (dot escaped, anchored at the end)
            $outfile =~ s/\.seq$/.tag.txt/i;
            unless (open(OUTPUT, ">$outfile"))
            {
                print "Cannot open file \"$outfile\"\n\n";
                exit;
            }
        }
        chomp $line;

        foreach ($line =~ m/$sequence/i) {
            # Capture the two 11-character tags between the two 'ABCD' markers
            if ($line =~ m/$sequence(.{11})(.{11})$sequence/) {
                # The heading is printed together with each pair of tags
                print OUTPUT "\n", $headings, "\n", $1, "\n", $2;
            }
            $local_count++;
        }
    }
}

The output I am currently getting looks like this:


Tags        Frequency
CDDDDDDDDDD    
BCDDEDDDDDR    
Tags       Frequency    
CDEDEDDDESE
CEEESEEDESE    
Tags       Frequency
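One possible direction, as a minimal sketch rather than a tested solution: collect the tags in a hash first, then print the heading once, followed by one line per unique tag with its count. The helper names count_tags and format_report are hypothetical, the sample rows are made up from the tags shown above, and sorting by descending frequency is just an assumption about the desired order.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count how often each 11-character tag occurs in a list of rows.
# Returns a hash reference: tag => frequency.
sub count_tags {
    my ($marker, @rows) = @_;
    my %freq;
    for my $row (@rows) {
        # Two 11-character tags sit between two copies of the marker.
        if ($row =~ /\Q$marker\E(.{11})(.{11})\Q$marker\E/) {
            $freq{$1}++;
            $freq{$2}++;
        }
    }
    return \%freq;
}

# Build the report text: the heading once, then one line per unique tag.
sub format_report {
    my ($freq) = @_;
    my $report = join("\t", 'Tags', 'Frequency') . "\n";
    # Most frequent first; ties broken alphabetically (an arbitrary choice).
    for my $tag (sort { $freq->{$b} <=> $freq->{$a} || $a cmp $b } keys %$freq) {
        $report .= "$tag\t$freq->{$tag}\n";
    }
    return $report;
}

# Hypothetical rows, assembled for illustration only.
my @rows = (
    'xxABCD' . 'CDDDDDDDDDD' . 'BCDDEDDDDDR' . 'ABCDyy',
    'ABCD'   . 'CDDDDDDDDDD' . 'CEEESEEDESE' . 'ABCD',
);
print format_report(count_tags('ABCD', @rows));
```

In the real script, the rows would come from reading each .seq file, and the report would be printed to the output filehandle after the while loop has finished, so the heading appears only once per file.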

In reply to Creating a column of frequency for the unique entries of another column by bluray
