
Hi there. I've written a script that tries to remove a 20-character sequence from a column at a varying offset.

My script does what I want it to do, except that when I print @spliceout, it contains whitespace between the letters. I've tried

 
for (@spliceout) {  
    s/\s+$//;  
}

but this doesn't work. I think I may have gone wrong when splitting a column into individual elements, but I'm not too sure.

My script is:

#!/usr/bin/perl -w
use strict;

my $inputfile1 = $ARGV[0];
open (FILE1, $inputfile1) or die "Uh oh.. unable to find file $inputfile1"; ##Opens input file


my @file1 = <FILE1>; #loads inputfile1 data into array
close FILE1;


my @matches;
foreach my $file1 (@file1) {
    if($file1 =~ m/splic/) {
    push (@matches, $file1); ##loads matches into array @matches
}
}

my @col1; ## column 1
my @col_ID; ## column 2
my @col3; ## column 3
my @col_strand_direction; ## column 6
foreach my $match(@matches) { ## process each line, splitting columns and move onto next line
    my @colsplit = split("\t", $match);
    push (@col3, $colsplit[2] . "\n"); ##pushes third column to @col3 array
    push (@col1, $colsplit[0] . "\n");
    push (@col_ID, $colsplit[1] . "\n");
    push (@col_strand_direction, $colsplit[5] . "\n");
}



my @intron_from_boundary;
my @baseref;


foreach my $col3line(@col3) {
        if ($col3line =~ m/([\+|\-]\d+)\w+(\[[ACTG]])/) { ##pulls out + or - and subsequent number and [base change]
        push (@intron_from_boundary, $1 . "\n"); ##$1 pushes what is in the first set of brackets
        push (@baseref, $2 . "\n");
}
}

## need to take each intronmatch value and work out its position relative to intron/exon boundary

my $left_of_boundary;
my $intron_from_boundary;
my $new_left;
my @spliceout;

## split seq of @col1 into array

my $i = 0;
foreach my $col1(@col1) {
    my @col1split = split(//, $col1);

##for -7:

     $left_of_boundary = 10; ##10 to the left
     
        if ($col_strand_direction[$i] =~ m/\+/) {
     
        $left_of_boundary = $left_of_boundary + $intron_from_boundary[$i]; ##3 to the left

        $new_left = 23 - $left_of_boundary; ## 20
        

        }
        
                else {
        
        $left_of_boundary = $left_of_boundary - $intron_from_boundary[$i]; ##3 to the left

        $new_left = 23 - $left_of_boundary; ## 20
        
        }

        my @spliceout = splice @col1split, $new_left, 22; ##want to pull out 3 letters to left of [G] and 16 to the right }

print "@spliceout\n";
        
            open (MYFILE, '>>fasta');
            print MYFILE (">" . "$col_ID[$i]" , "@spliceout" , "\n");
            close (MYFILE);
    
    ++$i; 
    }

Any help would be greatly appreciated, and yes, my scripting is rather messy, I'm still learning! Many thanks :)
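
A note on where the spaces probably come from, as a minimal sketch rather than a confirmed diagnosis: when an array is interpolated inside double quotes, as in "@spliceout", Perl joins its elements with the list separator $" (a single space by default). The s/\s+$// loop only strips trailing whitespace inside each element, so it cannot touch separators that are added at print time. The sample sequence and the extract_window helper below are hypothetical and only there for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the real input file.
my @chars  = split //, "ACGTACGT";
my @window = @chars[2 .. 4];        # ('G', 'T', 'A')

# Interpolating an array in a double-quoted string joins its elements
# with the list separator $" (a single space by default).
my $interpolated = "@window";

# join with an empty separator puts nothing between the letters.
my $joined = join '', @window;

# Hypothetical helper: substr takes the window straight from the string,
# so the per-character array is not needed at all.
sub extract_window {
    my ($seq, $offset, $length) = @_;
    return substr $seq, $offset, $length;
}

print "$interpolated\n";                        # prints "G T A"
print "$joined\n";                              # prints "GTA"
print extract_window("ACGTACGT", 2, 3), "\n";   # prints "GTA"
```

If this is indeed the cause, printing join('', @spliceout) instead of "@spliceout" should give the letters without gaps; using substr on the unsplit column would avoid the character array altogether.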

In reply to It's all getting messy - remove whitespace by lecb
