I am having trouble sorting the output of my code by its third column. This script is a follow-up that works on the output generated by an earlier script (see the node titled "Reverse complement" in bluray's writeups). My aim now is to create a unique identifier for each 11-character tag in the first column of the input file. The second column is the frequency of each tag. I used the code below to create a new first column in which every line starts with >HWTI_frequency_randomnumber. Although I get the expected results with this code, I cannot work out how to sort by the second column of my input file (the frequency, which becomes the third column of the output).
#!/usr/bin/perl
use strict;
use warnings;

my $input_file = "file1.seq";
open my $FILE1, "<", $input_file or die "Cannot open $input_file .$!";
my $output_file = "file2.csv";
open my $FILE2, ">", $output_file or die "Cannot open $output_file .$!";

# Copy the heading line, adding a name for the new first column.
my $line = <$FILE1>;
chomp $line;
my @columnheadings = split(/\t/, $line);
unshift(@columnheadings, ("Header"));
my $heading = join("\t", @columnheadings);
print $FILE2 "$heading\n";

# Store each data line (tabs converted to commas) under its tag.
my %tag;
while (my $line = <$FILE1>) {
    chomp $line;
    $line =~ s/\t/,/g;
    my @columns = split(/,/, $line);
    my $tags = $columns[0];
    $tag{$tags} = $line;
}

# Note: this loop orders the output by tag (the hash key), not by frequency.
foreach my $tags (sort keys %tag) {
    my $header;
    my @columns = split(/,/, $tag{$tags});
    $tags = $columns[0];
    my $freq = $columns[1];
    my $range = 500000;
    my $random_number = int(rand($range));
    $header = ">HWTI_" . $freq . "_" . $random_number;
    my $printline = $tag{$tags};
    $printline = $header . "," . $printline;
    print $FILE2 "$printline\n";
}
In addition to the sorting issue, I am also thinking about running BLAST on each of these tags (each is 11 nucleotides long). I would appreciate any suggestions on this matter.
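For reference, here is a minimal sketch of one way the frequency sort might be approached. It is not a tested fix for the script above. It assumes each data line is already chomped and holds exactly a tag and a frequency separated by a tab, and it assumes "sorted" means highest frequency first, with ties broken by tag. The sub name rows_sorted_by_frequency and the sample rows are hypothetical, invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return output rows of the form ">HWTI_<freq>_<random>,<tag>,<freq>",
# ordered by frequency (highest first), ties broken by tag.
# $rows is a reference to an array of chomped "tag\tfrequency" lines.
sub rows_sorted_by_frequency {
    my ($rows, $range) = @_;
    $range ||= 500_000;    # same range as in the script above

    # Keep tag and frequency as separate fields so they can be compared directly.
    my @records;
    for my $row (@$rows) {
        my ($tag, $freq) = split /\t/, $row;
        push @records, { tag => $tag, freq => $freq };
    }

    # <=> compares numerically; a string comparison would put "10" before "9".
    my @sorted = sort {
        $b->{freq} <=> $a->{freq} or $a->{tag} cmp $b->{tag}
    } @records;

    my @out;
    for my $rec (@sorted) {
        my $header = ">HWTI_" . $rec->{freq} . "_" . int(rand($range));
        push @out, join(",", $header, $rec->{tag}, $rec->{freq});
    }
    return @out;
}

# Hypothetical sample rows, not taken from the real input file.
my @sample = ("ACGTACGTACG\t9", "TTGCATTGCAT\t120", "GGGCCCAAATT\t10");
print "$_\n" for rows_sorted_by_frequency(\@sample);
```

Swapping $a and $b in the sort block would give ascending order instead. Note that random numbers alone do not guarantee unique identifiers, since two tags with the same frequency could draw the same number.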
In reply to Sorting issue by bluray