My input looks like this:

    NT_113797 CDS 122829 123323 - gene=LOC644591 ProteinID=XP_932799.1
    NT_113798 CDS 4457 4636 -
    NT_077932 CDS 9894 9928 -
    NT_077932 CDS 65297 65828 +
    NT_077932 CDS 89196 89690 - gene=LOC653505 ProteinID=BJDND993

My output looks like this:

    NT_113797 CDS 122829 123323 -
    NT_113798 CDS 4457 4636 - gene=LOC644591
    NT_077932 CDS 9894 9928 - gene=LOC644591
    NT_077932 CDS 65297 65828 + gene=LOC644591
    NT_077932 CDS 89196 89690 - gene=LOC644591

I want it to be like this:

    NT_113797 CDS 122829 123323 - gene=LOC644591
    NT_113798 CDS 4457 4636 - gene=LOC653505
    NT_077932 CDS 9894 9928 - gene=LOC653505
    NT_077932 CDS 65297 65828 + gene=LOC653505
    NT_077932 CDS 89196 89690 - gene=LOC653505

My code looks something like this:

    #!/usr/bin/perl
    use warnings;
    use strict;

    my $fn = $ARGV[0];
    open(FH, "$fn") || die("cannot open:$!");
    {
        my $geneName = "";
        while(<FH>) {
            if($_ =~ /\A(\S+)\t(\S+)\t(\d+)\t(\d+)\t(\S)\s+$/) {
                print "\n$_ $geneName";
            }
            if($_ =~ /\A(\S+)\t(\S+)\t(\d+)\t(\d+)\t(\S)\s+(\S+)\s+(\S+)\s+/) {
                $geneName = $6;
            }
        }
    }

Thanks in advance, cowboy :-)
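Going by the sample data, the wanted result seems to give every line that has no gene= field the gene name of the next line that does carry one, while lines that already have a gene keep their own. The posted loop only remembers the previous gene name, so it cannot look ahead. A minimal sketch of the look-ahead idea follows. It is not a tested solution for the real file: it assumes tab-separated columns and that the ProteinID field should be dropped from the output, and the sub name `fill_genes` is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch only. Assumptions: columns are tab-separated, and a line without
# a gene= field belongs to the NEXT line that carries one.
sub fill_genes {
    my ($in, $out) = @_;
    my @pending;    # gene-less lines waiting for the next gene= line

    while (my $line = <$in>) {
        chomp $line;
        next unless $line =~ /\S/;    # skip blank lines

        # Keep only the leading columns; drop gene= and anything after it.
        (my $core = $line) =~ s/\s*\bgene=.*\z//;
        $core =~ s/\s+\z//;

        if (my ($gene) = $line =~ /\b(gene=\S+)/) {
            # Flush the buffered lines plus this one with the same gene.
            print {$out} "$_\t$gene\n" for @pending, $core;
            @pending = ();
        }
        else {
            push @pending, $core;
        }
    }

    # Lines with no later gene= line are printed unchanged.
    print {$out} "$_\n" for @pending;
}

# Read from STDIN and write to STDOUT when run as a script.
fill_genes(\*STDIN, \*STDOUT) unless caller;
```

Saved under a hypothetical name such as `fill_genes.pl`, it could be run as `perl fill_genes.pl < input.txt`. On the five sample lines it should tag the first line with gene=LOC644591 and the remaining four with gene=LOC653505. Buffering in `@pending` keeps the input order intact and needs only one pass over the file.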
In reply to Parsing help by cowboyrocks