reciter has asked for the wisdom of the Perl Monks concerning the following question:
Hello Perl Monks,
I am trying to find palindromic sequences in a file that contains multiple sequences.
The script reports the palindromes present in the file, but it does not print the title of each sequence.
Also, after adding "use strict" and "use warnings", the script stops running and shows a compilation error.
Please help me.
Here is the script I am using:
#!/usr/bin/perl
open (TEXT, "sample.txt")||die"Cannot";
my $pat=qr'^(Contig +([0-9]*))\s';
my $count = 0;
for my $n (5..20) {
    my $re = qr /[CAGU]{$n}/;
    $regexes[$n-5] = $re;
}
NEXTLINE: while ($count < 1000) {
    my $line = <TEXT> ;
    $count++;
    foreach my $value (@regexes) {
        my $start = 0;
        while ($line =~ /$value/g) {
            my $endline = $';
            my $match = $&;
            my $revmatch = reverse($match);
            $revmatch =~ tr/CAGU/GUCA/;
            if ($endline =~ /^([CAGU]{0,15})($revmatch)/) {
                $start = 1;
                my $palindrome = $match . "*" . $1 . "*" . $2;
                $palhash{$palindrome}++;
            }
        }
        if ($start == 0) {
            goto NEXTLINE;
        }
    }
}
print "L.$pat\n";
close TEXT;
while(($key, $value) = each (%palhash)) {
    print "$key => $value\n";
}
exit;
Here is an example of sample.txt:
Contig1
NAAAAGUAUAGGCUCGAGAGAGAAGUCCUGGCCUAUCGGAUUACACCACGNGUCAGAUCU
GUCACUUCAAGGAGCUUUUCAGCGUCUUUGACAAGAAUGGCNGACGGUUGCCGUCCUUCC
AGGGAGAUUGGGGCAGAAUUCGAGUCACUGCCNGUGGAUUCCUUAGUGAUCCAGAUGUCG
GAUAUAUGUCCAAUAUAGCCAAUNGUUGUUGGACAUAGCACCAUCAAUCAUCAAGAUUGC
UAUUUCGCCACCUCNAUGUAGAGUGGAAGAUCCAAAUGCGGGAAUGGAUUUCAAGGUUGG
UUAAGNAUGGGGGCAGCGGAUGAGUUAACAGUACAGCAAGCUAGGCAGCUCCAUUUNUGA
CUUGGAAUCUUCAUACGAUUCAUUUAUGGCAGCUUUGCCUAAUGUUGNGUACUUGAAAGU
AAAAUCCAAUAUUUCCAUAUUUUUGUUUAGUGCUUCAAAUCUGGUGAGAACGGAUGUUUU
AAGUUGGUAAAACAGUGUAUUCAUUUUGAAGAGUUCUUGGAUUGUUUAAAGCUCAAGAUG
CUAUUUGUGUGCUGCUGAUUUGUCGUUCUAUGAGAAAGAAUAUAUGCUUUAAUUUGGUUU
UGUAAUAUUAANUAUAUUCAUUCCCCUUGAUUGUUGUUUUGUNNNNNNNNNNNNNNNNNN
NN
Contig2
NUGUUUUUUCUUUACUCGGUGUCUCGUCGGUACUCGACGACUAAANGGGNUAUAGGAGGG
GCGCGAACGNCGAAACGGGAGUGGGUACAAAGCGUGGUCCNGGAUAGACGAGAGCGGACG
CGCCGGGUGAAGCUCGGCGCGACGCGAGGCANAGAGGGGCCCGCAGANNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNAAGNCCCAGGAUGCGCAUCGCCCCAGGGGUGAAACCCCCAUCCCAAAGAA
UGCUNCUCCUNCGCGGUAGGGCAGCUNCCCGAAGCACCCGACCGCUUUNAGGCCANCCAU
AUGAUAAAGNAACGUUGUGUGGUGAAUGGGAUGAAGAUGAUUGAAGNAGAGUAGAGUUUU
GCCUCUANAUCUUGAUAUGUAUAUCUUUAAUUAUAUAANUAUAGCUCAUUAUAUUGUGCN
UUAGCAUGUAAUAUUUAAGUCUAAAAUUAANUGGACCUCAGCUCGAGGUCGNCAUUCUUU
GUUACUUUAGAUCAGAUCUGUANUUCCCUUUGUAUUGUUCAGGNUUUCCAACCAUAAAUU
AUUGGUACUAUCUUNAUUGUUAUCAUAAUUGACGUGUGUUUAAGUUCNNNNNNNNNNNNN
NNNNN
I want my result in this form (but it is not happening):
Contig1
GACGG*UUG*CCGUC => 1
GAUGC**GCAUC => 1
CGCCG*GGUGAAGCU*CGGCG => 1
Contig2
CAAUC*AUCAA*GAUUG => 1
GAUAU*GU*AUAUC => 1
AAAAU*CCAAUAUUUCCAU*AUUUU => 1
PROBLEM SOLVED
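For later readers, here is a minimal sketch of one way to get output in the requested form. It is not necessarily what the replies in this thread proposed. Two observations about the posted script motivate it: under "use strict" the variables @regexes, %palhash, $key and $value are never declared with "my", which is what triggers the compilation error, and the title is never printed because the script prints the compiled pattern $pat itself instead of matching it against a line. The sub names (revcomp, find_palindromes, read_contigs, report) are made up for this sketch. It also makes assumptions that differ from the original: header lines look like "Contig1" or "Contig 1" on their own line, the sequence lines of one contig are joined before searching, and every start position is tried rather than only non-overlapping runs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reverse complement, using the same pairing as the original tr/CAGU/GUCA/.
sub revcomp {
    my ($seq) = @_;
    my $rc = scalar reverse $seq;
    $rc =~ tr/CAGU/GUCA/;
    return $rc;
}

# Return a hash ref { "stem*loop*revstem" => count } for one sequence.
# Stems are 5..20 bases and loops 0..15 bases, as in the original script.
sub find_palindromes {
    my ($seq) = @_;
    my %found;
    for my $n (5 .. 20) {
        for my $pos (0 .. length($seq) - $n) {
            my $stem = substr $seq, $pos, $n;
            next if $stem =~ /[^CAGU]/;          # skip stems containing N etc.
            my $rc   = revcomp($stem);
            my $rest = substr $seq, $pos + $n;
            if ($rest =~ /^([CAGU]{0,15})$rc/) {
                $found{ join '*', $stem, $1, $rc }++;
            }
        }
    }
    return \%found;
}

# Read header lines followed by sequence lines from a filehandle.
# Assumption: a header is "Contig" plus a number alone on its line.
# Returns a list of [title, sequence] pairs in file order.
sub read_contigs {
    my ($fh) = @_;
    my @records;
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^(Contig\s*\d+)\s*$/) {
            push @records, [$1, ''];
        }
        elsif (@records) {
            $line =~ s/\s+//g;                   # join wrapped sequence lines
            $records[-1][1] .= $line;
        }
    }
    return @records;
}

# Build the report text: each title followed by its palindromes.
sub report {
    my ($fh) = @_;
    my $out = '';
    for my $rec (read_contigs($fh)) {
        my ($title, $seq) = @$rec;
        my $found = find_palindromes($seq);
        $out .= "$title\n";
        $out .= "$_ => $found->{$_}\n" for sort keys %$found;
    }
    return $out;
}

# Run only when a file name is given, e.g.: perl palindromes.pl sample.txt
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    print report($fh);
    close $fh;
}
```

Keeping the hits in a per-contig hash, instead of one global %palhash, is what lets each title be printed above its own palindromes. Because a stem of 6 also contains a stem of 5 with a longer loop, nested hits are reported separately, just as in the original.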
Replies are listed 'Best First'.
Re: Palindrome sequence from file containing multiple sequences
by choroba (Cardinal) on Feb 20, 2015 at 10:50 UTC
by reciter (Novice) on Feb 21, 2015 at 04:49 UTC

Re: Palindrome sequence from file containing multiple sequences
by Anonymous Monk on Feb 21, 2015 at 00:38 UTC
by reciter (Novice) on Feb 21, 2015 at 04:52 UTC