Hello Monks! I hope you are all well. I am applying the following regex in my code to extract only the IDs from the file, i.e. the first part of each header line (NM_030643.4, NR_029834.1, NM_001198855.1, AC067940.1). Instead, it is returning "APOL4) CYP2C8)".
The file:
>NM_030643.4 Homo sapiens apolipoprotein L4 (APOL4)
GAGGTGCTGGGGAGCAGCGTGTTTGCTGTGCTTGATTGTGAGCTGCTGGGAAGTTGTGACTTTCATTTTA
CCTTTCGAATTCCTGGGTATATCTTGGGGGCTGGAGGACGTGTCTGGTTATTATATAGGTGCACAGCTGG
>NM_001198855.1 Homo sapiens cytochrome P450 family 2 subfamily C member 8 (CYP2C8)
ACATGTCAAAGAGACACACAC
>NR_029834.1 Homo sapiens microRNA 200a (MIR200A), microRNA
CCGGGCCCCTGTGAGCATC
>AC067940.1 Homo sapiens clone RP11-818E9, LOW-PASS SEQUENCE SAMPLING
AAATACAACTTTAAATCAAAACGGTAAAAATTCCACTCTTTCATACTAACTTCAAAAGTATTTGCTTTAA
AAAAAAAGNNNNNNNNN
open(GENBANK, "/Users/Desktop/Genes.fasta") or die;
my $content = join("", <GENBANK>);
close(GENBANK);

sub mysub {
    return shift =~ /(\w+\W+)\n/g;
}

my @matches = mysub($content);
print "@matches\n";
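The pattern `(\w+\W+)\n` captures a run of word characters followed by non-word characters immediately before a newline, which is why only the header lines ending in `)` match. A minimal sketch of one alternative, assuming every ID sits between the `>` at the start of a header line and the first whitespace: anchor the match on `>` with the `/m` modifier and capture the non-whitespace run that follows. The sub name `extract_ids` and the inline sample (sequences shortened) are made up for illustration; they stand in for `mysub` and Genes.fasta.

```perl
use strict;
use warnings;

# Return the ID (the text between '>' and the first whitespace)
# of every FASTA header line found in the given string.
sub extract_ids {
    my ($content) = @_;
    # /m lets ^ match at the start of every line;
    # /g in list context returns all captured groups.
    return $content =~ /^>(\S+)/mg;
}

# Inline sample standing in for Genes.fasta (sequences shortened).
my $sample = <<'FASTA';
>NM_030643.4 Homo sapiens apolipoprotein L4 (APOL4)
GAGGTGCTGGGGAGCAGCGTG
>NM_001198855.1 Homo sapiens cytochrome P450 family 2 subfamily C member 8 (CYP2C8)
ACATGTCAAAGAGACACACAC
>NR_029834.1 Homo sapiens microRNA 200a (MIR200A), microRNA
CCGGGCCCCTGTGAGCATC
>AC067940.1 Homo sapiens clone RP11-818E9, LOW-PASS SEQUENCE SAMPLING
AAAAAAAGNNNNNNNNN
FASTA

my @ids = extract_ids($sample);
print "@ids\n";   # prints: NM_030643.4 NM_001198855.1 NR_029834.1 AC067940.1
```

The IDs come back in file order. To run this against the real file, the contents read into `$content` above could be passed to `extract_ids` in place of `$sample`.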
In reply to Extracting string and numbers from a file by shabird