sharkbait has asked for the wisdom of the Perl Monks concerning the following question:
Hello! I have a problem (repeated in comments). I have a search string, $mystring, that I want to match against an NCBI text file. Nothing complicated, just a basic regex. The one twist is that when I find a match, I need the string immediately following the match, not the match itself. I'm stuck, and I've narrowed it down to what may be a regex issue. Both files are plain text and load fine, but the input from the search-pattern file doesn't seem to be recognized as strings, line by line, once it is in the array. (That file is called a CSV in the code because that is how I first set it up; I got fed up fighting with Text::CSV and switched to plain text.) When I add an if test to debug, Perl still doesn't seem to treat the search patterns as strings. Sample text from both files and my code are pasted below. I'd greatly appreciate any help! Thanks and Happy Mardi Gras!!
BB_B10 BB_B29 BB_B18 BB_B13 BB_B14 BB_B12 BB_B04 BB_B16 BB_B22 BB_B17 BB_B27 BB_B19 BB_B07 BB_B23 BB_B09 BB_B02 BB_B28 BB_B24 BB_B03 BB_B05 BB_B06
    /locus_tag="BB_B01"
    /db_xref="GeneID:1194411"
    CDS 46..324
    /locus_tag="BB_B01"
    /note="catalyzes the hydrolysis of acylphosphate"
    /codon_start=1
    /transl_table=11
    /product="acylphosphatase"
    /protein_id="NP_046987.2"
    /db_xref="GI:364556794"
    /db_xref="GeneID:1194411"
    /translation="MYKQQYFISGKVQGVGFRFFTEQIANNMKLKGFVKNLNDGRVEI
    VAFFNTKEQMKKFEKLLNGNKYSNIKNIEKIVLDENYPFQFNDFKIYY"
    misc_feature 46..321
    /locus_tag="BB_B01"
    /note="acylphosphatase; Provisional; Region: PRK14432"
    /db_xref="CDD:184678"
    gene complement 308..751
    /locus_tag="BB_B02"
    /db_xref="GeneID:1194420"
    CDS complement 308..751
    /locus_tag="BB_B02"
    /note="hypothetical protein; identified by Glimmer; putative"
    /codon_start=1
    /transl_table=11
    /product="hypothetical protein"
    /protein_id="NP_046988.1"
    /db_xref="GI:11497029"
    /db_xref="GeneID:1194420"
    /translation="MKIGPHYFFKKILKSNDNRTIYISYLYDRLASVKPAGEWLRIYF
    KDSKRGKKYFILFNRNSSNGSFISCSFLKTSCNCGLDIKFSDGNLNIFCRNRKSLEFL
    KFKVEHFFRTSVSCYKNNNSYVHNIKPKNKVKVLVKREASPNNKF"
    gene complement 837..2186
    /locus_tag="BB_B03"
    /db_xref="GeneID:1194419"
    CDS complement 837..2186
    /locus_tag="BB_B03"
    /note="hypothetical protein; identified by Glimmer; putative"
    /codon_start=1
    /transl_table=11
    /product="hypothetical protein"
    /protein_id="NP_046989.1"
    /db_xref="GI:11497028"
    /db_xref="GeneID:1194419"
    /translation="MPPKVKIKNDFEIFRKELEILYKKYLNNELSYLKLKEKLKILAE
    NHKAILFRKDKFTNRSIILNLSKTRKIIKEYINLSVIERIRRDNTFLFFWKSRRIKEL
    KNIGIKDRKKIEELIFSNQMNDEKSYFQYFIDLFVTPKWLNDYAHKYKIEKINSYRKE
    QIFVKINLNTYIEIIKLLLNQSRDIRLKFYGVLMAIGRRPVEVMKLSQFYIADKNHIR
    MEFIAKKRENNIVNEVVFPVFADPELIINSIKEIRYMEQTENLTKEIISSNLAYSYNR
    LFRQIFNNIFAPEESVYFCRAIYCKFSYLAFAPKNMEMNYWITKVLGHEPNDITTAFH
    YNRYVLDNLDDKADNSLLTLLNQRIYTYVRRKATYSTLTMDRLESLIKEHHIFDDNYI
    KTLIVIKNLMLKDNLETLAMVRGLNVKIRKAFKATYGYNYNYIKLTEYLSIIFNYKL"
    gene complement 2476..3798
    /locus_tag="BB_B04"
    /db_xref="GeneID:1194410"
    CDS complement 2476..3798
    /locus_tag="BB_B04"
    #!/usr/bin/perl -w
    use Text::CSV;

    my @arrayOfVals;
    open(my $tmp, "<", "/Users/bioinformatics/Desktop/NC_001903.gbk.txt") || die "Could not open $!";
    while (<$tmp>) {
        chomp;
        push(@arrayOfVals, $_);
    }
    close($tmp);

    my @arrayFromCSV;
    open(my $tmpFile, "<", "/Users/bioinformatics/Desktop/cp26_diffexpr.txt") || die "Could not open $!";
    while (<$tmpFile>) {
        chomp;
        push(@arrayFromCSV, $_);
    }
    close($tmpFile);

    foreach (@arrayFromCSV) {
        if ($_ eq "BB_B10") {
            print "MAtch!!\n";
        }
        else { print "$_\n"; }
    }

    my @wkArray = @arrayOfVals;
    my @secArrayFromCSV = @arrayFromCSV;
    while (my $matchCheck = shift @wkArray) {
        $csvVal = shift(@secArrayFromCSV);
        if ($csvVal =~ /$matchCheck/) {
            my $geneValue = shift(@wkArray);
            print "$geneValue\n";
        }
        else {
            print "no match";
        }
    }
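For comparison, here is a minimal sketch of one way to get "the line after the match". It rests on assumptions that are not confirmed by the post: that the pattern file holds one locus tag per line, that the wanted value is the line right after a matching /locus_tag="..." line, and that the failing `eq` test may come from stray whitespace or a carriage return left behind by `chomp`. The sub names `clean_term` and `lines_after_matches` are made up for illustration. Unlike the loop above, it does not shift both arrays in lockstep; it puts the tags in a hash and checks every GenBank line against it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: strip leading/trailing whitespace, including a
# trailing "\r" that chomp leaves behind on files with CRLF line endings.
sub clean_term {
    my ($term) = @_;
    $term =~ s/^\s+|\s+$//g;
    return $term;
}

# Hypothetical helper: for every line holding /locus_tag="X" where X is one
# of the wanted terms, return [X, the line immediately following it].
sub lines_after_matches {
    my ($terms, $gbk_lines) = @_;

    # Hash lookup replaces walking the two arrays side by side.
    my %wanted = map { $_ => 1 } grep { length } map { clean_term($_) } @$terms;

    my @found;
    for my $i (0 .. $#$gbk_lines - 1) {
        # Capture the tag name between the double quotes.
        next unless $gbk_lines->[$i] =~ /locus_tag="([^"]+)"/;
        next unless $wanted{$1};
        push @found, [ $1, clean_term($gbk_lines->[$i + 1]) ];
    }
    return @found;
}

# Small inline demo; in real use both arrays would be read from the files.
my @terms = ("BB_B02\r", " BB_B10 ");
my @gbk = (
    '/locus_tag="BB_B01"',
    '/db_xref="GeneID:1194411"',
    'gene complement 308..751',
    '/locus_tag="BB_B02"',
    '/db_xref="GeneID:1194420"',
    'CDS complement 308..751',
    '/locus_tag="BB_B02"',
    '/note="hypothetical protein; identified by Glimmer; putative"',
);

print "$_->[0]\t$_->[1]\n" for lines_after_matches(\@terms, \@gbk);
```

Note that in the sample data each locus tag appears twice (once under gene, once under CDS), so a tag yields two results with different following lines. If only one of them is wanted, an extra condition on the preceding feature line would be needed.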
Replies are listed 'Best First'.
Re: Need Help with Maybe a Regex Issue
by hdb (Monsignor) on Feb 27, 2014 at 07:21 UTC
by sharkbait (Initiate) on Feb 27, 2014 at 15:43 UTC
Re: Need Help with Maybe a Regex Issue
by graff (Chancellor) on Feb 27, 2014 at 04:20 UTC
by sharkbait (Initiate) on Feb 27, 2014 at 15:11 UTC
by sharkbait (Initiate) on Feb 27, 2014 at 18:10 UTC
by tangent (Parson) on Feb 27, 2014 at 21:34 UTC
by sharkbait (Initiate) on Mar 06, 2014 at 19:22 UTC
Re: Need Help with Maybe a Regex Issue
by Kenosis (Priest) on Feb 27, 2014 at 00:13 UTC
by sharkbait (Initiate) on Feb 27, 2014 at 15:07 UTC