Helan_Ahmed has asked for the wisdom of the Perl Monks concerning the following question:
Hi all, I am new to Perl. I want to extract two values from an XML file and save them in a new file. The values are ssId and subSnpClass. I wrote the following code, but it just prints ssId twice and does not print subSnpClass. Any idea how to fix it? Thanks.
my $Gene='ds_chY.xml';
my $filename = 'ID.txt';
my $ID;
my $subSnpClass;
open(my $fh, '>', $filename) or die "Could not open file '$filename' $!";
print $fh "RS_SNPs\tsubSnpClass";
open(GTFFILE, $Gene) or die ("Cannot open the file");
while(<GTFFILE>){
    $_=~ s/^\s+//;
    if ($_= ~/ssId=\s*?(\S+)/) {
        $ID=$1;
        $ID =~ s/"//;
        chop $ID;
        print $fh "$ID\n";
    }
    if ($_= ~/subSnpClass=\s*?(\S+)/) {
        $subSnpClass=$1;
        $subSnpClass =~ s/"//;
        chop $subSnpClass;
        print $fh "\t$subSnpClass\n";
    }
}
Here is a sample of my input file:
===========================================================
<?xml version="1.0" encoding="UTF-8"?>
<ExchangeSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.ncbi.nlm.nih.gov/SNP/docsum" xsi:schemaLocation="http://www.ncbi.nlm.nih.gov/SNP/docsum ftp://ftp.ncbi.nlm.nih.gov/snp/specs/docsum_3.4.xsd" specVersion="3.4" dbSnpBuild="144" generated="2015-05-26 09:54">
    <SourceDatabase taxId="9606" organism="human" gpipeOrgAbbr="hs"/>
    <Rs rsId="3894" snpClass="snp" snpType="notwithdrawn" molType="genomic" genotype="true" bitField="050028000005130500030100" taxId="9606">
        <Het type="est" value="0.05" stdError="0.1547"/>
        <Validation byCluster="true" byOtherPop="true" byHapMap="true" by1000G="true">
            <otherPopBatchId>7179</otherPopBatchId>
        </Validation>
        <Create build="36" date="2000-09-19 17:02"/>
        <Update build="144" date="2015-05-07 10:52"/>
        <Sequence exemplarSs="491581208" ancestralAllele="C,C">
            <Seq5>ATAAGCAAATAACTGAAGTTTAATCAGTCTCCTCCCAGCAAGTGATATGCAACTGAGATTCCTTATGACACATCTGAACACTAGTGGATTTGCTTTGTAGTAGGAACAAGGTACATTCGCGGGATAAATGTGGCCAAGTTTTATCTGCTGCCAGGGCTTTCAAATAGGTTGACCTGACAATGGGTCACCTCTGGGACTGA</Seq5>
            <Observed>C/T</Observed>
            <Seq3>AATTAGGAAGAGCTGGTACCTAAAATGAAAGATGCCCTTAAATTTCAGATTCACAATTTTTTTTTCTTAGTATAAGCATGTCCCATGTAATATCTGGGATATACTCATACCTTTAAAAATGTGCTCATTGTTTATCTGAAATTCACATTTTAACAGGGAACCATTGTTTTGTTATTGTTTATTGTTTTGTTTCTAAATAA</Seq3>
        </Sequence>
        <Ss ssId="3931" handle="OEFNER" batchId="489" locSnpId="M3" subSnpClass="snp" orient="forward" strand="bottom" molType="genomic" buildId="36" methodClass="DHPLC" validated="by-cluster">
            <Sequence>
                <Seq5>TAATCAGTCTCCTCCCAGCAAGTGATATGCAACTGAGATTCCTTATGACACATCTGAACACTAGTGGATTTGCTTTGTAGTAGGAACAAGGTACATTCGCGGGATAAATGTGGCCAAGTTTTATCTGCTGCCAGGGCTTTCAAATAGGTTGACCTGACAATGGGTCACCTCTGGGACTGA</Seq5>
                <Observed>C/T</Observed>
                <Seq3>AATTAGGAAGAGCTGGTACCTAAAATGAAAGATGCCCTTAAATTTCAGATTCACAATTTT</Seq3>
            </Sequence>
        </Ss>
        <Ss ssId="76536062" handle="AFFY" batchId="52074" locSnpId="AFFY_6_1M_SNP_A-8397107" subSnpClass="snp" orient="forward" strand="bottom" molType="genomic" buildId="130" methodClass="hybridize" validated="by-submitter">
            <Sequence>
                <Seq5>TCACCTCTGGGACTGA</Seq5>
                <Observed>C/T</Observed>
                <Seq3>AATTAGGAAGAGCTGG</Seq3>
            </Sequence>
        </Ss>
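
The symptom described in the question (ssId printed twice, subSnpClass never printed) looks consistent with the stray space in `$_= ~/.../`. Perl seems to parse that as an assignment of the bitwise complement of a match against `$_`, not as the binding operator `=~`, so the condition is always true, `$_` is overwritten, and `$1` still holds the earlier capture. Below is a minimal sketch of one possible fix, not a definitive solution. It assumes each `<Ss ...>` start tag sits on a single line of the real file and that both attributes are double-quoted; the helper name `extract_ss_pairs` and the `@sample` lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return [ssId, subSnpClass] pairs found on the given lines.
# Assumes each <Ss ...> start tag is on one line with double-quoted attributes.
sub extract_ss_pairs {
    my @lines = @_;
    my @pairs;
    for my $line (@lines) {
        # "=~" (no space) binds the regex to $line and leaves it unchanged.
        next unless $line =~ /\bssId="([^"]*)"/;
        my $id = $1;
        # Capture the class from the same line; empty string if absent.
        my $class = $line =~ /\bsubSnpClass="([^"]*)"/ ? $1 : '';
        push @pairs, [ $id, $class ];
    }
    return @pairs;
}

# Illustrative input, shortened from the sample in the question.
my @sample = (
    '<Rs rsId="3894" snpClass="snp">',
    '<Ss ssId="3931" handle="OEFNER" subSnpClass="snp" orient="forward">',
    '<Ss ssId="76536062" handle="AFFY" subSnpClass="snp">',
);

# One tab-separated row per <Ss> element, after a header row.
print "ssId\tsubSnpClass\n";
print join("\t", @$_), "\n" for extract_ss_pairs(@sample);
```

Capturing with `"([^"]*)"` avoids the `s/"//` and `chop` cleanup, and printing both values in one statement keeps each pair on the same row. For real dbSNP files, where attributes may span lines, a proper XML parser such as XML::Twig or XML::LibXML would likely be more robust than line-based regexes.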

Replies are listed 'Best First'.

Re: Print multiple value from file
by kcott (Archbishop) on Aug 10, 2015 at 11:49 UTC

Re: Print multiple value from file
by Laurent_R (Canon) on Aug 10, 2015 at 09:54 UTC

Re: Print multiple value from file
by poj (Abbot) on Aug 10, 2015 at 09:54 UTC
by Helan_Ahmed (Initiate) on Aug 10, 2015 at 10:11 UTC
by poj (Abbot) on Aug 10, 2015 at 10:50 UTC
by Helan_Ahmed (Initiate) on Aug 10, 2015 at 11:09 UTC
by poj (Abbot) on Aug 10, 2015 at 11:35 UTC

Re: Print multiple value from file
by RichardK (Parson) on Aug 10, 2015 at 09:54 UTC

Re: Print multiple value from file
by Monk::Thomas (Friar) on Aug 10, 2015 at 13:58 UTC
by Helan_Ahmed (Initiate) on Aug 10, 2015 at 14:06 UTC
by Monk::Thomas (Friar) on Aug 10, 2015 at 14:18 UTC
by Anonymous Monk on Aug 10, 2015 at 14:22 UTC

Re: Print multiple value from file
by Helan_Ahmed (Initiate) on Aug 10, 2015 at 10:06 UTC
by Laurent_R (Canon) on Aug 10, 2015 at 10:15 UTC
by Helan_Ahmed (Initiate) on Aug 10, 2015 at 10:20 UTC