This is a refined version of my previous question, which people told me was ambiguous. I have reworded it in the hope of getting a correct answer.
My problem: parse a chemical structure file and load it into a MySQL database using CGI.
The format of the chemical file is as follows:
---------- data ----------
$$$$ - delimiter for a single chemical record
---------- data ----------
$$$$ - delimiter for a single chemical record
---------- data ----------
$$$$
My full file has hundreds of such structures. Below is an excerpt of the file with real data (for 3 chemicals):
(+)-catechin
SMI2MOL
21 23 0 0 0 0 0 0 0 0999 V2000
0.0000 0.0000 0.0000 O 0 0 0 0 0 0 0 0
0.0000 0.0000 0.0000 C 0 0 1 0 0 0 0 0
0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
2 13 1 0 0 0 0
19 21 1 0 0 0 0
M END
> <$NAM>
(+)-catechin
> <Formula>
C15H14O6
> <MolWeight>
290.26806
> <ChemBankID>
1254
> <CompoundName>
(+)-catechin
> <Calbiochem Catalog>
219250
> <MicroSource Catalog>
210205
$$$$
(+)-himbacine
SMI2MOL
25 28 0 0 0 0 0 0 0 0999 V2000
0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0
0.0000 0.0000 0.0000 C 0 0 2 0 0 0 0 0
0.0000 0.0000 0.0000 C 0 0 1 0 0 0 0 0
1 2 1 0 0 0 0
2 25 1 0 0 0 0
14 25 1 0 0 0 0
M END
> <$NAM>
(+)-himbacine
> <Formula>
C22H35NO2
> <MolWeight>
345.51884
> <ChemBankID>
1861
> <CompoundName>
(+)-himbacine
> <Calbiochem Catalog>
377200
$$$$
(+)-methamphetamine
SMI2MOL
11 11 0 0 0 0 0 0 0 0999 V2000
0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0
0.0000 0.0000 0.0000 N 0 0 0 0 0 0 0 0
0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0
1 2 1 0 0 0 0
2 3 1 0 0 0 0
10 11 1 0 0 0 0
M END
> <$NAM>
(+)-methamphetamine
> <Formula>
C10H15N
> <MolWeight>
149.23284
> <ChemBankID>
1568
> <CompoundName>
(+)-methamphetamine
> <MicroSource Catalog>
1900033
$$$$
#I TRUNCATED THE FILE HERE LIMITING TO 3 CHEMICALS#####
Task 1: First, I need to print the data from the above file in the following format:
the formula of chemical one is : C15H14O6
the MolWeight of chemical one is : 290.26806
the ChemBankID of chemical one is : 1254
the formula of chemical two is : C22H35NO2
the MolWeight of chemical two is : 345.51884
the ChemBankID of chemical two is : 1861
Note: in most cases the value (formula, MolWeight, etc.) sits on the line directly below its tag line:
> <something....>
some data....
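For illustration, task 1 could be sketched roughly as follows. This is a minimal sketch in Python, not tested against the full file; the same record-at-a-time idea carries over to other languages. It assumes every record ends with a line starting with `$$$$` and that each value is on the single line right after its `> <Tag>` line. The names `iter_records`, `report_lines`, `FIELDS` and `SAMPLE` are hypothetical, and the chemicals are numbered 1, 2, 3 rather than spelled out as "one", "two".

```python
import io
import re

# Matches a data header line such as "> <Formula>" and captures the tag name.
TAG_RE = re.compile(r"^>\s*<([^>]+)>")

# (tag in the file, label used in the printed line)
FIELDS = [("Formula", "formula"),
          ("MolWeight", "MolWeight"),
          ("ChemBankID", "ChemBankID")]

def iter_records(lines):
    """Yield one dict (tag -> value) per record, reading line by line."""
    record = {}
    pending_tag = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("$$$$"):      # end of the current record
            if record:
                yield record
            record, pending_tag = {}, None
            continue
        match = TAG_RE.match(line)
        if match:
            pending_tag = match.group(1)  # the value is expected on the next line
        elif pending_tag is not None:
            record[pending_tag] = line
            pending_tag = None
    if record:                            # last record without a trailing $$$$
        yield record

def report_lines(lines):
    """Yield the report lines for each record, one record at a time."""
    for number, record in enumerate(iter_records(lines), start=1):
        for tag, label in FIELDS:
            value = record.get(tag, "")
            yield f"the {label} of chemical {number} is : {value}"

# Shortened sample (atom and bond lines left out) to exercise the parser.
SAMPLE = """(+)-catechin
SMI2MOL
M END
> <$NAM>
(+)-catechin
> <Formula>
C15H14O6
> <MolWeight>
290.26806
> <ChemBankID>
1254
$$$$
(+)-himbacine
SMI2MOL
M END
> <Formula>
C22H35NO2
> <MolWeight>
345.51884
> <ChemBankID>
1861
$$$$
"""

if __name__ == "__main__":
    for out in report_lines(io.StringIO(SAMPLE)):
        print(out)
```

For a real file, an open file handle can be passed in place of the `io.StringIO` object. Because both functions are generators, only one record is held in memory at a time, whatever the file size.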
Task 2: I want to load the formula, MolWeight, and ChemBankID into the MySQL database using CGI.
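The database side of task 2 might look something like the sketch below, written in Python against the generic DB-API. The standard-library `sqlite3` module stands in for MySQL only so that the example runs without a server; this is a swapped-in database, not the real target. MySQL drivers follow the same cursor and `executemany` pattern but use `%s` as the placeholder instead of `?`. The table name `compounds`, its column names, and the function `load_compounds` are all hypothetical, and the CGI wrapper is left out.

```python
import sqlite3

# Values taken from the three sample chemicals: (formula, MolWeight, ChemBankID).
ROWS = [
    ("C15H14O6", 290.26806, 1254),
    ("C22H35NO2", 345.51884, 1861),
    ("C10H15N", 149.23284, 1568),
]

def load_compounds(conn, rows, batch_size=1000, placeholder="?"):
    """Insert rows in batches so the whole file never has to sit in memory.

    `rows` may be any iterable, including a generator fed by a parser.
    Returns the number of rows inserted.
    """
    marks = ", ".join([placeholder] * 3)
    # Hypothetical table and column names.
    sql = ("INSERT INTO compounds (formula, mol_weight, chembank_id) "
           f"VALUES ({marks})")
    cur = conn.cursor()
    batch = []
    total = 0
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            cur.executemany(sql, batch)   # parameterized, so values are escaped
            conn.commit()
            total += len(batch)
            batch = []
    if batch:                             # flush the final partial batch
        cur.executemany(sql, batch)
        conn.commit()
        total += len(batch)
    return total

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")    # stand-in for the MySQL connection
    conn.execute("CREATE TABLE compounds "
                 "(formula TEXT, mol_weight REAL, chembank_id INTEGER)")
    print(load_compounds(conn, iter(ROWS), batch_size=2))
```

Committing once per batch rather than once per row keeps the number of round trips down, and the bounded `batch` list means memory use does not grow with the size of the input.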
That's it. I tried reading the file into arrays, but ran into problems once the full file size exceeded several hundred MB, because it used a lot of memory. I had no luck with regexes either.
So please help me in this regard. I would be happy if you could offer full source code that has been tested against this data.
Thanks in advance...
Edited by Chady -- added paragraph tags, code tags, and a readmore tag
In reply to "parsing file using metacharacters -new" by myraja, in thread "parsing using metacharacters" by myraja