Re: How to avoid an alphabet and integer next to it in a string?
by VincentK (Beadle) on Mar 21, 2014 at 21:01 UTC
Hi piscean.
I don't know much about chemistry, but I think I found a library that will help you.
Chemistry::File::Formula - Molecular formula reader/formatter
http://search.cpan.org/~itub/Chemistry-Mol-0.37/File/Formula.pm
I plugged the library into a basic script and it seems to parse the elements out in the way you need.
From here you should be able to walk each key/value pair in the formula hash and base your calculation on the elements you need.
I hope this helps.
use strict;
use warnings;
use Data::Dumper;
use Chemistry::File::Formula;

# Parse each formula in the DATA section into an element => count hash.
while (<DATA>) {
    chomp;
    my %formula = Chemistry::File::Formula->parse_formula($_);
    print "-" x 16, "\n";
    print Dumper \%formula;
    print "-" x 16, "\n";
}
__DATA__
C6H5OH
C6H9
Hg
Output:
C:\monks\calc_mass>perl calc_molecularmass.pl
----------------
$VAR1 = {
'H' => 6,
'O' => 1,
'C' => 6
};
----------------
----------------
$VAR1 = {
'H' => 9,
'C' => 6
};
----------------
----------------
$VAR1 = {
'Hg' => 1
};
----------------
C:\monks\calc_mass>
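As a rough sketch of that next step, the hash can be walked while skipping the elements you want to leave out. The %formula literal below stands in for the result parsed from C6H5OH above, so the sketch runs without the module. The %weight table (rounded atomic weights) and the mass_without() helper are hypothetical names for illustration, not part of Chemistry::File::Formula.

```perl
use strict;
use warnings;

# Rounded atomic weights; an illustrative table, not taken from any module.
my %weight = (C => 12.011, H => 1.008, O => 15.999, Hg => 200.59);

# Hypothetical helper: sum weight * count over a formula hash,
# skipping any element symbols listed in @skip.
sub mass_without {
    my ($formula, @skip) = @_;
    my %skip = map { $_ => 1 } @skip;
    my $mass = 0;
    for my $element (keys %$formula) {
        next if $skip{$element};
        die "No weight for element '$element'\n"
            unless exists $weight{$element};
        $mass += $weight{$element} * $formula->{$element};
    }
    return $mass;
}

# Stand-in for the hash parsed from C6H5OH in the output above.
my %formula = (C => 6, H => 6, O => 1);
printf "%.3f\n", mass_without(\%formula, 'H');
```

Passing the skip list as arguments keeps the weight table complete, so the same helper can also give the full mass when nothing is skipped.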
Re: How to avoid an alphabet and integer next to it in a string?
by kcott (Archbishop) on Mar 21, 2014 at 20:01 UTC
G'day piscean,
Welcome to the monastery.
You can do that like this.
The tests include one- and two-letter symbols with and without numbers.
I'll leave you to replace my rough atomic weights with more precise ones.
#!/usr/bin/env perl -l
use strict;
use warnings;

# Rough atomic weights; H is deliberately absent so it contributes nothing.
my %weight = (C => 12, O => 16, Cl => 35.5);

# Each test: formula, then the expected mass with hydrogen ignored.
my %tests = (
    phenol     => ['C6H5OH', 6 * 12 + 1 * 16],
    chloroform => ['CHCl3',  1 * 12 + 3 * 35.5],
);

for my $compound (keys %tests) {
    print '-' x 40;
    print "Compound: $compound";
    my $formula = $tests{$compound}[0];
    print "Formula: $formula";
    my $calculated = 0;
    # Match an element symbol and optional count; add its weight if known.
    $formula =~ s{([A-Z][a-z]?)(\d*)}{
        exists $weight{$1} and $calculated += $weight{$1} * ($2 || 1)
    }eg;
    print "Expected: $tests{$compound}[1]";
    print "Calculated: $calculated";
}
Output:
----------------------------------------
Compound: phenol
Formula: C6H5OH
Expected: 88
Calculated: 88
----------------------------------------
Compound: chloroform
Formula: CHCl3
Expected: 118.5
Calculated: 118.5
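The substitution above rewrites $formula as a side effect of the s///e. If the formula string is still needed afterwards, the same pattern can be driven by a global match in a while loop instead. This is only a sketch of that variant, using the same rough weights; rough_mass() is a hypothetical helper name. One difference to be aware of: an explicit count of 0 is taken literally here, whereas ($2 || 1) would treat it as 1.

```perl
use strict;
use warnings;

# Same rough weights as above; H is absent so it adds nothing.
my %weight = (C => 12, O => 16, Cl => 35.5);

# Hypothetical helper: walk the formula with a global match instead of
# s///e, so the formula string itself is left unmodified.
sub rough_mass {
    my ($formula) = @_;
    my $mass = 0;
    while ($formula =~ /([A-Z][a-z]?)(\d*)/g) {
        my ($symbol, $count) = ($1, $2 eq '' ? 1 : $2);
        next unless exists $weight{$symbol};
        $mass += $weight{$symbol} * $count;
    }
    return $mass;
}

print rough_mass('C6H5OH'), "\n";   # 6 * 12 + 1 * 16
print rough_mass('CHCl3'), "\n";    # 1 * 12 + 3 * 35.5
```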
Thanks Ken! It was really helpful :) I've used the Chemistry::MolecularMass module to get precise atomic weights.
I haven't used Chemistry::MolecularMass previously (in fact, I wasn't aware of its existence until now); however, looking at its documentation, it would appear another (completely untested) solution would be:
use Chemistry::MolecularMass;
my $mm = Chemistry::MolecularMass::->new();
$mm->replace_elements(H => 0);
my $no_H_mass = $mm->calc_mass($your_formula);
Re: How to avoid an alphabet and integer next to it in a string?
by AnomalousMonk (Archbishop) on Mar 21, 2014 at 20:09 UTC
c:\@Work\Perl\monks>perl -wMstrict -le
"my $Hn = qr{ H (?! [[:lower:]]) \d* }xms;
my $not_Hn = qr{ (?! $Hn) }xms;
;;
use constant FORMULA => 'HC6H5OHHg2HeBr3H';
;;
my $s = FORMULA;
print qq{'$s'};
$s =~ s{ $Hn }''xmsg;
print qq{'$s'};
;;
$s = FORMULA;
my @elements = $s =~ m{ $not_Hn [[:upper:]] [[:lower:]]? \d* }xmsg;
printf qq{'$_' } for @elements;
"
'HC6H5OHHg2HeBr3H'
'C6OHg2HeBr3'
'C6' 'O' 'Hg2' 'He' 'Br3'
Thanks! This helped too :)
Re: How to avoid an alphabet and integer next to it in a string?
by hippo (Archbishop) on Mar 21, 2014 at 17:53 UTC
"I tried doing that, but I failed."
What did you try? How did it fail?
It's hard to tell, but is this all you are looking for?
my $formula = 'C6H5OH';
$formula =~ s/H\d//g;
print "$formula\n";
Update: As runrig suggests below, the more general s/H\d*//g; may be more appropriate to your needs.
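One caveat worth checking before relying on either substitution: both match any capital H, so they also strike the H of two-letter symbols such as Hg or He. A negative lookahead for a lowercase letter, the approach AnomalousMonk uses elsewhere in this thread, avoids that. The sketch below compares the two; strip_hydrogen() is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: remove hydrogen and its count, but leave two-letter
# symbols that merely start with H (Hg, He, ...) alone.
sub strip_hydrogen {
    my ($formula) = @_;
    $formula =~ s/H(?![a-z])\d*//g;
    return $formula;
}

for my $formula (qw(C6H5OH C6H9 HgCl2)) {
    # The plain pattern, applied to a copy for comparison.
    (my $naive = $formula) =~ s/H\d*//g;
    printf "%-8s naive: %-6s lookahead: %s\n",
        $formula, $naive, strip_hydrogen($formula);
}
```

For HgCl2 the plain pattern leaves gCl2, while the lookahead version returns the formula unchanged.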
Yes, this is what I was looking for. Thanks!
I tried to drop the H from C6H9, but it ended up calculating C69, which gave me a wrong result. Of course, I was foolish enough to try this.
my $molform = <STDIN>;
$molform =~ s/[^a-zA-G0-9]//g;
my $molmass = new Chemistry::MolecularMass;
my $mass = $molmass->calc_mass("$molform");
Oops! This is giving me wrong output too. Hope the code I posted below gives an idea of what I wanted.
| [reply] |
$formula =~ s/H\d*//g;