in reply to How to avoid an alphabet and integer next to it in a string?

G'day piscean,

Welcome to the monastery.

You can do that like this. The tests include one- and two-letter symbols, with and without numbers. Hydrogen is deliberately left out of %weight, so it contributes nothing to the total. I'll leave you to replace my rough atomic weights with more precise ones.

    #!/usr/bin/env perl -l

    use strict;
    use warnings;

    my %weight = (C => 12, O => 16, Cl => 35.5);

    my %tests = (
        phenol     => ['C6H5OH', 6 * 12 + 1 * 16],
        chloroform => ['CHCl3', 1 * 12 + 3 * 35.5],
    );

    for my $compound (keys %tests) {
        print '-' x 40;
        print "Compound: $compound";

        my $formula = $tests{$compound}[0];
        print "Formula: $formula";

        my $calculated = 0;

        # Match each element symbol with its optional count; symbols that
        # are not in %weight (such as H) are skipped.
        $formula =~ s{([A-Z][a-z]?)(\d*)}{
            exists $weight{$1} and $calculated += $weight{$1} * ($2 || 1)
        }eg;

        print "Expected: $tests{$compound}[1]";
        print "Calculated: $calculated";
    }

Output:

    ----------------------------------------
    Compound: phenol
    Formula: C6H5OH
    Expected: 88
    Calculated: 88
    ----------------------------------------
    Compound: chloroform
    Formula: CHCl3
    Expected: 118.5
    Calculated: 118.5
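
The substitution above is used only for its side effect, and it rewrites $formula as it goes. If you need the formula string left intact, the same idea can be sketched with a while loop over a /g match. This is only a minimal sketch under the same rough weights; the sub name mass_without_unlisted is made up for illustration and is not from any module.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Rough atomic weights, as in the script above; H is deliberately absent.
my %weight = (C => 12, O => 16, Cl => 35.5);

# Hypothetical helper (name chosen for illustration only): sums the weights
# of the elements found in %weight and leaves the formula string untouched.
sub mass_without_unlisted {
    my ($formula) = @_;
    my $total = 0;

    # In scalar context, /g resumes after the previous match on each pass.
    while ($formula =~ /([A-Z][a-z]?)(\d*)/g) {
        my ($symbol, $count) = ($1, $2);
        next unless exists $weight{$symbol};    # e.g. H is skipped
        $total += $weight{$symbol} * ($count eq '' ? 1 : $count);
    }

    return $total;
}

print mass_without_unlisted('C6H5OH'), "\n";    # prints 88
print mass_without_unlisted('CHCl3'),  "\n";    # prints 118.5
```

Testing the count against the empty string, rather than using ($2 || 1), is a small design choice: it keeps an explicit count of 0 from being silently treated as 1.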

-- Ken

Re^2: How to avoid an alphabet and integer next to it in a string?
by piscean (Acolyte) on Mar 21, 2014 at 20:46 UTC
    Thanks Ken! That was really helpful :) I've used the Chemistry::MolecularMass module to get precise atomic weights.

      I haven't used Chemistry::MolecularMass previously (in fact, I wasn't aware of its existence until now); however, looking at its documentation, it would appear another (completely untested) solution would be:

          use Chemistry::MolecularMass;

          my $mm = Chemistry::MolecularMass::->new();
          $mm->replace_elements(H => 0);
          my $no_H_mass = $mm->calc_mass($your_formula);

      -- Ken