Or without the while loop (the file chem.data holds the data specified in the OP):
>perl -wMstrict -ne "print qq{$_\t}, join(qq{\t}, /([A-Z][a-z]?(\d*))(?=.*(\n?))/g);" chem.data
CH4N2O
C
H4 4
N2 2
O
C9H12N2O6
C9 9
H12 12
N2 2
O6 6
C5H11NO2
C5 5
H11 11
N
O2 2
C5H4N4O2
C5 5
H4 4
N4 4
O2 2
C10H11N4O9P
C10 10
H11 11
N4 4
O9 9
P
C10H12N4O6
C10 10
H12 12
N4 4
O6 6
C5H10O5
C5 5
H10 10
O5 5
C5H12O5
C5 5
H12 12
O5 5
C5H10O5
C5 5
H10 10
O5 5
C27H44O
C27 27
H44 44
O
C1694H2993O101
C1694 1694
H2993 2993
O101 101
Note, however, that this approach:
- produces a superfluous tab immediately before the newline in each 'individual chemical component' output line;
- produces oddball output for the last record (i.e., the last line) in the data file if it is not newline-terminated.
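If those two caveats matter, here is a minimal sketch of one way around them. It gives up the one-liner's brevity (and goes back to a loop), and the names components and format_record are hypothetical helpers invented for this illustration, not anything from the OP. The idea is simply to chomp each record up front and to join only the non-empty fields, so no tab is emitted before the newline and a missing final newline no longer matters. The two hard-coded records stand in for lines read from chem.data.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration): split a formula
# such as "CH4N2O" into [term, count] pairs, e.g. ["H4", "4"].
# The count is the empty string when the element has no digits.
sub components {
    my ($formula) = @_;
    my @parts;
    while ($formula =~ /([A-Z][a-z]?(\d*))/g) {
        push @parts, [$1, $2];
    }
    return @parts;
}

# Hypothetical helper: render one record as the formula line followed
# by one line per component.
sub format_record {
    my ($line) = @_;
    chomp $line;    # harmless if the record has no trailing newline
    my $out = "$line\n";
    for my $part (components($line)) {
        # Keep only non-empty fields, so no tab precedes the newline.
        $out .= join("\t", grep { length } @$part) . "\n";
    }
    return $out;
}

# The second record deliberately lacks a trailing newline, like an
# unterminated last line of a data file.
print format_record($_) for "CH4N2O\n", "C5H12O5";
```

To run it against the real file, the last statement could be replaced with a loop over the input lines that prints format_record for each one.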