(Note: I realize that you may have no control over the project requirements. I also realize that the example you gave may be a toy example. But I feel I must comment anyway...)
As I see it, XML is bloated and ugly. However, it's useful because it lets you make your data descriptive and easier to parse and use in new ways. So I suggest that you change your schema, if possible. I don't really see how
<datafield tag="702" ind1="" ind2="">
  <subfield code="a">Thomson, Bryden</subfield>
  <subfield code="b">1928-1991</subfield>
  <subfield code="c">Conductor</subfield>
</datafield>
is any more descriptive than the original file. I feel you would be better served by giving descriptive tags to your data. Perhaps something like:
<conductor>
  <Name>
    <Last>Thomson</Last>
    <First>Bryden</First>
  </Name>
  <Born>1928</Born>
  <Died>1991</Died>
</conductor>
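If you're stuck with the coded layout as an intermediate step, the reshaping suggested above could be sketched roughly like this. This is only an illustration in Python using the standard xml.etree.ElementTree module. The function name describe is made up, and the meanings of the subfield codes (a = "Last, First", b = "born-died", c = role) are my guesses from the single sample record, not anything taken from your spec.

```python
import xml.etree.ElementTree as ET

# The sample record from the post, in its coded form.
CODED_XML = """<datafield tag="702" ind1="" ind2="">
  <subfield code="a">Thomson, Bryden</subfield>
  <subfield code="b">1928-1991</subfield>
  <subfield code="c">Conductor</subfield>
</datafield>"""


def describe(datafield_xml):
    """Turn one coded <datafield> into an element with descriptive tags.

    ASSUMPTION: code "a" is "Last, First", code "b" is "born-died",
    and code "c" is a one-word role usable as a tag name.
    """
    src = ET.fromstring(datafield_xml)
    # Map each subfield code to its (trimmed) text.
    fields = {
        sf.get("code"): (sf.text or "").strip()
        for sf in src.findall("subfield")
    }

    last, _, first = fields.get("a", "").partition(",")
    born, _, died = fields.get("b", "").partition("-")
    # Fall back to a generic tag when no role is given.
    role = fields.get("c", "").lower() or "person"

    out = ET.Element(role)
    name = ET.SubElement(out, "Name")
    ET.SubElement(name, "Last").text = last.strip()
    ET.SubElement(name, "First").text = first.strip()
    ET.SubElement(out, "Born").text = born.strip()
    ET.SubElement(out, "Died").text = died.strip()
    return out


result = ET.tostring(describe(CODED_XML), encoding="unicode")
print(result)
```

A role containing spaces or punctuation would not be a legal tag name, so real data would need a lookup table from role text to tag, and records missing a date or a first name would need their own handling.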
In my job, I *frequently* have to reverse engineer file formats, and I would much rather reverse engineer the original file format than the XML version, unless the tags were meaningful. Without meaningful field names, the markup is just visual clutter that makes patterns in the data harder to spot.
Just my $0.02.
...roboticus
In reply to "Re: Converting text to XML; Millions of records" by roboticus, in thread "Converting text to XML; Millions of records" by MikeEndo