<?xml version="1.0"?>
<sections>
<build>
<field name="build time"></field>
<!--
This is a non-delimited section of text that will be parsed ...
-->
<fields type="section of text 1">
<!--
A token-delimited section of text that will be parsed into this area...
-->
</fields>
</build>
<run>
<errors>
<error time="some_integer" some_attr="more helpful info about error gets put here" />
<!--
... etc
-->
</errors>
<mismatches>
<mismatch time="another_integer" some_attr="more helpful info about mismatch goes here" />
<!--
... etc
-->
</mismatches>
<perf_stats>
<stat type="performance item name">value</stat>
<!--
stats about speed, test time, etc go here ...
-->
</perf_stats>
</run>
</sections>
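For illustration, here is a minimal sketch (Python, xml.etree.ElementTree) of how a parsed flat log could be emitted in the structure above. The element and attribute names follow the schema; build_result_doc, its arguments, and the sample values are hypothetical placeholders, not part of the actual parser.

import xml.etree.ElementTree as ET

def build_result_doc(build_fields, errors, mismatches, perf_stats):
    root = ET.Element("sections")

    # <build> holds per-build fields such as "build time".
    build = ET.SubElement(root, "build")
    for name, text in build_fields.items():
        field = ET.SubElement(build, "field", name=name)
        field.text = text

    # <run> holds the errors, mismatches, and performance stats from one run.
    run = ET.SubElement(root, "run")
    errs = ET.SubElement(run, "errors")
    for t, info in errors:
        ET.SubElement(errs, "error", time=str(t), some_attr=info)

    mis = ET.SubElement(run, "mismatches")
    for t, info in mismatches:
        ET.SubElement(mis, "mismatch", time=str(t), some_attr=info)

    perf = ET.SubElement(run, "perf_stats")
    for name, value in perf_stats.items():
        stat = ET.SubElement(perf, "stat", type=name)
        stat.text = str(value)

    return ET.ElementTree(root)

# Hypothetical usage:
#   doc = build_result_doc({"build time": "42s"}, [(3, "bad opcode")], [], {"run time": "1.7s"})
#   doc.write("result.xml")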
Extra notes (about flat logs):
- The flat logs being parsed range from roughly 400 to 100k lines.
- The current system compares sections of text against a 'gold log' (a known-good existing output), using either specialized checking of a subset of fields or a straight diff(1). With a large number of tests this breaks down: the number of logs is ( 1000 tests * (1-2 logs) * (1-3 test sets) ) => roughly 1000 to 6000 logs. So whenever the file format changes (e.g. a new feature is introduced to the toolchain), all affected logs change too, and bringing the 'gold logs' up to date consumes a lot of unnecessary time. This is going to keep happening in the future.
Thus, by moving to a more structured data store, I can get away from the flat file formats and move to a content-based comparison system.
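As a rough sketch of what that content-based comparison could look like, the snippet below compares only the parsed error and mismatch records of two result files instead of diff(1)'ing whole logs. The element paths follow the schema above; compare_runs, the tuple layout, and the file names are hypothetical.

import xml.etree.ElementTree as ET

def records(tree, path, attr="some_attr"):
    # Collect (time, info) pairs for errors or mismatches, order-independent.
    return {(e.get("time"), e.get(attr)) for e in tree.findall(path)}

def compare_runs(gold_file, new_file):
    gold = ET.parse(gold_file)
    new = ET.parse(new_file)
    problems = []
    for path in ("run/errors/error", "run/mismatches/mismatch"):
        missing = records(gold, path) - records(new, path)
        extra = records(new, path) - records(gold, path)
        if missing or extra:
            problems.append((path, missing, extra))
    return problems  # empty list => runs match on content, regardless of text layout

# Hypothetical usage:
#   for path, missing, extra in compare_runs("gold.xml", "latest.xml"):
#       print(path, "missing:", missing, "unexpected:", extra)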