in reply to OpenOffice, XML and templates

I'm in the same boat (or will be before year's end) and I've been looking at this for a while without diving in. I've seen some successful examples of extending the package(s) with helpers like this one for working with Impress docs (I think I lifted it from some Japanese Perl hacker's blog a couple of months back):

package OpenOffice::OODoc::Document;

# Copy the page named "page$source" and append the copy,
# renamed "page$dest", to the presentation.
sub clone_page {
    my $self = shift;
    my ($source, $dest) = @_;
    my $p = $self->getElement(page_xpath($source))
        or die "no such page: $source";
    my $p2 = $p->copy();
    $p2->paste(last_child => $self->getElement('//office:presentation'));
    $self->setAttributes($p2, 'draw:name' => "page$dest");
}

# Build the XPath that selects a page by its number.
sub page_xpath {
    my ($page) = @_;
    sprintf('//draw:page[@draw:name="page%d"]', $page);
}

The lesson is that OpenOffice::OODoc (OOD) is XML::Twig underneath, so you can call XML::Twig methods directly on the OOD objects. You don't need to wrap anything from the outside; just get into the guts of the objects with what's already there. Please submit patches back to the OOD author if you add anything generally useful.
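
To show the same copy-and-paste idea with nothing but XML::Twig, here is a minimal sketch. It assumes a simplified stand-in tree (plain `presentation` and `page` elements that I made up, not the prefixed draw:page and office:presentation names of a real Impress file) and it skips OOD entirely, so treat it as an illustration rather than code to drop into the module.

```perl
use strict;
use warnings;
use XML::Twig;

# Simplified stand-in for an Impress content tree (assumption: real
# files use prefixed names such as draw:page and office:presentation).
my $xml = '<presentation><page name="page1"><title>Intro</title></page></presentation>';

my $twig = XML::Twig->new;
$twig->parse($xml);

# Locate the source page by attribute value.
my ($page) = $twig->get_xpath('//page[@name="page1"]');
die "source page not found\n" unless $page;

# Deep-copy the page, append the copy to the root, then rename it.
my $copy = $page->copy;
$copy->paste(last_child => $twig->root);
$copy->set_att(name => 'page2');

# List the page names now present in the tree.
my @names = map { $_->att('name') } $twig->root->children('page');
print join(',', @names), "\n";
```

The copy, paste and attribute steps are the same three moves clone_page makes through the OOD wrapper methods.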

A Template Toolkit (TT) approach would be quite doable, perhaps by breaking pieces out into BLOCKs and MACROs so that a given doc type shows all its possible functions and content at a glance. Any text munging can be done with TT if you come at it correctly. I like OOD though, so the approach you (and, eventually, I) take should factor in the likelihood of OOD growing and getting better as more of us move away from MSFT-dependent packages... though a TT solution is certainly an interesting idea.

Re^2: OpenOffice, XML and templates
by psini (Deacon) on Jun 30, 2009 at 08:18 UTC

    I'm afraid I don't know TT well enough, so it may be that there is a way to do it, but I can't see it.

    As I see it, the problem arises when you want to cut (or replicate) a block. Say you have the following fragment:

    <document>
      ...
      <para>paragraph #1</para>
      <para>paragraph #2</para>
      <para>paragraph #3</para>
      ...
    </document>

    If you want to programmatically cut away the second paragraph, you have to surround it with TT commands, but this breaks the XML's integrity and, worse, the result can no longer be edited in OO Writer.

    If you instead put the command inside the <para> element, you can delete the text, but you end up with an empty paragraph.

    What I would need is a template language that allows a sort of look-ahead and look-behind (perhaps look-around is the right term?), so I can say "remove this block and all the surrounding <para> tags". I don't know if TT, or another template processor, can do this.
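
    To make the "remove the surrounding <para> too" idea concrete, here is a rough sketch that does it on the parsed tree with XML::Twig instead of on the text. The `[% DROP %]` marker is purely hypothetical (it looks like TT but is not a real TT directive), and the sketch assumes the marker sits in a single text run of the paragraph, which a real ODT file may not guarantee.

    ```perl
    use strict;
    use warnings;
    use XML::Twig;

    # The fragment from above, with a hypothetical marker typed into
    # the text of the paragraph that should disappear.
    my $xml = '<document><para>paragraph #1</para>'
            . '<para>[% DROP %]paragraph #2</para>'
            . '<para>paragraph #3</para></document>';

    my $twig = XML::Twig->new;
    $twig->parse($xml);

    # Delete the whole element, not just its text, whenever the
    # marker appears inside it.
    for my $para ($twig->root->children('para')) {
        $para->delete if $para->text =~ /\[% DROP %\]/;
    }

    # Collect what is left to check that no empty paragraph remains.
    my @left = map { $_->text } $twig->root->children('para');
    print join(' | ', @left), "\n";
    ```

    Because the marker is ordinary text, it could be typed in the word processor, and the document stays well-formed before and after processing.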

    Rule One: "Do not act incautiously when confronting a little bald wrinkly smiling man."

      Maybe you want Petal then - it embeds its templating language into the XML making up the document, so you can eliminate whole elements and their children. I never found it too pleasing, as it's only suitable for well-formed XML documents, but that might be a plus in your situation. The theoretical advantage is that you can edit the "sample" content within the templates and OOo will still output the attributes that make up the (Pe)Tal language. I haven't tried this in practice though.

      Back when I had to do templates of Word documents, I channelled most data through Microsoft Office Document Properties, but the templates didn't need fancy tables with a variable number of rows.

      I assume you've looked at using LaTeX to produce your output already - it's quite powerful but I'm not aware of whether the WYSIWYG editors have improved, as I'm content with the plain text editing.

        Yes, Petal is another possibility. The only problem is that it embeds the commands in the tags' attributes, so the document is still well-formed XML, but I can't edit the commands from within the OO editor. :(

        This requires the person creating the template to open the ODT file, extract the XML part, edit it and repack it... This procedure is certainly possible, but it requires an operator much more skilled than the ones my project is designed for. This is also why I decided to avoid LaTeX, which would be the obvious choice if only I could redesign the brains of my users.

Re^2: OpenOffice, XML and templates
by psini (Deacon) on Jul 05, 2009 at 22:39 UTC

    I found nothing adequate to my needs, so I started working on it.

    The main idea is a module that uses OpenOffice::OODoc::Document to manage the ODT file, XML::Twig to manipulate the content, a data block structured as a hash of arrays of hashes (HoAoH), and a minimal scripting language to describe the actions to be taken by the parser.
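
    As a rough illustration of how a HoAoH data block could drive the content, here is a small sketch using XML::Twig alone. Everything about the template side is an assumption made up for the example: the `block` attribute, the `{name}` placeholder syntax and the `items` key are hypothetical and are not the scripting language from the draft.

    ```perl
    use strict;
    use warnings;
    use XML::Twig;

    # Hash of arrays of hashes: block name => list of records.
    my %data = (
        items => [
            { name => 'apples', qty => 3 },
            { name => 'pears',  qty => 5 },
        ],
    );

    # Hypothetical template: the "block" attribute names the record list,
    # and {field} placeholders name the fields of each record.
    my $twig = XML::Twig->new;
    $twig->parse('<document><para block="items">{name}: {qty}</para></document>');

    for my $tmpl ($twig->root->children('para')) {
        my $block = $tmpl->att('block') or next;
        for my $rec (@{ $data{$block} || [] }) {
            # One filled-in copy of the template paragraph per record.
            my $copy = $tmpl->copy;
            $copy->del_att('block');
            (my $text = $tmpl->text)
                =~ s/\{(\w+)\}/exists $rec->{$1} ? $rec->{$1} : ''/ge;
            $copy->set_text($text);
            $copy->paste(before => $tmpl);
        }
        # The template paragraph itself is dropped once it has been expanded.
        $tmpl->delete;
    }

    my @out = map { $_->text } $twig->root->children('para');
    print "$_\n" for @out;
    ```

    An empty or missing record list simply removes the template paragraph, which covers the "cut" case with the same mechanism as the "replicate" case.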

    I just wrote a draft describing the language to implement and uploaded it to psini's scratchpad. If you (or anyone) are interested in this project, any suggestion or criticism is welcome: I designed it around my specific needs at present, so I may have forgotten some possible uses, and the structure of the language may be too simple or too complex. Last but not least, I know that my English is awful, so any grammatical or stylistic correction is welcome too.
