melutovich has asked for the wisdom of the Perl Monks concerning the following question:
We have a system that generates ODT files from a template plus a client's responses to many questions.
At various locations in the generated ODT file, line breaks get inserted, but they make the resulting document look poorly formatted.
I've been tasked with using Perl to search the ODT content for these line breaks and convert each one into a close of the parent tag followed by a new parent tag of the same type and style.
Originally I tried to just use a few regexes to split the content.xml (extracted using Archive::Zip). That worked on a few ODT files but is failing on more complicated XML.
I was replacing each <text:line-break/> with </text:p><text:p text:style-name="XXX">. However, my solution fails when it encounters <text:span text:style-name="T71"><text:s/><text:line-break/></text:span>, which became <text:span text:style-name="T71"><text:s/></text:p><text:p text:style-name="P104"></text:span>: the </text:p><text:p ...> is inserted before the </text:span> has been closed, so the result is no longer well-formed.
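To make the failure concrete, here is a minimal sketch that reproduces it using core Perl only. The sample fragment (including the "before" and "after" text) is made up for illustration and is modeled on the case above. `first_mismatch` is a hypothetical helper that does a crude tag-stack check; it is not a real XML parser and is only meant to show where the nesting breaks.

```perl
use strict;
use warnings;

# Made-up fragment modeled on the failing case described above.
my $xml = '<text:p text:style-name="P104">before'
        . '<text:span text:style-name="T71"><text:s/><text:line-break/></text:span>'
        . 'after</text:p>';

# The naive approach: swap every line break for "close paragraph, open paragraph".
(my $naive = $xml) =~ s{<text:line-break/>}{</text:p><text:p text:style-name="P104">}g;

# Crude nesting check (hypothetical helper): push open tags, pop on close tags,
# and report the first close tag that does not match the innermost open tag.
sub first_mismatch {
    my ($s) = @_;
    my @open;
    while ($s =~ m{<(/?)([\w:.-]+)[^<>]*?(/?)>}g) {
        my ($closing, $name, $self_closing) = ($1, $2, $3);
        next if $self_closing;               # e.g. <text:s/>
        if ($closing) {
            my $expected = pop @open;
            if (!defined($expected) || $expected ne $name) {
                return 'expected </' . ($expected // '(nothing)') . "> but found </$name>";
            }
        }
        else {
            push @open, $name;
        }
    }
    return @open ? "unclosed <$open[-1]>" : undef;
}

print "naive result: $naive\n";
print "original:     ", (first_mismatch($xml)   // 'nesting looks fine'), "\n";
print "after regex:  ", (first_mismatch($naive) // 'nesting looks fine'), "\n";
```

The substitution has no notion of which elements are open at the point of the line break, which is why it cannot know that the span has to be closed first and reopened in the new paragraph.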
A regex-based solution is probably too simple to handle the complex XML that might exist, so I will probably need a module that understands ODT and/or XML and lets me make this change from Perl code.
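For illustration, the kind of tree-aware split I have in mind might look like the sketch below. It assumes XML::LibXML, which is my own pick and may not be the best fit; `split_paragraphs_at_line_breaks` is a hypothetical name, and the sample document is made up. The idea is to walk up from each line break to its enclosing text:p, and at every level move the following siblings into a shallow clone of the ancestor, so that spans are closed and reopened in the right order.

```perl
use strict;
use warnings;
use XML::LibXML;    # assumption: a CPAN module, not named anywhere in this post

# The ODF text namespace.
my $TEXT_NS = 'urn:oasis:names:tc:opendocument:xmlns:text:1.0';

# Hypothetical helper: split each text:p at every text:line-break it contains.
sub split_paragraphs_at_line_breaks {
    my ($doc) = @_;
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs(text => $TEXT_NS);

    for my $br ($xpc->findnodes('//text:p//text:line-break')) {
        my $node = $br;
        while (1) {
            my $parent = $node->parentNode;
            # Shallow clone: same element name and attributes, no children.
            my $clone = $parent->cloneNode(0);
            # Everything after the split point moves into the clone.
            while (my $sibling = $node->nextSibling) {
                $clone->appendChild($sibling);
            }
            $parent->parentNode->insertAfter($clone, $parent);
            # Stop once the enclosing paragraph itself has been split.
            last if $parent->localname eq 'p'
                 && ($parent->namespaceURI // '') eq $TEXT_NS;
            $node = $parent;
        }
        $br->unbindNode;    # drop the line break itself
    }
    return $doc;
}

# Made-up sample modeled on the failing case.
my $xml = <<"XML";
<office:text xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" xmlns:text="$TEXT_NS">
<text:p text:style-name="P104">before<text:span text:style-name="T71"><text:s/><text:line-break/></text:span>after</text:p>
</office:text>
XML

my $doc = XML::LibXML->load_xml(string => $xml);
split_paragraphs_at_line_breaks($doc);
print $doc->toString;
```

This sketch leaves an empty copy of the span at the start of the new paragraph, and the clones may serialize with repeated namespace declarations; both would probably need tidying in real use. It also says nothing about writing content.xml back into the ODT archive.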
Suggestions?
Re: fix ODT files with line breaks looking poor
by haukex (Archbishop) on Apr 07, 2019 at 15:40 UTC
by choroba (Cardinal) on Apr 07, 2019 at 20:45 UTC
by melutovich (Acolyte) on Apr 07, 2019 at 21:55 UTC
by haukex (Archbishop) on Apr 08, 2019 at 19:59 UTC

Re: fix ODT files with line breaks looking poor
by roboticus (Chancellor) on Apr 07, 2019 at 14:49 UTC
by melutovich (Acolyte) on Apr 07, 2019 at 15:20 UTC

Re: fix ODT files with line breaks looking poor
by Jenda (Abbot) on Apr 16, 2019 at 23:32 UTC
by haukex (Archbishop) on Apr 17, 2019 at 19:48 UTC
by Jenda (Abbot) on Apr 18, 2019 at 13:11 UTC