Word reads its own HTML very well
That is a very good thought. As horrid as Word HTML is to the naked eye, an HTML parser should let you whip through it with ease, editing the text but leaving the pukey markup and formatting untouched. Then, as you say, let Word convert its own excreta back into native format. That conversion is essentially just padding every real character with huge numbers of null bytes: 'Hello World!' as a text file is 13 bytes, but in .DOC format it needs a mere 19,456 :-)
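To make the "edit the text, leave the markup" idea concrete, here is a rough sketch. It is written in Python with the standard-library html.parser purely as an illustration; the same pass-through approach should carry over to a Perl HTML parser. The names TextEditor and edit_text are made up for this example, and the sketch makes no attempt to cope with every oddity Word emits (conditional sections and the like).

```python
from html.parser import HTMLParser


class TextEditor(HTMLParser):
    """Re-emit markup as-is and run only text nodes through edit().

    Hypothetical helper for illustration, not part of any library.
    """

    def __init__(self, edit):
        # convert_charrefs=False keeps entities such as &amp; out of
        # handle_data so they can be written back unchanged.
        super().__init__(convert_charrefs=False)
        self.edit = edit
        self.out = []
        self.in_raw = False  # True while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the tag exactly as it was written.
        self.out.append(self.get_starttag_text())
        if tag in ("style", "script"):
            self.in_raw = True

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        # Note: the parser lowercases tag names, so </P> comes back as </p>.
        self.out.append(f"</{tag}>")
        if tag in ("style", "script"):
            self.in_raw = False

    def handle_data(self, data):
        # Stylesheets and scripts are not prose, so leave them alone.
        self.out.append(data if self.in_raw else self.edit(data))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def edit_text(html, edit):
    """Return html with edit() applied to the text between the tags."""
    parser = TextEditor(edit)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    sample = '<p class=MsoNormal><span style="color:red">teh cat</span></p>'
    print(edit_text(sample, lambda text: text.replace("teh", "the")))
```

The point of the design is that every handler except handle_data just writes back what it was given, so whatever formatting Word produced goes back to Word untouched and only the words change.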
In reply to Re^2: A copyeditor needs help to get started with a Perl project
by tachyon
in thread A copyeditor needs help to get started with a Perl project
by wordsmith