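In the old DOC file format, each paragraph ends with a carriage return (\r, 0x0D), not \n, as the hex dump further down shows. As far as I know, a manual page break is stored there as a form feed (0x0C). Newer Word documents are DOCX files, which are essentially ZIP archives containing several XML documents. One of them, word/document.xml, holds the document text itself, and a page break there is an element (<w:br w:type="page"/>) rather than a character. As an example, I created a simple document with two lines, "AAA" and "BBB". This was the content of the document.xml file: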

<w:body>
  <w:p w:rsidR="00D96BA8" w:rsidRDefault="00D96BA8">
    <w:r>
      <w:t>AAA</w:t>
    </w:r>
  </w:p>
  <w:p w:rsidR="00D96BA8" w:rsidRDefault="00D96BA8">
    <w:r>
      <w:t>BBB</w:t>
    </w:r>
  </w:p>
  <w:sectPr w:rsidR="00D96BA8" w:rsidSect="00354B3C">
    <w:pgSz w:w="12240" w:h="15840" />
    <w:pgMar w:top="1008" w:right="1008" w:bottom="1008" w:left="1008" w:header="720" w:footer="720" w:gutter="0" />
    <w:cols w:space="720" />
    <w:docGrid w:linePitch="360" />
  </w:sectPr>
</w:body>
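To make the DOCX side concrete, here is a minimal Python sketch that opens a DOCX as a ZIP archive, reads word/document.xml, and lists the paragraph texts and the number of page breaks. The function names and the sample document built in `make_sample_docx` are my own illustration (the sample adds a page break that my AAA/BBB test file did not have), and the in-memory ZIP holds only document.xml, so Word itself would not open it.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace bound to the "w:" prefix in WordprocessingML.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_document_xml(source):
    """Return the parsed root of word/document.xml.

    `source` is a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        return ET.fromstring(archive.read("word/document.xml"))


def docx_paragraphs(source):
    """Return the text of every <w:p> paragraph, in document order."""
    root = read_document_xml(source)
    return [
        "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        for p in root.iter(f"{{{W_NS}}}p")
    ]


def count_page_breaks(source):
    """Count <w:br w:type="page"/> elements, i.e. manual page breaks."""
    root = read_document_xml(source)
    return sum(
        1
        for br in root.iter(f"{{{W_NS}}}br")
        if br.get(f"{{{W_NS}}}type") == "page"
    )


def make_sample_docx():
    """Build a hypothetical in-memory ZIP containing only word/document.xml."""
    xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>AAA</w:t></w:r></w:p>"
        '<w:p><w:r><w:br w:type="page"/><w:t>BBB</w:t></w:r></w:p>'
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(docx_paragraphs(make_sample_docx()))
    print(count_page_breaks(make_sample_docx()))
```

With a real file you would pass its path instead, e.g. `docx_paragraphs("test.docx")`, where the file name is just a placeholder.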

And this was the hex dump of the DOC file, taken from somewhere in the middle. I am not going to copy the entire file here. I know, some of you are like "whew!" lol

Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
000009B0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000009C0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000009D0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000009E0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
000009F0  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000A00  41 41 41 0D 42 42 42 0D 00 00 00 00 00 00 00 00  AAA.BBB.........
00000A10  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000A20  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000A30  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000A40  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000A50  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
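The bytes at offset 00000A00 can be split on those control characters. A rough Python sketch follows, assuming the text is stored as 8-bit characters as in this dump (a DOC file can also store text as 16-bit characters, which this does not handle). The 0x0C page break value is my assumption and does not appear in the dump, and `split_doc_text` is a made-up helper name.

```python
PARAGRAPH_MARK = b"\x0d"  # carriage return, seen after AAA and BBB in the dump
PAGE_BREAK = b"\x0c"      # form feed; assumed page break value, not in the dump


def split_doc_text(raw):
    """Split 8-bit DOC text into pages, each a list of paragraph strings.

    Empty paragraphs are dropped to keep the illustration short.
    """
    pages = []
    for page in raw.split(PAGE_BREAK):
        paragraphs = [
            chunk.decode("cp1252", errors="replace")
            for chunk in page.split(PARAGRAPH_MARK)
            if chunk
        ]
        pages.append(paragraphs)
    return pages


if __name__ == "__main__":
    # The eight text bytes from offset 00000A00 in the dump.
    dump_bytes = bytes.fromhex("41 41 41 0D 42 42 42 0D")
    print(split_doc_text(dump_bytes))
```

Finding where the text run starts and ends in a real DOC file takes more work than this; the sketch only covers what to do with the bytes once you have them.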

In reply to Re^2: Can Perl generate a page break character that Microsoft Word will recognize? by harangzsolt33
in thread Can Perl generate a page break character that Microsoft Word will recognize? by Intermediate Dave
