I would assume this has something to do with Office Open XML.
http://en.wikipedia.org/wiki/Office_Open_XML
Microsoft states that there are 4 reserved characters: <, >, &, and %
(http://msdn.microsoft.com/en-us/library/ms145315%28v=sql.90%29.aspx). I can't find a W3C document that calls out these four specifically, but I deal with XML on a regular basis and escape them when sanitizing user input, and the result parses properly with XML::Twig.
Those 4 characters were probably transformed into the entities listed at the link above when the document was saved.
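
For illustration, here is a minimal sketch of that kind of escaping in Python (not the XML::Twig code mentioned above). The function name `escape_reserved` and the mapping are my own assumptions: it uses the standard XML entities for <, >, and &, and the numeric character reference `&#37;` for %, which may not be exactly what the Microsoft page lists.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping for the four characters discussed above.
# &lt;, &gt; and &amp; are standard XML entities; &#37; is the numeric
# character reference for '%' (an assumption, not taken from the linked page).
RESERVED = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    "%": "&#37;",
}


def escape_reserved(text: str) -> str:
    """Replace each reserved character with its entity.

    Working character by character avoids re-escaping the '&' that the
    replacements themselves introduce.
    """
    return "".join(RESERVED.get(ch, ch) for ch in text)


sample = "a < b & 50% > c"
escaped = escape_reserved(sample)

# Round trip: an XML parser should turn the entities back into the
# original characters.
parsed = ET.fromstring(f"<v>{escaped}</v>").text
```

If the guess above is right, reading the saved document back with any XML parser should give you the original characters again, as the round trip at the end of the sketch does.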