PerlMonks  

repairing Word-XML-files

by neniro (Priest)
on Feb 21, 2005 at 10:16 UTC ( [id://432997]=CUFP )

Today someone asked me to repair a Word document he had received by mail. After trying to open it with an up-to-date version of Word (in case it was simply a newer format rather than broken), I opened it in an editor and saw XML, so I renamed it to .xhtml and tried to open it in Firefox. I got an "XML isn't well-formed" error. Taking a closer look at the file in the editor, I noticed a lot of line breaks in the wrong positions; it seems an email client had added them for whatever reason. That's an easy problem to solve with a well-known Perl one-liner:
perl -pi.bak -e "s/\n//g" filename.doc

Replies are listed 'Best First'.
Re: repairing Word-XML-files
by Jaap (Curate) on Feb 21, 2005 at 19:56 UTC
    How does that -p work? I found this:
    -p This option places a loop around your script. It will automatically read a line from the diamond operator, execute the script, and then print $_. It is most often used with the -e option.
    So what does -pi.bak do?
      -p -i.bak ... -i means to do the work in-place and save off the old version to <filename>.bak.

      Being right, does not endow the right to be rude; politeness costs nothing.
      Being unknowing, is not the same as being stupid.
      Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
      Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.
