in reply to Extracting text from MS Word files on a Linux box
The following works for me with LibreOffice 5.1:
use IPC::System::Simple qw/capturex/;
my $text = capturex('libreoffice', '--convert-to', 'txt:Text (encoded):UTF8',
    $filename, '--cat', '--headless');
utf8::decode($text);
$text =~ s/\A\x{FEFF}//; # remove BOM
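For anyone wondering what the last two lines do: capturex hands back raw bytes, so the output has to be decoded from UTF-8 and the leading byte order mark dropped. Here is a minimal sketch of just that step, runnable without LibreOffice. The helper name decode_output and the sample bytes are my own inventions for illustration, not part of the code above, and dying on malformed input is an assumption about what you would want.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post above): turn the raw bytes
# returned by capturex() into a Perl character string without a BOM.
sub decode_output {
    my ($bytes) = @_;
    my $text = $bytes;
    # utf8::decode works in place and returns false on malformed UTF-8.
    utf8::decode($text) or die "output is not valid UTF-8\n";
    $text =~ s/\A\x{FEFF}//;    # drop a leading byte order mark, if present
    return $text;
}

# Made-up sample: "\xEF\xBB\xBF" is U+FEFF in UTF-8, "\xC3\xA9" is U+00E9.
my $sample = "\xEF\xBB\xBFcaf\xC3\xA9\n";
my $text   = decode_output($sample);
printf "%d bytes in, %d characters out\n", length($sample), length($text);
# prints "9 bytes in, 5 characters out"
```

Checking the return value of utf8::decode is optional, but it makes a mis-encoded conversion fail loudly instead of silently leaving you with a byte string.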
Replies are listed 'Best First'.
Re^2: Extracting text from MS Word files on a Linux box
by Laurent_R (Canon) on Jun 21, 2018 at 11:50 UTC
by haukex (Archbishop) on Jun 21, 2018 at 11:57 UTC