in reply to pdf and ppt to text

To get text from a pdf, I use Text::FromAny.
To get text from a ppt, I use catppt from catdoc.

Prerequisites =>

wish from Tcl
catppt from catdoc
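
Before going further, it can help to confirm that both external tools are actually reachable. The following is only a sketch using core modules: the helper name find_on_path is made up for illustration, and it assumes a Unix-like PATH.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the first executable named $prog found
# in the PATH directories, or undef if there is none.
sub find_on_path {
    my ($prog) = @_;
    for my $dir (File::Spec->path) {
        my $candidate = File::Spec->catfile($dir, $prog);
        return $candidate if -f $candidate && -x _;
    }
    return;
}

# Report where each external prerequisite lives, if anywhere.
for my $prog (qw(wish catppt)) {
    my $found = find_on_path($prog);
    print defined $found
        ? "$prog: $found\n"
        : "$prog: not found on PATH\n";
}
```

If catppt turns up somewhere other than /usr/local/bin, adjust any hard-coded path accordingly.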

Module Prerequisites =>

#!/usr/bin/perl
use strict;
use warnings;
use CPAN;

CPAN::Shell->install(qw(
    XML::Twig
    Archive::Zip
    File::Temp
    Time::Local
    IO::File
    Any::Moose
    Try::Tiny
    Text::Extract::Word
    OpenOffice::OODoc
    File::LibMagic
    RTF::Parser
    HTML::FormatText::WithLinks
    CAM::PDF
    Text::FromAny
));

Once the prereqs are satisfied, run this:
#!/usr/bin/perl
use strict;
use warnings;
use File::Fetch;
use Text::FromAny;

# Download the sample files into the current directory.
my $ff1 = File::Fetch->new(
    uri => 'http://cpansearch.perl.org/src/KARMAN/SWISH-Filter-0.15/t/test.ppt');
my $ff2 = File::Fetch->new(
    uri => 'http://cpansearch.perl.org/src/KARMAN/SWISH-Filter-0.15/t/test.pdf');

# fetch() returns the path of the saved file.
my $where1 = $ff1->fetch() or die $ff1->error;
my $where2 = $ff2->fetch() or die $ff2->error;

# Extract the text of the PDF with Text::FromAny.
my $tFromAny = Text::FromAny->new(file => $where2);
my $text = $tFromAny->text;
print $text, "\n";

# Run catppt with -lV first, then dump the text of the PPT.
system("/usr/local/bin/catppt -lV");
print "\n";
system("/usr/local/bin/catppt", $where1);

Re^2: pdf and ppt to text
by sarvan (Sexton) on Aug 04, 2011 at 10:42 UTC
    Hi Khen1950fx,

    Thanks for the help. When I run the dependency installation code, the File::LibMagic installation seems to fail. So I tried to install it separately, but even then, when I run perl Makefile.PL, it shows an error called "cant include magic.h".

    What is the problem here?
        Hi,

        I don't have a magic.h file. I usually read the README file for every CPAN installation and follow its instructions to install.

        The README for this module also says that magic.h is a common problem, but I don't know how to move it to the proper location.

        Can you help with that?
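
A missing magic.h generally means the libmagic development headers are not installed; on many systems they ship separately from the file utility (for example as libmagic-dev on Debian/Ubuntu or file-devel on Red Hat style systems). As a rough sketch, the script below looks for the header in a few include directories. The directory list and the helper name find_header are assumptions for illustration, not something taken from the File::LibMagic README.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

# Assumed search locations; adjust these for your own system.
my @include_dirs = qw(/usr/include /usr/local/include /opt/local/include);

# Hypothetical helper: return every directory in @dirs that contains $header.
sub find_header {
    my ($header, @dirs) = @_;
    return grep { -f File::Spec->catfile($_, $header) } @dirs;
}

my @hits = find_header('magic.h', @include_dirs);
if (@hits) {
    print "magic.h found in: @hits\n";
}
else {
    print "magic.h not found; the libmagic development package is probably missing\n";
}
```

If the header is reported in a non-standard directory, the build has to be told about that directory rather than having the file moved by hand.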