There seem to be a lot of different options of very different complexity, depending on the Word format involved.
If it's a plain old Word 2000-2003 file, you already know what your tables look like, and you only need some data from a few cells, you could simply run:
$> abiword --to=rtf myworddocument.doc
and then:
$> perl extract-table-cells.pl myworddocument.rtf
In the latter (extract-table-cells.pl), you would simply search for:
[pseudo]
...
# table content part already extracted to $tablecontent
@cells = $tablecontent =~ /} ([^}]*) }\\cell{/xgs;
...
which might give you the cells in @cells.
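To make the idea a bit more concrete, here is a minimal sketch of what extract-table-cells.pl could look like. It relies only on the fact that RTF closes each table cell with \cell and each row with \row. The helper name extract_cells and the sample RTF string are made up for illustration, and the RTF that abiword actually emits will probably need a more careful cleanup than the crude control-word stripping shown here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from any module): returns a list of rows,
# each row being an array ref of plain cell texts.
sub extract_cells {
    my ($rtf) = @_;
    my @rows;
    # \row closes a table row; \b keeps us from matching longer control words.
    for my $row (split /\\row\b/, $rtf) {
        next unless $row =~ /\\cell\b/;     # skip chunks without any cell
        my @cells;
        # \cell closes a cell; \cellx2000 etc. are not matched because of \b.
        while ($row =~ /(.*?)\\cell\b/gs) {
            my $text = $1;
            $text =~ s/\\[a-z]+-?\d* ?//g;  # crude: drop control words like \pard, \fs24
            $text =~ s/[{}]//g;             # drop group braces
            $text =~ s/^\s+|\s+$//g;        # trim surrounding whitespace
            push @cells, $text;
        }
        push @rows, \@cells;
    }
    return @rows;
}

# Hand-written sample standing in for real abiword output (an assumption).
my $sample = '{\rtf1\ansi \trowd\cellx2000\cellx4000 \pard\intbl {Name}\cell \pard\intbl {Alice}\cell \row '
           . '\trowd\cellx2000\cellx4000 \pard\intbl {Age}\cell \pard\intbl {42}\cell \row }';

print join(' | ', @$_), "\n" for extract_cells($sample);
```

For a real file you would read myworddocument.rtf into a string first and pass that to extract_cells. Escaped characters, nested tables and Unicode escapes are not handled in this sketch.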
But it depends on your problem. What is the scale and purpose of your project?
Regards
mwa