dchandler has asked for the wisdom of the Perl Monks concerning the following question:
I have a bunch of files that contain table data. Some of them are HTML, and some appear to be SGML. I am not familiar with SGML, but I see tags like "page" and "C", which I guess marks columns. Is there an easy way to extract data from SGML (is there a module like HTML::Parser for SGML)? Also, how do I identify the markup language of a file? I simply read all the files with Perl and printed them out as text files, so I've hidden the issue: now I have many text files that have tables in them. So my next question is, are there easy ways to extract table data from text, or are plain ol' regular expressions the way to go? I'm fairly adept at regular expressions but don't want to write them needlessly if there are already modules that do this.
Thanks,
Dana
Replies are listed 'Best First'.
Re: question about extracting data tables
by b10m (Vicar) on Dec 26, 2004 at 21:52 UTC

Re: question about extracting data tables
by jZed (Prior) on Dec 26, 2004 at 22:03 UTC
by dchandler (Sexton) on Dec 27, 2004 at 00:01 UTC
by revdiablo (Prior) on Dec 27, 2004 at 00:06 UTC
by jZed (Prior) on Dec 27, 2004 at 00:28 UTC
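
On the "how do I identify the markup language of a file?" part of the question, here is a minimal sketch of one possible approach: sniff the first few kilobytes of each file and sort it into a rough bucket before picking a parser. The helper name `guess_markup`, the bucket names, and the 4096-byte window are all assumptions made for illustration, and the tests are heuristics only; they will misjudge some files (for example, plain text that happens to contain something shaped like a tag).

```perl
use strict;
use warnings;

# Hypothetical helper: guess a file's markup type from its leading text.
# Returns one of 'xml', 'html', 'sgml', or 'text'. Heuristic only.
sub guess_markup {
    my ($text) = @_;
    my $head = substr($text, 0, 4096);    # only inspect the start of the file

    # An XML declaration at the very top is a strong hint.
    return 'xml' if $head =~ /\A\s*<\?xml\b/i;

    # An HTML doctype or a few common HTML tags.
    return 'html' if $head =~ /<!DOCTYPE\s+html\b/i
                  || $head =~ /<(?:html|body|table|tr|td)\b/i;

    # Any other doctype or tag-like construct: call it generic SGML-style markup.
    return 'sgml' if $head =~ /<!DOCTYPE\b/i
                  || $head =~ /<[A-Za-z][\w.-]*(?:\s[^<>]*)?>/;

    # No markup found in the inspected window.
    return 'text';
}

# Usage: perl guess_markup.pl file1 file2 ...
for my $path (@ARGV) {
    open my $fh, '<', $path or do { warn "$path: $!\n"; next };
    local $/;                             # slurp the whole file at once
    my $text = <$fh>;
    close $fh;
    $text = '' unless defined $text;
    printf "%s: %s\n", $path, guess_markup($text);
}
```

Once files are sorted this way, the HTML ones can go to an HTML-aware module (HTML::TableExtract on CPAN is one option for pulling out tables), leaving hand-written regular expressions for whatever remains.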