Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello. I have a pipe-delimited text file that contains many lines of numbers and stats. I need to parse out all the information between the pipes and output the needed fields into an HTML table that resembles something like this: http://www.sportingnews.com/baseball/scoreboard/20010402/boxscore/10.html These files will be arriving in an FTP folder, and I must set up triggers that recognize the arrival of new information, parse the pipe-delimited file, and output it into an HTML table with a specific name. I am very new to all of these ideas and any help is greatly appreciated. Thanks.

Re: Parse Pipe Delimited Text
by AgentM (Curate) on Apr 03, 2001 at 03:20 UTC
    You didn't specify the operating system, so I'll assume you're running a decent UN*X server. Let's handle this in chronological order of service:
    • To watch the FTP directory, use http://www.tripwire.com or a similar trigger mechanism. Otherwise, you'll need to set up a cron job that reads the directory every so often, which amounts to polling, and I consider that wasteful.
    • To parse the pipe-delimited text, take the easy route and use DBD::CSV, whose separator attribute (csv_sep_char) can be set to the pipe. That way, you open up the world of SQL to yourself. If you think that is overkill, a simple three-liner that reads each line and splits it on pipes would do fine.
    • To output the results of the parsing as an HTML table, use CGI. Its helper functions replace the raw HTML you would otherwise write by hand, which cleans up your code and helps ensure that the end result is valid HTML.

    Good luck!

    AgentM Systems nor Nasca Enterprises nor Bone::Easy nor Macperl is responsible for the comments made by AgentM. Remember, you can build any logical system with NOR.
Re: Parse Pipe Delimited Text
by tinman (Curate) on Apr 03, 2001 at 03:27 UTC

    I see three main problems that you need to solve here.

    First, you need to parse a pipe-delimited file.

    For this, you can use the CPAN module Text::CSV_XS.
    Why this, you ask? Isn't CSV for "comma-separated values"? True enough, but this wonderful module also lets you specify the separator character (its sep_char attribute), which in your case would be a pipe.

    You could, of course, also roll your own with a simple call to the split function, such as the following (this is very simplistic, probably too much so):

    open(INPUT, "myfile") or die "No, can't open the file: $!";
    while (<INPUT>) {
        chomp;    # drop the newline so it doesn't end up in the last field
        my @elements = split(/\|/, $_);
        # do something with the elements here
    }
    close(INPUT);

    Your second problem is to find out when a new file has arrived.

    First, store the modification time of the file, which you can get from Perl's built-in stat function or from the File::stat module.

    Whenever your Perl script wakes up (as a cron job, or through the NT 'at' scheduler) and decides whether to process the file, compare the modification time that you stored with the file's current one. If they differ, i.e. if the file in the FTP folder has a more recent time, you should parse the file again.
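    A minimal sketch of that check, using only the built-in stat. The helper name file_changed and the idea of keeping the last-seen time in a small state file are assumptions made for illustration; they are not part of the original suggestion.

```perl
use strict;
use warnings;

# Hypothetical helper: returns true when $file is newer than the time
# recorded in $state_file, and records the new time as a side effect.
sub file_changed {
    my ($file, $state_file) = @_;

    my $mtime = (stat $file)[9];       # element 9 of stat's list is the mtime
    return 0 unless defined $mtime;    # file is missing or unreadable

    # Read the previously stored time, if there is one.
    my $last = 0;
    if (open my $in, '<', $state_file) {
        my $saved = <$in>;
        close $in;
        if (defined $saved) {
            chomp $saved;
            $last = $saved if $saved =~ /^\d+$/;
        }
    }
    return 0 if $mtime <= $last;

    # Remember the new time for the next run.
    open my $out, '>', $state_file or die "Cannot write $state_file: $!";
    print {$out} "$mtime\n";
    close $out or die "Cannot close $state_file: $!";
    return 1;
}
```

    A scheduled script could then call file_changed("myfile", "myfile.stamp") on each run and parse the file only when it returns true; both file names here are placeholders.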

    The third problem, outputting an HTML table, is fairly simple, I think. For completeness' sake, I'll put a small pseudo-code-like segment here that describes how it would be done. Please note: this is untested.

    open(INPUT, "myfile") or die "Can't open file: $!";
    print "<html><head><title>My page</title></head><body>\n";
    print "<table>\n";
    while (<INPUT>) {
        chomp;    # keep the newline out of the last cell
        my @elements = split(/\|/, $_);
        print "<tr>";
        foreach my $el (@elements) {
            print "<td>$el</td>";
        }
        print "</tr>\n";
    }
    print "</table></body></html>\n";
    close(INPUT);
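    The loop above prints each field verbatim, and split silently drops empty fields at the end of a line. If a field could ever contain <, > or &, or if trailing empty columns matter, a variation along these lines may be safer. The helper names escape_html and rows_to_table and the sample rows are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must come first so later entities survive
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical helper: turn pipe-delimited lines into an HTML table string.
sub rows_to_table {
    my (@lines) = @_;
    my $html = "<table>\n";
    for my $line (@lines) {
        chomp $line;
        # A limit of -1 keeps trailing empty fields, which split would
        # otherwise discard.
        my @cells = split /\|/, $line, -1;
        $html .= '<tr>';
        $html .= '<td>' . escape_html($_) . '</td>' for @cells;
        $html .= "</tr>\n";
    }
    return $html . "</table>\n";
}

# Made-up sample rows; the second has an ampersand and an empty last field.
print rows_to_table("Smith|4|1|2", "A&M|3|0|");
```

    Returning the table as a string rather than printing it directly makes it easy to write the result to a file with whatever name is required.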

    HTH
    Update: Fixed the code segments.. ack, can't believe I didn't put that in :(.. much respect to MrNobo1024
    Update 2: Changed the name of the file handle from the reserved DATA to INPUT. DATA still works, but it's not very good practice, as pointed out by others.
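      Don't forget that the first argument to split is a regex, even if it looks like a string. split '|' doesn't work right: an unescaped | is regex alternation, which here matches the empty string and splits the line into single characters. Use split '\|' (or split /\|/) instead.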
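      A small demonstration of the difference; the sample record is made up.

```perl
use strict;
use warnings;

my $line = "12|Smith|3";    # hypothetical sample record

# Unescaped: the pattern is an alternation of two empty branches, so it
# matches the empty string and splits between every character.
my @wrong = split '|', $line;

# Escaped: splits on literal pipe characters only.
my @right = split /\|/, $line;

print scalar(@wrong), " pieces from the unescaped pattern\n";
print scalar(@right), " fields from the escaped pattern\n";
```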
Re: Parse Pipe Delimited Text
by jeroenes (Priest) on Apr 03, 2001 at 14:11 UTC
    Another possibility is the use of SuperSplit (another shameless plug). In the Synopsis of the embedded POD you can read how to use $a=supersplit('|'); to get your data and superjoin to get your table.

    Don't rule out CGI to create the table, it's very handy.

    Hope this helps,

    Jeroen
    "We are not alone"(FZ)

Re: Parse Pipe Delimited Text
by Anonymous Monk on Apr 03, 2001 at 03:43 UTC
    It's actually an NT server... Is there anything similar to Tripwire for NT? Thanks.