PerlMonks  

db2html

by straywalrus (Friar)
on Oct 29, 2001 at 02:20 UTC (CUFP, id://121853)

Takes a plain-text database delimited by pipes (|) and generates an HTML version of it.
    #!/usr/bin/perl
    # Written by Stefan Edwards <The Stray Walrus>
    # email: straydog43@hotmail.com
    # usage: htmldb <filename>
    die "Fatal:Inncorrect Parameters: <$0> [filename]" unless $ARGV[0];
    @file = split(/\./,$ARGV[0]);
    $output = $file[0].".html";
    open (IN, "<$ARGV[0]") or die "Fatal:$!\n";
    open (OUT, ">$output") or die "Fatal:$!\n";
    print OUT "<html><head><title>$file[0]</title></head>\n";
    print OUT "<body>\n";
    print OUT "<table>\n";
    while ($line = <IN>)
    {
        @linedb = split(/\|/,$line);
        print OUT "<tr>\n";
        for($i = 0; $i < scalar @linedb; $i++)
        {
            chomp($linedb[$i]);
            print OUT "\t<td>".$linedb[$i]."</td>\n";
        }
        print OUT "</tr>\n";
    }
    close (IN);
    print OUT "</table></body></html>\n";
    close (OUT);

Replies are listed 'Best First'.
Re: db2html
by blackmateria (Chaplain) on Oct 29, 2001 at 23:25 UTC
    This could definitely be useful. There are two things I would change though:
    1. The script assumes the input filename has only one dot. If the input file is named "my.first.flat.file.db", the output filename is going to be "my.html". This is not a good thing.
    2. The script doesn't escape characters like &, <, >, etc. This means you have to worry about whether the database contains HTML tags and even JavaScript. What if one of the rows contained something like this:
        <script language='JavaScript'>window.open ('http://www.hax0rsit3.bogus') ;</script>|column 2|column 3
       Bam! Instant security hole.
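
    Both points can be illustrated without any modules. The following is only a minimal sketch: the helper names output_name and escape_html are hypothetical and do not appear in the original script, and the escaping covers just the four characters shown.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace only the final extension with ".html",
# so "my.first.flat.file.db" keeps everything before the last dot.
sub output_name {
    my ($input) = @_;
    (my $base = $input) =~ s/\.[^.\/]*$//;   # strip the last ".ext", if any
    return "$base.html";
}

# Hypothetical helper: escape the characters that are special in HTML.
# The ampersand is handled first so the entities added by the later
# substitutions are not escaped a second time.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

print output_name('my.first.flat.file.db'), "\n";   # prints my.first.flat.file.html
print escape_html(q{<script>alert('x')</script>}), "\n";
```

    With these in place, each cell would be passed through escape_html before being wrapped in <td>, so a <script> tag in the data is displayed as text instead of being executed.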

    Of course, you can always turn off JavaScript, and maybe you never intended to use the script on untrusted data, but IMO, it's never too early to think about security. Plus, as it stands the script doesn't handle ampersands and angle brackets properly. Why not just write the data to stdout (avoiding the filename issue) and use CGI.pm to format/escape the HTML?

        #!/usr/bin/perl -w
        use strict ;
        use CGI qw (:standard *table) ;

        die "Usage: $0 <input-filename-list>\n" unless @ARGV ;
        binmode STDOUT, ':crlf' ;
        print start_html (-title => join ('; ', @ARGV)), start_table, "\n" ;
        while (<>) {
            tr/\r\n//d ;
            my @cols = map {escapeHTML ($_)} split '\|' ;
            print TR (td ([@cols])), "\n" ;
        }
        print end_table, end_html ;

    I admit the output doesn't look as pretty as your nice hand-formatted output though.

      blackmateria, thank you for your input. I will try these things for this little project of mine. Don't worry about my 'pretty formatting': if my script has nice formatting but also some 'nice' security holes, what good is it? Thanks for pointing that out, too, because I had not even thought of it. That's why the community is good.
Re: db2html
by AltBlue (Chaplain) on Oct 31, 2001 at 03:38 UTC
    Why not use a specialized CSV module for parsing? That way you avoid having to handle 'special' cases yourself, such as a field separator other than '|' or fields that contain the separator character.
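
    To make the second case concrete, here is a small sketch of how a plain split miscounts fields when a field contains a backslash-escaped pipe. The sample row and the backslash escaping convention are assumptions for illustration, and the lookbehind workaround is not a substitute for a real parser (it would, for instance, misread an escaped backslash followed by a pipe).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative row: the second field contains an escaped pipe (maka\|paka).
my $row = 'kk|maka\|paka|111kkio433i|kksk43992s3';

# Naive split on every pipe: the escaped pipe is wrongly treated
# as a separator, so the row appears to have one field too many.
my @naive = split /\|/, $row;

# Workaround sketch: split only on pipes that are not preceded by a
# backslash, then turn each escaped pipe back into a plain pipe.
my @aware = split /(?<!\\)\|/, $row;
s/\\\|/|/g for @aware;

print scalar(@naive), " fields with the naive split\n";
print scalar(@aware), " fields with the escape-aware split\n";
print "second field: $aware[1]\n";
```

    The naive version reports five fields where four were intended, which is the kind of bookkeeping a dedicated CSV module takes off your hands.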
    Here is a simple example using Text::CSV_XS:
        #!/usr/bin/perl -w
        use strict;
        use Text::CSV_XS ();
        use CGI '-autoload';

        my $csv = Text::CSV_XS->new({sep_char => '|','escape_char' => '\\',});
        print start_html(), $/, start_table();
        while(my $line = <DATA>) {
            $csv->parse($line) or next;
            print $/, start_Tr(), ( map { td(escapeHTML($_)) } $csv->fields() ), end_Tr();
        }
        print $/, end_table(), $/, end_html(), $/;
        __DATA__
        (foo\|bar)&muci++|5m28888kk|020022992
        b"loat++\|<bizbaz>|115m28888kk|020022992333
        bloat++"\|<bizbaz>|115m28888kk|020022992333
        kk|maka\|paka|111kkio433i|kksk43992s3

    --
    AltBlue.
