This comes up regularly, so here is a short script to convert a text file into HTML so that it displays correctly in a browser, i.e. looks like it does in a text editor. The new file gets a .htm extension appended to the original name.
HTML special chars are escaped and tabs are rendered as 4 spaces. Sequences of spaces longer than 1 are converted to the corresponding number of &nbsp; entities so that the whitespace formatting is retained (not required if the output is wrapped in <pre> tags, but it does not hurt). In the example, <pre> tags are wrapped around the escaped output so newlines retain their literal meaning. If you want to use other tags (say <tt>) you will need to uncomment the s/\n/<br>\n/g line to complete the escape process.
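To see what the whitespace steps do on their own when the output is not wrapped in <pre>, here is a minimal sketch. The helper name escape_whitespace and the sample line are made up for illustration, and it assumes the HTML special chars have already been escaped by an earlier step.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: only the whitespace handling, for use outside <pre>.
# Assumes &, <, > and " were already escaped by an earlier step.
sub escape_whitespace {
    my ($line) = @_;
    $line =~ s/\t/    /g;                        # tab -> 4 spaces
    $line =~ s/( {2,})/"&nbsp;" x length $1/eg;  # runs of 2+ spaces -> one &nbsp; per space
    $line =~ s/\n/<br>\n/g;                      # newline -> <br>, needed outside <pre>
    return $line;
}

# Sample line: a leading tab and two runs of three spaces.
print "<tt>\n", escape_whitespace("\tx   =   1;\n"), "</tt>\n";
```

Single spaces are left alone, so the browser can still wrap ordinary prose at word boundaries; only runs of two or more spaces are pinned with &nbsp;.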
Yes there are modules out there that will do this. This is what they do in a nutshell for the curious. They probably won't do the [PerlMonks] escapes though :-)
#!/usr/bin/perl -w

my $text = "c:/text.pl";
open TEXT, $text or die "Oops can't open $text $!";
open HTML, ">$text.htm" or die "Oops can't write $text.htm $!";
print HTML "<pre>\n";
while (<TEXT>) {
    $_ = escapeHTML($_);
    print HTML $_;
}
print HTML "</pre>\n";
close HTML;
close TEXT;

sub escapeHTML {
    local $_ = shift;
    # make the required escapes (& must come first)
    s/&/&amp;/g;
    s/"/&quot;/g;
    s/</&lt;/g;
    s/>/&gt;/g;
    # change tabs to 4 spaces
    s/\t/    /g;
    # make the whitespace escapes - not required within <pre> tags
    s/( {2,})/"&nbsp;" x length $1/eg;
    # make the browser bugfix escapes
    s/\x8b/&#8249;/g;
    s/\x9b/&#8250;/g;
    # make the PERL MONKS escapes (if desired)
    s/\[/&#91;/g;
    s/\]/&#93;/g;
    # change newlines to <br> if desired - not required with <pre>
    # s/\n/<br>\n/g;
    return $_;
}
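For comparison with the modules mentioned above, here is a rough sketch of the same per-line escape using HTML::Entities from CPAN (part of the HTML-Parser distribution, so it has to be installed). The sub name escape_line and the sample input are made up for illustration. The whitespace and PerlMonks bracket steps are still done by hand, on the assumption that the module is only asked to handle the four required characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);  # CPAN module, not in core Perl

# Hypothetical name; follows the same steps as escapeHTML in the script above.
sub escape_line {
    my ($line) = @_;
    # Let the module do the required escapes. The second argument limits
    # encoding to these four characters.
    $line = encode_entities($line, '<>&"');
    # These run after the module call so the & they introduce is not re-escaped.
    $line =~ s/\t/    /g;                        # tab -> 4 spaces
    $line =~ s/( {2,})/"&nbsp;" x length $1/eg;  # optional inside <pre>
    $line =~ s/\[/&#91;/g;                       # PerlMonks escapes (if desired)
    $line =~ s/\]/&#93;/g;
    return $line;
}

print "<pre>\n";
print escape_line(qq{a <b> & "c"  [d]\n});
print "</pre>\n";
```

The order matters here: anything that inserts an entity has to come after the step that escapes &, which is also why the s/&/&amp;/g substitution is the first one in the hand-rolled version.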
Re: Text to HTML
by shenme (Priest) on Aug 24, 2003 at 02:18 UTC