Old_Gray_Bear has asked for the wisdom of the Perl Monks concerning the following question:

I have recently fallen heir to the maintenance of a large library of HTML -- 200+ modules, somewhere between 40K and 100K lines of text. The content is a mixture of templates (using Andy Wardley's Template Toolkit modules), embedded Perl, JavaScript, CGI (using CGI.pm), and native HTML tags.

Most of it is pretty readable, in the sense that there are syntax and indentation rules that are (usually) followed. But it can be a bit of a long slog to trace through. After searching through a particularly obfuscated piece of "code" (something that dense is certainly not 'text'), I got to thinking about reformatting the pages into something a little more user-friendly, or at least consistent.

Figuring that someone must have already sorted out this particular Wheel, I went off to search CPAN and Google-Space for HTML Reformatters and Pretty Printers, à la perltidy(1). So far I have found several promising candidates. I am soliciting input and opinions.

I have not included 'WebLint', 'HTML::Clean', 'HTML::Lint', etc., since my goal is to generate human (or at least Bear) readable text, rather than optimize my pages or validate the correctness of the tags. They will be useful tools later in the maintenance cycle, but they don't do what I want to do right now.

Have I missed anyone's favorite? Does anyone have opinions pro or con?

----
I Go Back to Sleep, Now.

OGB

Replies are listed 'Best First'.
Re: What do Monks Recommend For HTML Reformatters?
by allolex (Curate) on Dec 01, 2003 at 07:37 UTC

    HTML tidy is really the way to go. Not only does it do a decent job of prettyprinting/indentation (-i), but it also helps fix serious problems with converting to standards-compliant markup. For example, you can move all of your font formatting to CSS styles by using the -clean option. It also supports conversion of HTML to XHTML or XML (see -asxhtml; -asxml). Oh, and it's really quick to do all of this. Long live free software!

    --
    Allolex
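
A minimal sketch of the options named above (-i, -clean, -asxhtml), written in Python. It assumes a `tidy` executable is on the PATH; the helper names `tidy_command` and `run_tidy` are made up for illustration and are not part of any library.

```python
import shutil
import subprocess


def tidy_command(path, clean=False, xhtml=False):
    """Build an argument list for the tidy command-line tool."""
    cmd = ["tidy", "-i"]          # -i: indent the output
    if clean:
        cmd.append("-clean")      # -clean: move font formatting into CSS styles
    if xhtml:
        cmd.append("-asxhtml")    # -asxhtml: convert HTML to XHTML
    cmd.append(path)
    return cmd


def run_tidy(path, **options):
    """Run tidy on one file; return (formatted markup, diagnostics, exit code)."""
    if shutil.which("tidy") is None:
        raise RuntimeError("tidy was not found on PATH")
    result = subprocess.run(
        tidy_command(path, **options), capture_output=True, text=True
    )
    # tidy writes the formatted markup to stdout and its diagnostics to stderr.
    # Exit status 1 means warnings and 2 means errors, so do not treat every
    # non-zero status as a failed run.
    return result.stdout, result.stderr, result.returncode
```

Writing to stdout rather than modifying the file in place makes it easy to diff the result against the original before committing to a reformat of a few hundred modules.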

Re: What do Monks Recommend For HTML Reformatters?
by Aristotle (Chancellor) on Dec 01, 2003 at 01:21 UTC
    Personally I just use HTML Tidy. I've never had a need for anything more than it does, which is plenty out of the box.

    Makeshifts last the longest.

Re: What do Monks Recommend For HTML Reformatters?
by Roger (Parson) on Dec 01, 2003 at 01:20 UTC
    Have you tried any of the HTML formatters you mentioned above yet?

    Personally I think HTML tidy is pretty cool, but you will just have to experiment and see which one of the tools gives the best results.

Re: What do Monks Recommend For HTML Reformatters?
by tune (Curate) on Dec 01, 2003 at 09:40 UTC
    This topic is pretty interesting to me too. I am coding in Mason right now, and wondering whether HTMLTidy will respect the Mason blocks or mess them up. Does anyone have any experience with this?

    --
    tune

      Depends on how Mason (which I have no idea about) handles embedding its own stuff in the markup. If it uses processing instructions (<?foo · · · ?>) then HTMLTidy may reindent the opening part of the tag (<?foo) at will and may reindent any following lines as a block, but it won't touch the content. For the majority of languages that means there should be no change, though it might not be what you want.

      Unfortunately (and that is the one thing that annoys me about HTMLTidy), there's no way to specify a list of candidate tags for reindenting or a list of tags not to reindent.

      Makeshifts last the longest.

        Mason has several forms for marking up its own stuff, and they're pretty unique. E.g.

            % my $str = "This line is Perl until the linefeed."; # cannot contain whitespace at the beginning!!!
            <%perl>
            my $str2 = "This is a Perl block";
            </%perl>
            <%args>
            $a # this is a special block
            $b
            </%args>
            <h1>Hello <% $monkname %>!</h1>
            It is <& /lib/mason_components/weather_forecast.mas, fmt => 'Celsius' &> celsius degrees outside!

        etc. I guess I will try to run Tidy, but I'm not confident about it at all :-/

        --
        tune
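
One possible workaround for the Mason question, sketched in Python under stated assumptions: hide the Mason constructs behind HTML comment placeholders before running a formatter, then restore them afterwards. The pattern covers only the four forms quoted in the example above and is not a full Mason grammar. The names `mask_mason` and `unmask_mason` and the `<!--MASON:n-->` placeholder format are hypothetical.

```python
import re

# Matches only the Mason forms quoted in the thread; not a complete grammar.
MASON_PATTERN = re.compile(
    r"<%(\w+)>.*?</%\1>"   # named blocks: <%perl>...</%perl>, <%args>...</%args>
    r"|<&.*?&>"            # component calls: <& /path/to/comp, key => value &>
    r"|<%.*?%>"            # inline expressions: <% $var %>
    r"|^%[^\n]*$",         # %-lines: Perl from a leading % to the end of the line
    re.DOTALL | re.MULTILINE,
)

PLACEHOLDER = re.compile(r"<!--MASON:(\d+)-->")


def mask_mason(source):
    """Replace each Mason construct with a numbered HTML comment."""
    saved = []

    def stash(match):
        saved.append(match.group(0))
        return f"<!--MASON:{len(saved) - 1}-->"

    return MASON_PATTERN.sub(stash, source), saved


def unmask_mason(formatted, saved):
    """Put the original Mason text back in place of each placeholder."""
    return PLACEHOLDER.sub(lambda m: saved[int(m.group(1))], formatted)
```

Two caveats. A formatter may indent the placeholder, and a restored %-line that no longer starts in column 0 is not valid Mason, so those lines need to be checked after restoring. A placeholder that ends up inside an attribute value may also be escaped or altered by the formatter, in which case it will not be restored.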

Re: What do Monks Recommend For HTML Reformatters?
by chanio (Priest) on Dec 02, 2003 at 04:40 UTC
    Yes, HTML-Tidy rules!

    You might want to try HTML-Kit, which is freeware and includes these modules for editing your files: HTML-Tidy, HTML::Template and Perl-Tidy.

    It shouldn't be considered Open Source, but those sorts of modules (written in their own JS-like language) are maintained by a big community of users.

    I haven't yet gotten used to editing everything with it, but it might become very comfortable, since you can test Perl scripts, JS and HTML, create CSS, etc. It also has a sort of regex facility for mass replacements across files.