in reply to Automatic spell checking when previewing

I also think a spell check on the site is probably a bad idea. It would need JavaScript to be useful (autocorrect in the form fields), and personal dictionaries would be a problem; a universal one would be constantly corrupted by users who believed they could spell. :) Safari on Mac has spell checking in form fields, and I think the Mozilla family also has plugins for spelling. That said, checking isn't too hard; even the comments in code could be checked pretty easily. Here's one way...

use warnings;
no warnings 'uninitialized';
use strict;
use Perl::Tidy;
use HTML::TokeParser;
use Text::Aspell;

my @To_Check;

my $input = join '', <DATA>;

my $p = HTML::TokeParser->new( \$input )
    || die "Couldn't create HTML::TokeParser object";

my $speller = Text::Aspell->new()
    || die "Couldn't create speller.";

while ( my $token = $p->get_token ) {
    # plain text tokens (not script/style data) get checked as-is
    push @To_Check, $token->[1]
        if $token->[0] eq 'T' and not $token->[-1];

    if ( $token->[1] eq 'code' and $token->[0] eq 'S' ) {
        # look for text in comments
        my $code_text = $p->get_text('code');
        perltidy( source    => \$code_text,
                  formatter => bless(\{}, __PACKAGE__) );
    }
}

for my $block ( @To_Check ) {
    # greedy match, so hyphens and apostrophes stay inside the word
    for my $wordish ( $block =~ /\b([-[:alpha:]']+)\b/g ) {
        next if $wordish =~ /^\d+$/;
        unless ( $speller->check( $wordish ) ) {
            my @suggestions = $speller->suggest( $wordish );
            @suggestions = splice(@suggestions,0,6); # max suggestions
            printf("%12s -->> %s\n",
                   $wordish, join(', ', @suggestions ) );
        }
    }
}

exit 0;

#===============================================================
sub write_line { # we won't be writing anything
    my ( $ego, $tokens ) = @_;
    # I think comments can only be the last element so -1
    push @To_Check, $tokens->{_rtokens}[-1]
        if $tokens->{_rtoken_type}[-1] eq '#';
}

__END__
<p>
This is a somewhat naeeve example, but here it is:
<code>print "Hello world!\n"; # can chekc this</co+de>
</p>
<code>
use Perl::Tidy;        # a one trick poney powerhouse
use Text::Aspell;      # a trick to install
use HTML::TokeParser;  # a sentimental favorite
use PM::Whatnot;       # a local moduel

my $input = PM::Whatnot::filter_post(); # do speling check
</co+de>

And the output:

      naeeve -->> nerve, nae eve, nae-eve, nave, naive, never
       chekc -->> check, cheek
       poney -->> pone, pony, peony, pine, piny, phony
      moduel -->> module, model, moduli, modulo, mo duel, mo-duel
     speling -->> spelling, spieling, sparling, sapling, spilling, spoiling
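If you don't have Aspell installed and just want to see what the word-splitting step does, here is a minimal sketch of that part alone. It uses a greedy variant of the word pattern, so words like "isn't" and "one-trick" stay in one piece. The `%known` hash and the `misspelled` sub are made up for illustration; the hash is only a stand-in for a real dictionary, not anything Text::Aspell provides.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the Aspell dictionary: a plain hash of known words.
my %known = map { $_ => 1 } qw(this isn't a one-trick pony);

# Return every word-like chunk of $text that is not a key of %$dict.
sub misspelled {
    my ( $text, $dict ) = @_;
    return grep { not $dict->{ lc($_) } }
        $text =~ /\b([-[:alpha:]']+)\b/g;
}

# prints: poney
print join( ', ', misspelled( "This isn't a one-trick poney", \%known ) ), "\n";
```

Swapping the hash lookup for `$speller->check(...)` gets you back to the real thing.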

update: added "+"s to the co+de, to make them close "better."
update 2: proof of concept code's a bit long so put it in readmores.