Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
I admit it, I'm one of those annoying people who will interrupt a conversation to point out spelling errors. I try to be discreet and mean no offense; my parents raised me to prefer a friendly correction once than to make the same mistake in more important settings.

However, blogs and other online forums are typically filled with egregious and repetitive and predictable errors. If only I could hide the errors from my browser, I would remain mellow and calm while the rest of the world's grammar decline went unchecked.

I pondered aloud to some friends about the best way to put a search-and-replace filter into my favorite web browser, and somebody suggested HTTP::Proxy.

This is scratch code without documentation. I've only tested this on Linux. The simplistic filter has trouble in rare cases where a typo is found inside tag attributes. I filtered a word processor's auto-corrections file and added a few blog-common errors myself. I stripped out any non-ASCII fixes for simplicity. To use it, run this typoxy proxy in the background and configure your browser to access the web through it.

#!/usr/bin/perl use strict; use warnings; #use Data::Dumper; my $Port = 8080; my $Highlight = 1; #---------------------------------------------------------- my $Pre = $Highlight? '<span style="background: #ffffcc; color: #800000">' : ''; my $Post = $Highlight? '</span>' : ''; my @Typos = (); open(TYPO, "$ENV{HOME}/.typo") and do { @Typos = (); while (<TYPO>) { chomp; my ($wrong, $right) = split /\t+/; next if not $right; next if length($wrong) < 2; push(@Typos, [ $wrong, $right ]); push(@Typos, [ ucfirst($wrong), ucfirst($right) ]) if ucfirst($wrong) ne $wrong; } close(TYPO); }; die "No typos loaded from ~/.typo.\n" if not @Typos; print STDERR "$0: ", scalar @Typos, " typos filtered on port $Port.\n" +; # Longer corrections first. @Typos = map { $_->[1] } sort { $b->[0] <=> $a->[0] } map { [ length($_->[0]), $_ ] } @Typos; # Spaces are lenient. $_->[0] =~ s/ \s+ /\\s+/gx foreach @Typos; # Precompile the correction patterns. $_->[0] = qr/ (?<! [<>] ) \b ( $_->[0] ) \b/x foreach @Typos; #print Dumper $Typos[0], $Typos[-1]; #---------------------------------------------------------- use HTTP::Proxy; my $proxy = HTTP::Proxy->new(port => $Port); $proxy->push_body_filter( response => \&typo_filter ); $proxy->start(); #---------------------------------------------------------- sub typo_filter { foreach (@Typos) { ${$_[0]} =~ s|$_->[0]|$Pre$_->[1]$Post|g; } }

Without benchmarking, it seems to affect connect times more than it affects actual rendering time, even with 900+ typos in the ~/.typo configuration file. A sample typo list is http://www.halley.cc/.typo; it's just a list of tab-delimited lines: "definatly\tdefinitely\n". It's set to highlight errors in red on yellow (so you can see it working), but turning that off is a trivial parameter.

--
[ e d @ h a l l e y . c c ]


In reply to Typoxy: A Pox on Typos by Proxy by halley

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others exploiting the Monastery: (5)
As of 2024-03-28 19:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found