in reply to Adding Minimal Formatting To Unformatted Newbie Posts

First, many thanks for trying.

IMHO the KISS solution is:

  1. if there are no <pre> tags, surround the entire post with <code> tags.
  2. if there are <pre> tags and no other formatting
    1. replace <pre> tags with <code> tags
    2. if there is any text above the first <pre> tag, surround it with <code> tags
    3. if any text is between each pair of </pre> and <pre>, surround it with <code> tags
    4. if any text is below the last <pre> tag, surround it with <code> tags

    Note: I omitted the obvious simplification (remove the <pre> tags and surround the entire post with <code> tags) on the assumption that the post author sees something conceptually distinct from the rest of the text in whatever is surrounded with <pre> tags. It is possible that it would be valuable to allow that section to have its own download link.
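The two rules above can be sketched in Perl roughly as follows. This is only an illustration: the helper name `minimally_format` and the short tag list used to detect "other formatting" are assumptions, not anything taken from the site's code. It follows the list literally, so a post without `<pre>` tags is wrapped whole.

```perl
use strict;
use warnings;

# Assumption: a short, incomplete list of tags that count as "other formatting".
my $OTHER_TAG = qr{</?(?:p|br|c|code|b|i|ul|ol|li|blockquote)\b[^>]*>}i;

# Hypothetical helper: apply the two-rule scheme to the raw text of a post.
sub minimally_format {
    my ($post) = @_;

    # Rule 1: no <pre> tags, so wrap the entire post.
    return "<code>$post</code>" unless $post =~ m{<pre>}i;

    # Rule 2 only applies when <pre> is the sole markup; otherwise do nothing.
    return $post if $post =~ $OTHER_TAG;

    # split with a capture group returns: text, <pre> body, text, <pre> body, ...
    my @parts = split m{<pre>(.*?)</pre>}is, $post, -1;

    # Wrap every segment that has visible content (steps 2.1 to 2.4).
    return join '', map { /\S/ ? "<code>$_</code>" : $_ } @parts;
}
```

Each `<pre>` section keeps its own `<code>` block here, which is what would let it carry a separate download link.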

The above algorithm won't make the text "pretty", but it will deal with the major sources of pain from badly formatted posts:

Attempting to insert both <c> and <p> tags is actually quite a difficult task because it requires us to distinguish between code and text. That is non-trivial. Since Perl borrows many words from English, it requires parsing not just the words but their context. I'm not surprised that you found the task too hard to do to your satisfaction in 2 days or so.

Best, beth

Update: explained why KISS doesn't include a very obvious simplification.

Re^2: Adding Minimal Formatting To Unformatted Newbie Posts (code)
by tye (Sage) on Jul 21, 2009 at 03:31 UTC

    That is extremely simple (just put code tags around and s=</?pre>=</code><code>=g, to restate it more tersely). I find the POD-like approach slightly more complex (see my earlier reply). However, I think your solution will almost never format the post as desired. I hope my POD-like solution will often format posts "correctly".
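For reference, the terse restatement can be written out as a small Perl routine. The helper name `wrap_terse` is hypothetical, and the final cleanup of empty `<code></code>` pairs is an addition that is not part of the quoted one-liner.

```perl
use strict;
use warnings;

# Hypothetical helper name; the first substitution is the one quoted above.
sub wrap_terse {
    my ($post) = @_;

    # Put code tags around the post, then turn every <pre> or </pre>
    # into a close-and-reopen of the code block.
    (my $out = "<code>$post</code>") =~ s=</?pre>=</code><code>=g;

    # Assumption: drop the empty pairs left behind when a <pre> tag sits at
    # the very start or end of the post.
    $out =~ s{<code></code>}{}g;
    return $out;
}
```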

    Further, my solution makes it extremely simple to get a post formatted "correctly". Just separate code with blank lines and indent code. If we keep the idea that simple, then we should be able to get a lot of people to be able to swap in the requirements when posting.
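A minimal sketch of what "separate code with blank lines and indent code" could mean in practice is shown below. It is not the implementation from the earlier reply. It only borrows POD's rule that a paragraph whose first character is a space or tab is verbatim. The helper name `podlike_format` is hypothetical, and HTML escaping of the text paragraphs is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: split a post into blank-line-separated paragraphs and
# treat a paragraph as code when its first character is a space or tab.
sub podlike_format {
    my ($post) = @_;
    my @out;
    for my $para (split /\n(?:[ \t]*\n)+/, $post) {
        next unless $para =~ /\S/;          # skip empty paragraphs
        if ($para =~ /^[ \t]/) {
            push @out, "<code>\n$para\n</code>";
        }
        else {
            push @out, "<p>$para</p>";
        }
    }
    return join "\n", @out;
}
```

Code pasted flush-left would be classified as text by this sketch, which is the weakness raised in the addendum further down the thread.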

    As for code in the middle of text, the main problem is square brackets. And I have a lot of ideas for making that mostly DWIM so I'm not concerned with that here (the square bracket problem needs to be fixed for chatter as well, which makes it almost orthogonal to this formatting problem).

    - tye

      Yes, it is very simple, even simplistic, by intent. It wasn't intended as a final solution but rather as a stopgap (or scaffolding, if you will) to address the major source of pain. Even if you were to reuse code from pod's implementation, I think you would find that applying the pod approach will require a certain amount of tweaking (see addendum below). Having a lot of DWIM ideas is great, but each idea adds implementation time, however short. It adds up. And at least a few of those may turn into a "small matter of programming". I'm just saying that while the kinks of those ideas are being worked out and tested, a very simple solution would buy us a great deal.

      The worst effects of this simple solution are that (a) we get ugly Courier text instead of Times New Roman, (b) normal text may look choppy because we lose the ability to wrap paragraphs, and (c) downloads may include a bunch of irrelevant stuff that needs to be deleted or commented out.

      But at least we will be able to read the post. Most of the unreadability in unformatted posts (and I mean here text with *no* tags except "pre") comes from the fact that HTML rendering collapses every run of whitespace outside "pre" tags, newlines included, into a single space. This makes code samples look like one long breathless mess. Only an obfu expert can read that.
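The effect is easy to reproduce. The sketch below imitates the collapsing with a single substitution; the helper name `collapse_whitespace` and the sample snippet are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: mimic how text outside <pre> is rendered, where every
# run of whitespace (newlines included) becomes a single space.
sub collapse_whitespace {
    my ($text) = @_;
    (my $flat = $text) =~ s/\s+/ /g;
    return $flat;
}

my $code = <<'END';
use strict;
# Constants
my $zipcode = "60642";
my $unit    = "F";
END

# Prints the whole snippet on one line.
print collapse_whitespace($code), "\n";
```

Once the newlines are gone, everything after the first `#` reads as part of a comment, so the flattened text is not even equivalent code any more.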

      Best, beth

      Addendum: Although Pod has an algorithm for distinguishing text from code, even without markup, we can't just use it "as is". It relies on the assumption that the user will have normal text flush at the beginning of the line and that code will be indented. This is often not the case if the user is cutting and pasting from their code files. Here is a sample of code from the node Can't call method "getAttribute", followed by the output of pod2html sample.pod > foo.out. As you can see, the output is unreadable:

      # this line added to bypass pod2html's error checking

      =head Dummy title

      # remainder is taken from node id=781506

      #!/usr/bin/perl

      #################################################################
      # Yahoo Weather Rss Information Atomizer
      # Version 0.7.1
      # Loud-Soft.com
      # Provided As Is
      #################################################################

      use strict;
      use XML::XPath;
      use LWP::Simple;
      use XML::XPath::XMLParser;
      use Getopt::Long;
      use File::Copy;

      #################################################################
      # Variables
      #################################################################
      # Constants (Change these to localize)
      my $zipcode = "60642";
      my $unit = "F";
      my $scripthome = "/Library/prlprograms/yweather.pl";
      my $icondir = $scripthome."images/";
      my $datadir = $scripthome."data/";
      my $datafile = $datadir."weather.xml";
      my $imagefile = $icondir."weather.png";

      # Constants (Do not change these)
      my $pre="yweather";
      my $uri="http://xml.weather.yahoo.com/ns/rss/1.0";
      my $url="http://xml.weather.yahoo.com/forecastrss?p=$zipcode&u=$unit";
      my %data;
      my $xp;

      The output looks like this:

      Dummy title

      #!/usr/bin/perl

      ################################################################# # Yahoo Weather Rss Information Atomizer # Version 0.7.1 # Loud-Soft.com # Provided As Is #################################################################

      use strict; use XML::XPath; use LWP::Simple; use XML::XPath::XMLParser; use Getopt::Long; use File::Copy;

      ################################################################# # Variables ################################################################# # Constants (Change these to localize) my $zipcode = ``60642''; my $unit = ``F''; my $scripthome = ``/Library/prlprograms/yweather.pl''; my $icondir = $scripthome.``images/''; my $datadir = $scripthome.``data/''; my $datafile = $datadir.``weather.xml''; my $imagefile = $icondir.``weather.png'';

      # Constants (Do not change these) my $pre=``yweather''; my $uri=``http://xml.weather.yahoo.com/ns/rss/1.0''; my $url=``http://xml.weather.yahoo.com/forecastrss?p=$zipcode&u=$unit''; my %data; my $xp;