PerlMonks  

Larry vs. Joel vs. Ovid

by Ovid (Cardinal)
on Nov 20, 2001 at 23:48 UTC

In the latest edition of Joel On Software, Joel pulls out a Larry Wall quote:

People understand instinctively that the best way for computer programs to communicate with each other is for each of them to be strict in what they emit, and liberal in what they accept.

Joel makes it pretty clear that he thinks this is a stupid idea. Frankly, so do I.

Why is Design By Contract considered such a good idea? Because a very explicit contract exists between the client and the server: I, as the server, tell you, the client, exactly what you're allowed to give me, and I promise that, in return, I will do X. (If you're interested in this with Perl, try Class::Contract by Damian Conway.)
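
To make the idea concrete, here is a minimal sketch of such a contract, written in Python for brevity rather than with Class::Contract. The function `withdraw` and its integer-cents convention are hypothetical, invented only for illustration: the precondition is what the client must supply, the postcondition is what the server promises in return.

```python
def withdraw(balance_cents, amount_cents):
    """Return the new balance after withdrawing amount_cents (hypothetical example)."""
    # Precondition: the client's side of the contract.
    for value in (balance_cents, amount_cents):
        if not isinstance(value, int) or isinstance(value, bool):
            raise TypeError("amounts must be whole numbers of cents")
    if amount_cents <= 0 or amount_cents > balance_cents:
        raise ValueError("amount must be positive and no larger than the balance")

    new_balance = balance_cents - amount_cents

    # Postcondition: the server's side of the contract.
    assert 0 <= new_balance < balance_cents
    return new_balance
```

Nothing here dictates how the caller arrives at its arguments or how the body is implemented; only the boundary between the two is pinned down.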

Some people would argue that this violates the spirit of TIMTOWTDI. No, it doesn't. Nothing in the contract makes any guarantees about how I implement my solution. Nothing in the contract specifies how you have to do what you want to do. It merely ensures that when different portions of a program interact, they do so on a very, very predictable basis. Imagine working on a huge system where you have many virtually identical object methods that take a scalar, or an array, or an array ref, or a hash_ref using named parameters but won't accept a hash, etc. What interface do you use? And what if the data structure returned seems to be arbitrary? I've worked on systems that do that and they are Not Fun.

One of the best examples of this problem is the Web browser. Netscape won't render tables that leave off a closing table tag. This is a Good Thing. Internet Explorer has once again tossed standards out the Window (hah!) and decided that you, the developer, don't know how to send a correct Content-Type. Instead, IE examines the beginning of your entity body and it decides how to render it. Have you ever wanted to send HTML as plain text? That's a non-trivial task if you're using IE. More than once, I've sent an "image/gif" that was actually a jpeg and tried to figure out why IE rendered the image and Netscape didn't. IE tries to hold the programmer's hand and, as a result, takes away fine-grained control -- though it's still one of the best browsers on the market.

How could these problems have been avoided? W3C standards are, in effect, a variant of Design by Contract. You know what you have to send and, in return, you know exactly what to expect. I've heard so many Perl programmers moan and groan that standards aren't being adhered to, but fail to appreciate that what products like IE are trying to do is be liberal in what they accept.

So, I think Larry is right in the sense that being liberal in what you accept is a worthwhile idea. Very few people can get it right, though, which is why I think Larry is wrong.

Cheers,
Ovid

Update: I think my point was not as clear as I thought. As I said below in a reply to tye, it's "not that we should be so rigid that we break with the slightest gust of wind. My point is that we shouldn't be so flexible that we fall over." It's that middle ground that most fail to find, so I'd rather they err on the side of caution :)

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

Replies are listed 'Best First'.
Re: Larry vs. Joel vs. Ovid
by Masem (Monsignor) on Nov 21, 2001 at 00:07 UTC
    The web browser is an excellent example; the problem started, however, before NS was even NS -- the initial versions of Mosaic offered no clues to the user that the web page was malformed, and that trend has continued since. Is this necessarily a good idea? It's a big if, of course, but I would suspect that if my browser told me every time I encountered a page that didn't meet W3C standards, I'd be very, very tired of getting that warning nearly all the time. From a UI standpoint, some would argue that not informing the user of a bad layout in a web page is a good thing (you should only tell the user that there's a serious error if they are supposed to take action to do something about it). But here, the problem was that there was no way of using the browser alone to detect bad markup.

    This led to the development of the first generation of WYSIWYG tools, which produced bad layout as well. These became popular, so the second generation of browsers decided to make sure that they could accept their output. Begin vicious cycle.

    Nowadays, we're still stuck in the HTML problem, and I don't think that will ever be cured. But for so-called web applications or interprocess communications, we have XML. There is *no* reason not to be able to write valid XML given the large number of tools, free, commercial, compiled, interpreted, or otherwise, out there; nor is there any reason not to read in only strict XML. However, the advantage of XML as a format is that one can include excess or alternative data, which client A might not completely understand while still producing the intended result, and which client B can take advantage of. But the key point here is that the XML *must* be well-formed. If one gets bad XML, what error should be shown to the user? Hopefully, those that have worked with HTML over the years understand that malformed XML is potentially more dangerous than malformed HTML, and will take steps to tell the user of such.

    -----------------------------------------------------
    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
    "I can see my house from here!"
    It's not what you know, but knowing how to find it if you don't know that's important

        The web browser is an excellent example; the problem started, however, before NS was even NS -- the initial versions of Mosaic offered no clues to the user that the web page was malformed, and that trend has continued since. Is this necessarily a good idea? It's a big if, of course, but I would suspect that if my browser told me every time I encountered a page that didn't meet W3C standards, I'd be very, very tired of getting that warning nearly all the time. From a UI standpoint, some would argue that not informing the user of a bad layout in a web page is a good thing (you should only tell the user that there's a serious error if they are supposed to take action to do something about it). But here, the problem was that there was no way of using the browser alone to detect bad markup.

      I wouldn't want the browser to pop up with a big freaky modal dialog every time I loaded a page with somewhat suspect HTML, but I would like to see some indication that the page isn't well-formed -- perhaps a message in the ubiquitous status bar to the effect of "This page is not valid HTML, and may not be displayed properly". As a user, I'd like to be told if the page I'm looking at is garbled (often one can tell immediately if a page hasn't rendered the way the designer would have liked, but I think it's reasonable to believe that some pages would be mis-rendered with subtle, important errors), and as a developer I'd sure like to know. (Of course, as a developer, I have plenty of HTML validators at my fingertips.)

      And bringing this back on topic... this point generalizes fairly well to programming in general. That's why compilers have warnings as well as error messages. I see no reason why all programs, especially those that talk to people, should insist on conservatively correct input. If you get suspicious input, emit a warning and do the best you can. That way, if you're relying on vaguely bogus input from someone else's software (for instance), you can still get work done. The difference between this model and the web browser problem is that Our Favourite Web Browsers(tm) don't give any (easily accessible) warnings about malformed HTML, so Joe Luser has no idea that they've just written awful markup.

      That said, within your own code strict adherence to contracts (for instance) is an excellent idea. If you're generating bogus data, you're going to want to know about it, not fudge it and hope for the best, and it's much more difficult to ignore a confess than a carp.
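
      A rough sketch of "emit a warning and do the best you can", in Python for brevity. The function `parse_flag` and its accepted spellings are hypothetical, chosen only to illustrate the idea; the `strict` switch plays roughly the role of choosing a hard failure over a warning for your own code.

```python
import warnings

# Hypothetical sets of spellings we are willing to accept.
_TRUE = {"true", "yes", "on", "1"}
_FALSE = {"false", "no", "off", "0"}


def parse_flag(text, strict=False):
    """Read a boolean flag liberally, but complain about sloppy spellings."""
    cleaned = text.strip().lower()
    if cleaned in _TRUE:
        value = True
    elif cleaned in _FALSE:
        value = False
    else:
        # Nothing sensible can be recovered: always an error.
        raise ValueError(f"not a boolean flag: {text!r}")

    if text not in ("true", "false"):
        message = f"non-canonical flag {text!r} read as {value}"
        if strict:
            # Inside our own code: fail loudly rather than fudge it.
            raise ValueError(message)
        # Talking to the outside world: warn, then carry on.
        warnings.warn(message, stacklevel=2)
    return value
```

      With the default `strict=False`, `parse_flag(" Yes ")` still returns a usable value but leaves a warning behind; with `strict=True` the same input is rejected outright.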

      --
      :wq
        I would like to see some indication that the page isn't well-formed
        Yes, and iCab does that, by putting a little smiley-face/frowny-face icon on the location panel. Of course, it's always frownyface on PerlMonks, so you can press it to get the errors. Here are the errors for this page as I type this:
        http://www.perlmonks.org/index.pl?title=%28FoxUni%29%20Re%282%29%3A%20Larry%20vs.%20Joel%20vs.%20Ovid%20vs.%20Masem%20vs.%20Web%20browsers&parent=126653&lastnode_id=126653&node=Offer%20your%20reply&parent_node=126653
        Altogether 79 errors found. Only 25 errors are listed below.
        Error (9/4): The tag <layer> is not part of HTML 4.0.
        Error (9/93): The end tag </layer> is not part of HTML.
        Warning (9/162): In the tag <TD> the attribute "WIDTH" should only contain absolute pixel values.
        Error (9/285): In the tag <IFRAME> the attribute "FRAMESPACING" is not allowed.
        Warning (9/606): In the tag <TD> the attribute "WIDTH" should only contain absolute pixel values.
        Warning (9/665): In the tag <TD> the value of the attribute "WIDTH" must be enclosed in quotes.
        Warning (9/665): In the tag <TD> the attribute "WIDTH" should only contain absolute pixel values.
        Error (10/166): In the tag <INPUT> the attribute "BORDER" is not allowed.
        Warning (16/106): The tag <FONT> should no longer be used since HTML 4.0.
        Error (16/120): The character '&' must be written as '&amp;'.
        Error (16/120): The character '&' must be written as '&amp;'.
        Error (16/200): The character '&' must be written as '&amp;'.
        Error (16/263): The character '&' must be written as '&amp;'.
        Error (16/1175): The character '&' must be written as '&amp;'.
        Warning (21/1): The tag <CENTER> should no longer be used since HTML 4.0.
        Warning (24/3): In the tag <TD> the attribute "WIDTH" should only contain absolute pixel values.
        Error (27/78): The character '&' must be written as '&amp;'.
        Warning (84/3): In the tag <TD> the attribute "WIDTH" should only contain absolute pixel values.
        Error (85/32): The color name "eedddd" is not valid.
        Error (86/1): In the tag <TR> white space is missing as separator after the attribute "BGCOLOR".
        Error (86/1): In the tag <TR> the value of attribute "BGCOLOR" is missing.
        Error (86/1): The attribute "000000" is not part of HTML.
        Error (86/1): In the tag <TR> the value of attribute "" is missing.
        Warning (88/8): The tag <FONT> should no longer be used since HTML 4.0.
        Warning (95/4): The tag <FONT> should no longer be used since HTML 4.0.
        Error (102/1): In the tag <TR> white space is missing as separator after the attribute "BGCOLOR".
        Error (102/1): In the tag <TR> the value of attribute "BGCOLOR" is missing.
        Error (102/1): The attribute "000000" is not part of HTML.
        Error (102/1): In the tag <TR> the value of attribute "" is missing.
        Warning (104/8): The tag <FONT> should no longer be used since HTML 4.0.
        Warning (111/4): The tag <FONT> should no longer be used since HTML 4.0.
        Error (112/13): The character '&' must be written as '&amp;'.
        Warning (112/79): The tag <FONT> should no longer be used since HTML 4.0.
        Warning (113/1): The tag <FONT> should no longer be used since HTML 4.0.
        Error (114/286): The character '&' must be written as '&amp;'.
        Error (121/1): In the tag <TR> white space is missing as separator after the attribute "BGCOLOR".
        Error (121/1): In the tag <TR> the value of attribute "BGCOLOR" is missing.
        Error (121/1): The attribute "000000" is not part of HTML.
        Error (121/1): In the tag <TR> the value of attribute "" is missing.
        Warning (123/8): The tag <FONT> should no longer be used since HTML 4.0.
        Warning (133/1): The tag <FONT> should no longer be used since HTML 4.0.
        Warning (135/254): The tag <FONT> should no longer be used since HTML 4.0.
        Warning (135/1488): The tag <FONT> should no longer be used since HTML 4.0.
        Warning (147/8): The tag <FONT> should no longer be used since HTML 4.0.
        Warning (154/4): The tag <FONT> should no longer be used since HTML 4.0.
        Warning (164/8): The tag <FONT> should no longer be used since HTML 4.0.
        Warning (168/31): The tag <FONT> should no longer be used since HTML 4.0.
        Warning (168/148): The tag <FONT> should no longer be used since HTML 4.0.
        Warning (168/277): The tag <FONT> should no longer be used since HTML 4.0.
        Warning (168/438): The tag <FONT> should no longer be used since HTML 4.0.
        I'm still trying to work with the Everything-engine people to get them to put the right ampersand escaping in URLs. Most of the rest of that goes away if you start using CSS instead of explicit tags. But there's still the odd things, like unquoted parameters, for which all we can say is "sloppy coding, please fix up!".

        -- Randal L. Schwartz, Perl hacker

        I was going to comment on this to tye's post, but it's just as valid here.

        "What if" Mosaic 0.9 had a pop up dialog that warned of invalid HTML, from day one? Where would we be now? Let me extrapolate:

        Since web page designers would be expected to check their own pages in the first-generation browsers, they would discover their page errors early and fix them. The typical end user would never have seen these errors, save on pages from sloppy HTML writers who didn't test.

        When the first-generation HTML editors were introduced, they would have been careful to produce clean, valid HTML, so as to leave the end writer less cleanup work before the code went on the web.

        From that point on, you'd get the same circle of dependency as we had in reality, but this time built on strictness. All HTML published today, save by those who lack any QA, would be clean and well-formed...

        ...however, there is the Microsoft factor to consider here. It is easy to imagine that MS would have been the first to make this pop-up dialog an option that could be turned off after the first instance, or to disable it completely. The implications of this are hard to determine; it could have sped up their 'domination' of the web by offering a solution that let 'bad' code through mostly unnoticed, or it could have caused them to be shunned by the community for trying to hide bad code. But once someone did that, others would have followed, and we might be back exactly where we are today, save for the lack of some HTML atrocities like BLINK and FRAME.

        In the today of this history, we would never have accepted that pop-up message during casual browsing, but if history had been slightly different, we might have been upset not to find it there when it was needed.

        That's why I refer to the strength of XML; we as a collective computer community are not simply looking at XML as tag soup as HTML was originally, but as a well-structured document in terms of opening and closing tags with attributes. Assuming that you can properly write out this format (which is easy) and read in this format without accepting flaws (difficult, but people have put solutions in place for this already, no need to reinvent said wheel), then the rest of XML which allows for free format of data items is in place. And people do realize that, and are making sure that while they may be sending XML documents that have extra or lacking data, the XML is well formed and does not fail in parsing. This is a very good first step in more adaptable and usable data formats.
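
        As a small illustration of that split (strict about well-formedness, relaxed about extra data), here is a sketch in Python using the standard `xml.etree.ElementTree` parser. The `<order>` document and the `read_order` function are hypothetical, made up for this example.

```python
import xml.etree.ElementTree as ET


def read_order(xml_text):
    """Read the fields we know about from a hypothetical <order> document."""
    # Strict part: ET.fromstring raises ET.ParseError if the text
    # is not well-formed XML. We make no attempt to repair it.
    root = ET.fromstring(xml_text)

    # Liberal part: we pick out the elements we understand and
    # simply ignore anything else the sender chose to include.
    order_id = root.findtext("id")
    total = root.findtext("total")
    if order_id is None or total is None:
        raise ValueError("order is missing a required field")
    return {"id": order_id, "total": total}
```

        A document carrying an extra `<note>` element is read without complaint, while one with a missing closing tag is refused before any data is looked at.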


Re (tilly) 1: Larry vs. Joel vs. Ovid
by tilly (Archbishop) on Nov 21, 2001 at 02:44 UTC
    And I think that Joel is being insufficiently generous in what he accepts here.

    What Larry stated is a commonsense and correct rule. A given program will be most effective in communicating if it is generous in what it accepts and strict in what it emits. You may prefer Netscape's breaking on certain constructs. But no matter what you prefer, it is a fact that Internet Explorer is better at communicating than Netscape is. Because it is more generous in what it accepts, and goes out of its way in what it emits to do things like work around broken browser version checks.

    What you and Joel are saying is that communicating well is not always wanted! If I have barely learned another language, when I try to communicate I darned well want to deal with someone who will hear through my accent, mangled grammar, and know to point me to the bathroom now. If I am trying to learn how to be understood, I want that person to point out my mistakes, correct me, and be very strict in what they call acceptable.

    So Larry is right. Browsers are great communicators. They bend over backwards to communicate. And that fact helped millions of novices, who are scared of their computers, to reach out and successfully communicate with each other. Had HTML been strict by default, I strongly doubt that that would have happened.

    In so far as Joel is talking about something different than what Larry is, he is also right. Having all of these great communicating programs around has spoiled people. They succeed in communicating with the most popular browsers even though they are communicating really badly, and so have learned to be bad communicators. A world full of bad communicators is hard to understand.

    But as long as Joel is talking about what Larry is, he is just plain and simply wrong. A program that refuses to implement simple workarounds for mistakes that you know are out there is worse at communicating. Period. There are cases where it fails to communicate, even though communication is demonstrably possible. Whether or not improving your program's ability to communicate leads to better communication is a different issue. There are tradeoffs both ways. And certainly if you are developing software, it is best to find out errors as quickly as possible. (Which is why I explicitly tell Perl with strict.pm to not try to be generous in figuring out what I meant to say...)

    And IMO criticizing Larry Wall for a point he didn't make is just plain unfair.

      tilly wrote:

      And IMO criticizing Larry Wall for a point he didn't make is just plain unfair.

      Mea Culpa :( I realize now that the root post came across differently from my intentions. I think Larry has an excellent idea. I just don't think most people (and that includes myself) have the skill to implement it.

      I think, though, that your 'bathroom' analogy illustrates the difference between humans and computers (and demonstrates the breakdown of the analogy). A human is much better at interpreting meaning than a computer, but we still get things wrong. If I say "I want to hit the man with the shoe", technically, that means "There is a man with a shoe and I want to hit him." Many people, though, would incorrectly interpret this as "I have a shoe and I want to hit the man with it." That's what I mean when I say that people are often bad at liberally accepting input.

      Now, if I feed that same sentence in a computer that has algorithms to understand English grammar, it may parse the grammar correctly, but get my intent wrong if I really did mean that I have a shoe that I want to hit someone with. How can a computer determine intent if the sentence fits the grammatical rules but doesn't mean what I meant? The more liberal we get, the more we open the room for ambiguity. The more paths a program can take to get to a solution, the more paths there are for bugs.

      This is all about finding a good middle ground. That's tough to do.

      Cheers,
      Ovid


Re: Larry vs. Joel vs. Ovid
by footpad (Abbot) on Nov 21, 2001 at 03:35 UTC

    Hm. It's possible we interpret Larry's comment differently. Here's how I paraphrase it: "Forgive liberally; act responsibly. Communicate!"

    I base that on a larger reading of the quote in question:

    Of course, in Perl culture, almost nothing is prohibited. My feeling is that the rest of the world already has plenty of perfectly good prohibitions, so why invent more? That applies not just to programming, but also to interpersonal relationships, by the way. I have upon more than one occasion been requested to eject someone from the Perl community, generally for being offensive in some fashion or other. So far I have consistently refused. I believe this is the right policy. At least, it's worked so far, on a practical level. Either the offensive person has left eventually of their own accord, or they've settled down and learned to deal with others more constructively. It's odd. People understand instinctively that the best way for computer programs to communicate with each other is for each of them to be strict in what they emit, and liberal in what they accept. The odd thing is that people themselves are not willing to be strict in how they speak and liberal in how they listen. You'd think that would also be obvious. Instead, we're taught to express ourselves.

    I think my paraphrase fits the original context of this and other remarks by Larry.

    It also suggests, I think, a rather simple solution to the "HTML problem." If browsers implemented a small indicator (or a toolbar button) that changed state when faced with malformed HTML (CSS, XML, or whatever), the person browsing the content could trigger a report of the problems. Something like this would be nice:

    Missing </TABLE> tag. Image (filename.gif) appears to be a JPG file. <B> tags are deprecated, use CSS formatting instead. Etc...

    (It would be even nicer if the report could then be emailed to someone, such as the contact person of the site in question.)

    Update for no_slogan: OTOH, it just might be impetus to fix the problems, donch'a think? Still, I was thinking of the idea for the people designing the site, then previewing the results. Given that, we could dispense with the email thing...just so long as the report is easily saveable to text.

    This approach handles the problems that tye wants to prevent, provides an active solution, and lends itself to handling other problems that you (and Joel) mentioned. Consider:

    • When provided different interfaces, choose the one that best fits the specific context.

      If none exists, then write wrappers to convert what you have to what you need...and then alert the author/maintainer to the discrepancy.

    • When presented with incomplete (or incorrect) input, recover if you can or fail if you can't.

      If you fail, however, provide clear and easy-to-access information describing the failure and the solution to the problem.

    • Instead of changing the behavior of existing interfaces quietly, document those changes as clearly and broadly as possible or (preferably) create a new interface that offers the new behavior while calling the original. This lets you deprecate the original behavior while allowing existing code to run until it can be refit.

    • Instead of creating user-hostile tools that throw preferences, character sets, locale issues, and other problems commonly associated with international programming, create software tools that make it easy for the casual or untrained developer to deal with these and other common problems. IOW, make your libraries, modules, tools, components, and so forth as aware--and extensible--as possible. (While I agree with the basic sentiment that Joel was ranting about, I disagree with his conclusion. I prefer to test things heavily.)

      Similarly, provide ways to test for commonly encountered problems and specific information on resolving them. Don't force everyone else to claw their way up the same learning cliff that you had to.

    • Given a specification (or contract, if you will) that doesn't specify what happens in a corner case, get that clarified by the contract sponsors. This would have been reasonably easy for the Netscape team to have done at the time and would have prevented, I think, a lot of the grief we're dealing with today.

    • When encountering code that erroneously relies on undocumented behavior, shouldn't you simply treat it as a bug and fix it, rather than blaming someone else for the problem? (It's not Larry's fault that some 'softie changed a Windows API call. Nor is it his fault that one of your guys relied on undocumented behavior.)

    • When you find a bug, fix it. (That's also part of the contract.) This includes legacy bugs, such as CSS Level 1 support (or lack thereof).

    • When faced with tools that do not satisfy their contracts, contact the responsible parties and see if they'll fix the problems. If they don't, then either find a workaround or use different tools. Period. Life's too short to worry about broken promises.
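
    The point in the list above about adding a new interface that calls the original, rather than quietly changing the old one, might look something like this. It is only a sketch in Python, and both function names are hypothetical.

```python
def parse_date(text):
    """Original interface: accepts only 'YYYY-MM-DD'. Behavior is left untouched."""
    year, month, day = text.split("-")
    return int(year), int(month), int(day)


def parse_date_any(text):
    """New interface: also accepts '/' and '.' as separators.

    It normalizes the input and then delegates to the original,
    so existing callers of parse_date() see no change at all.
    """
    normalized = text.strip().replace("/", "-").replace(".", "-")
    return parse_date(normalized)
```

    Existing code keeps calling `parse_date` until it can be refit, and the more forgiving behavior is available to anyone who asks for it by name.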

    I do agree that Design By Contract is a Very Good Thing®; however, part of the contract is letting others know when the contract has been broken. (For more on the subject, I highly recommend The Pragmatic Programmer, which devotes a major section to the subject and also provides an interesting variation of Larry's comment.)

    In short, I think Larry's spot on with his comment. We should forgive liberally, act responsibly, *and* communicate...just be nice about it. (Remember the last rule of perlstyle.)

    Just because we're not good at something doesn't mean we should stop trying to practice it. There is, after all, only one way to get to Carnegie Hall.

    --f

      It would be even nicer if the report could then be emailed to someone, such as the contact person of the site in question.
      This isn't going to scale for busy web sites. I doubt if the contact person (or the mail system administrator) is going to be happy when he comes in to work one morning and finds a couple of hundred thousand e-mails all saying, "your html is broken!"
Re: Larry vs. Joel vs. Ovid
by chipmunk (Parson) on Nov 21, 2001 at 01:59 UTC
    I just want to make two quick comments:
    Imagine working on a huge system where you have many virtually identical object methods that take a scalar, or an array, or an array ref, or a hash_ref using named parameters but won't accept a hash, etc. What interface do you use? And what if the data structure returned seems to be arbitrary? I've worked on systems that do that and they are Not Fun.
    Use whichever interface you prefer! :)

    If the data structure returned is arbitrary, then the system is not being strict in what it emits. That's not the side of the contract that you're discussing.

Re: Larry vs. Joel vs. Ovid
by no_slogan (Deacon) on Nov 21, 2001 at 01:29 UTC
    There's always a question of how far to go with something like "being liberal in what you accept." Yes, it would be nice if everybody implemented the standards correctly. However, in the real world, standards are complex and changing. You often see one version of a protocol that says "this field is reserved and must be set to zero", while a later version defines valid nonzero values for that field. If older implementations simply ignore the reserved field, then later versions of the protocol can be designed so that they still interoperate with the old software. However, if the author of the old software followed your advice to be maximally strict, then you get breakage that was completely avoidable. Case in point, TCP Explicit Congestion Notification. I've got a non-ECN router between me and www.sun.com which drops my packets right on the floor. Irritating.
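
    A sketch of the reserved-field case in Python. The four-byte header layout here (type, reserved, length) is invented for illustration and is not any real protocol; the point is only the difference between a receiver that ignores the reserved byte and one that insists on zero.

```python
import struct

# Hypothetical header: 1 byte type, 1 byte reserved, 2 bytes length (network order).
HEADER = "!BBH"


def encode_header(msg_type, length):
    """Strict in what we emit: the reserved byte is always zero."""
    return struct.pack(HEADER, msg_type, 0, length)


def decode_header(data, strict=False):
    """Liberal in what we accept: by default the reserved byte is ignored."""
    msg_type, reserved, length = struct.unpack(HEADER, data[:4])
    if strict and reserved != 0:
        # A maximally strict receiver breaks as soon as a later
        # protocol version gives this byte a meaning.
        raise ValueError("reserved field must be zero")
    return msg_type, length
```

    A header from a hypothetical later version with a nonzero reserved byte still decodes with the default settings, and fails only when `strict=True`.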

    With your IE example, it sounds like they're going too far in being liberal. If there was no MIME type given, or it was one the browser didn't understand, guessing the type from the content would seem reasonable. But ignoring a valid MIME type is just too much.
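
    Roughly, the policy described here could be sketched as follows in Python. The list of known types and the sniffing rules are simplified assumptions for illustration, not a description of what any browser actually does.

```python
# Hypothetical set of types this client knows how to render.
KNOWN_TYPES = {"text/html", "text/plain", "image/gif", "image/jpeg"}


def sniff(body):
    """Guess a type from the first bytes of the body (simplified)."""
    if body.startswith(b"GIF87a") or body.startswith(b"GIF89a"):
        return "image/gif"
    if body.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if body.lstrip()[:5].lower() == b"<html":
        return "text/html"
    return "text/plain"


def effective_type(declared, body):
    """Trust a declared type we understand; guess only when we must."""
    if declared in KNOWN_TYPES:
        return declared
    # Missing (None) or unrecognized type: fall back to sniffing.
    return sniff(body)
```

    Under this rule, HTML sent as `text/plain` stays plain text, and guessing happens only when the sender gave the client nothing usable.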

Re: Larry vs. Joel vs. Ovid
by clemburg (Curate) on Nov 21, 2001 at 17:52 UTC

    In your post, you confuse "being liberal" with "not being explicit". You can be liberal, but explicit if you put enough intelligence in the explicit description.

    E.g., COBOL is a very non-liberal language, while ANSI C is a fairly liberal language in comparison, while still being perfectly explicit about what it will accept. Or take JCL, the job command language on most mainframes, and compare it with the POSIX standard "sh" shell implementation. Both are explicit, but the "sh" shell is much more liberal in what it accepts.

    If talking about languages, being liberal while being strict is often called "expressive".

    Christian Lemburg
    Brainbench MVP for Perl
    http://www.brainbench.com

Re: Larry vs. Joel vs. Ovid
by belg4mit (Prior) on Nov 21, 2001 at 04:27 UTC
    "The rigid oak may topple to a gust of wind but the flexible reed still stands."
    ...or something. Actually:
    The Oak and the Reed from Aesop's Fables
    
         The Oak spoke one day to the Reed
         "You have good reason to complain;
         A Wren for you is a load indeed;
         The smallest wind bends you in twain.
         You are forced to bend your head;
         While my crown faces the plains
         And not content to block the sun
         Braves the efforts of the rains.
         What for you is a North Wind is for me but a zephyr.
         Were you to grow within my shade
         Which covers the whole neighbourhood
         You'd have no reason to be afraid
         For I would keep you from the storm.
         Instead you usually grow
         In places humid, where the winds doth blow.
         Nature to thee hath been unkind."
         "Your compassion", replied the Reed
         "Shows a noble character indeed;
         But do not worry: the winds for me
         Are much less dangerous than for thee;
         I bend, not break.  You have 'til now
         Resisted their great force unbowed,
         But beware.
         As he said these very words
         A violent angry storm arose.
         The tree held strong; the Reed he bent.
         The wind redoubled and did not relent,
         Until finally it uprooted the poor Oak
         Whose head had been in the heavens
         And roots among the dead folk.
    

    Explicitly in reply to the UPDATE.

    --
    perl -p -e "s/(?:\w);([st])/'\$1/mg"

Re: Larry vs. Joel vs. Ovid
by dragonchild (Archbishop) on Nov 21, 2001 at 00:09 UTC
    Good node. ++, Ovid.

    However, what do you propose? You're saying that the paradigm by which a (seemingly) large number of prominent applications do their business is flawed. Yet, you give no possible counter-solution.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

      Well, of course I do, silly :) Be strict in what you emit and accept. Masem made clearer the browser problem. If it was agreed from day one that Web browsers weren't going to put up with sloppy HTML, then we wouldn't have the mess we have today. XHTML is a great idea which might get around this, but HTML is pretty lousy.

      In general programming terms, this means laying out clear standards with a very narrow interpretation. Oh, you can interpret your client's needs all you want and you should do your best to meet them, but give them 27 ways to solve the same problem and they are going to find bugs in your code. Give them one way and you're much happier. But make sure you solve their problem.

      That's not to say that you shouldn't have multiple methods of dealing with a programming problem. It's saying that you really, really need to look at the problem carefully and determine if those multiple methods are appropriate. On Windows, you can copy selected text with Edit->Copy, Ctrl-C, or a right click to pull up a context menu. That seems fine because different people have different methods of learning and it's appropriate to cater to those. Internally, however, they had all better call one copy routine with identical data formats.

      Now, how can you guarantee that all of those methods of copying text, images, or whatever are going to do exactly what you want? That's the tough part. Frankly, nothing irritates me more than copying text from a Web page, pasting it into an email and seeing a frickin' radio button show up. That even happens when I'm sending the email in PLAIN TEXT!!! That ain't a feature, in my book. But that's the result of being liberal in what to accept. How can you be strict in what you emit if you aren't sure at any given moment what you've accepted?

      There's a time and place to be liberal in what to accept, but very few get it right. It's a good theory that most should put on an ivory tower and leave there. The more complex the system, the less likely that such "liberality", if you will, is going to please everyone. If you're writing a mass-market tool like an OS, maybe this is necessary. If you're writing a financial application, do you want to be responsible for explaining to the client that they "misplaced" $30,000 because you were liberal with their Accounts Receivable? :)

      Cheers,
      Ovid

      Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

        If it was agreed from day one that Web browsers weren't going to put up with sloppy HTML, then we wouldn't have the mess we have today.

        And we might not have the popularity of the web we have today. Graceful failure is a good idea. Under your scheme all of the poor surfers of the world would be getting lots of errors from their browsers when those surfers have no control (and virtually no influence) over getting those errors fixed and many of them would have no idea what those errors mean.

        The fact is that many decided to take advantage of one half of a sound principle by completely ignoring the other half of the principle. And only having one half of that principle is certainly not a good idea.

        But putting the error checking and enforcement into a web browser client is a really bad idea. Error checking and enforcement need to be as early in the process as possible. A much better idea would be to fix web servers so that they refuse to emit bad HTML rather than "fix" web browsers so that they refuse to display bad HTML!

        It is a good idea to be at least somewhat lenient in what you accept. It is also a good idea to note cases where you were lenient in such a way that this information is likely to make its way back to the source. That back channel is often quite difficult to implement well. But a really horrible idea for such an implementation would be browsers giving errors to web surfers.

        Now, being vague about what you accept is a bad idea. That is why the two halves need to go together. In order to be strict about what you provide, you have to define things clearly. Without that side of the coin, just being liberal about what you accept is trying to DWIM, which is (often) a nice feature of Perl but is usually a bad idea in any software that isn't interacting directly with a human.

        And do you want to be responsible for explaining to the client that the $30,000 transaction couldn't go through because something changed how much whitespace is sent?
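        The whitespace case above can be sketched in a few lines. This is only an illustration, written in Python rather than Perl, and `parse_amount` and its `notes` list are hypothetical names, not part of any real system. It accepts harmless padding, records that it did so (the raw material for a back channel to the sender), and still refuses anything that is not plainly a number.

```python
from decimal import Decimal, InvalidOperation


def parse_amount(field: str) -> tuple[Decimal, list[str]]:
    """Return the amount plus notes about any leniency that was applied."""
    notes = []
    cleaned = field.strip()
    if cleaned != field:
        # Lenient: padding does not change the meaning, so accept it,
        # but record the fact so it can be reported back to the sender.
        notes.append("stripped surrounding whitespace")
    try:
        amount = Decimal(cleaned)
    except InvalidOperation:
        # Strict: input that is not plainly a number is refused, not guessed at.
        raise ValueError(f"not a valid amount: {field!r}") from None
    if not amount.is_finite():
        # Decimal also parses "NaN" and "Infinity"; neither is an amount.
        raise ValueError(f"not a valid amount: {field!r}")
    return amount, notes
```

        Calling `parse_amount("  30000.00\t")` returns the amount together with one note, while `parse_amount("30,000 or so")` raises `ValueError` instead of trying to guess.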

        You can go too far in either direction. The idea of the principle is to prevent you from going too far in one direction. You are (in part) complaining about people going too far in the other.

        But the complaint about browsers being too lenient is a bad form of wishful thinking. No one will ever produce a browser that is not lenient in what it accepts -- well, if they do, it won't get used much. Now a browser that tells you that it had to be lenient has its advantages, all of which are of no use (and some distraction) to the average web surfer and so will never become widely popular. And having some magical control to force all browsers to not be lenient or to at least complain loudly when they are forced to be is, in fact, a very bad idea, unless you also design a way for those browsers to direct their loud complaints back at the authors of the bad HTML.

        Go design that back channel first and then you can think about tilting against the "bad, lenient browser" windmill again.

                - tye (but my friends call me "Tye")
Re: Larry vs. Joel vs. Ovid
by mr_mischief (Monsignor) on Nov 22, 2001 at 01:17 UTC
    When a program is liberal in what it accepts, that keeps it from breaking when someone else's program is liberal in what it emits.

    When a program is strict in what it emits, that keeps another program, one that is strict in what it accepts, from breaking.

    A good example is any of the text-based protocols built on top of TCP, such as Telnet, SMTP, or HTTP. The standard calls for a line ending of CR/LF. Some servers or clients break if they don't get this. Others will break them by issuing just an LF or just a CR. If, however, a program accepts either CR/LF or a bare LF and always emits the proper CR/LF, then it will communicate with all properly implemented software and many of the quick and dirty hacks that may have broken another piece of software on the other end. This is similar to what Larry is talking about when he says programs should communicate this way -- with other programs.
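    As a minimal sketch of that idea (in Python rather than Perl, and with hypothetical function names), the reader below accepts either CR/LF or a bare LF, while the writer only ever emits CR/LF. It is an illustration of the principle, not a piece of any real SMTP or HTTP implementation.

```python
def read_lines(data: bytes) -> list[bytes]:
    """Accept CRLF or bare LF as a line terminator (liberal input)."""
    lines = data.split(b"\n")
    # A trailing terminator leaves an empty final element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    # Strip at most one CR left over from a CRLF pair.
    return [line[:-1] if line.endswith(b"\r") else line for line in lines]


def write_lines(lines: list[bytes]) -> bytes:
    """Always emit the CRLF the standard calls for (strict output)."""
    return b"".join(line + b"\r\n" for line in lines)


# Input with mixed line endings comes out uniformly CRLF-terminated.
mixed = b"HELO example.org\r\nMAIL FROM:<a@example.org>\nQUIT\r\n"
normalized = write_lines(read_lines(mixed))
```

    A peer that insists on CR/LF can read `normalized` even though the original input mixed the two endings.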

    Communication within a program is another story. You might prefer strict contracts, because you control the whole structure and you can ensure that all parts adhere to that internally. This is different from interacting with unknown software on another system that might have been designed by a different team in a different country in a different decade.

    How successful would Microsoft Excel be in the spreadsheet market if it could not import spreadsheets from what was once the de facto standard, Lotus 1-2-3? How useful would the Unix shell command grep be if it limited you to searching 80-character lines?

    How about a cut command that only works with its default value of tab-delimited fields instead of also allowing space or comma delimited fields or fixed-position character counts? Sure, cut is strict in its interpretation of an input file according to its switches, but its switches allow it to accept more than one type of input file. Otherwise, cut would be pretty useless unless you had a different program for each type of input. How about if it could only support disk files and not STDIN redirects to it? Yep, being liberal in what it accepts helps there too. Perl's regex engine, split(), and other tools form a way to accept many, many different types of input from a file in the same program easily. This is a good thing.
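    The point about switches can be sketched as follows. This is a loose illustration in Python, not a reimplementation of cut: `cut_fields` is a hypothetical helper, and real cut differs in details such as how it treats lines that contain no delimiter. The function is strict about the delimiter it was told to use, yet one function handles several input formats because the caller chooses that delimiter.

```python
def cut_fields(line: str, fields: list[int], delimiter: str = "\t") -> str:
    """Select 1-based fields from a line, loosely in the spirit of cut -f and -d."""
    parts = line.split(delimiter)
    # Field numbers beyond the end of the line are skipped rather than treated as errors.
    chosen = [parts[i - 1] for i in fields if 1 <= i <= len(parts)]
    return delimiter.join(chosen)
```

    With the default, `cut_fields("a\tb\tc", [1, 3])` picks the first and third tab-separated fields; passing `delimiter=","` makes the same code handle comma-delimited input, and a comma-delimited line read with the tab default is treated strictly as a single field.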

    There are editors which only allow 80 character lines and only allow 64 kilobytes of text in a file. They are generally considered useless. This is another place that liberal acceptance of input is a good idea.

    There's a rule for applications programmers (which more and more people are accepting as a rule for systems programmers, too) that says you always want to offer the user (whether that user be a person or an automated process) zero, one, or an infinite number of something. This is a form of being liberal in what you accept. If there's one of something because it's a special case (one driver bound to a port, for example, or one shared memory segment per process if you wanted to do something like that in your OS), then that's just the one. If there's more than one, then it's not a special case, and there should be a way to offer an infinite number of the same to the user. This works out well with the unlimited number of variables in modern languages, the unlimited size of a file an editor can handle, etc. Of course, there are still some exceptions to this rule, such as practical limitations due to the size of an integer and the fact that using arbitrary precision math in something like pointer arithmetic or file position hurts performance. Still, it's good to respect a user's wish not to be held to some arbitrary artificial limit on input, objects, processes, or whatever unless it's for performance or security reasons.

    Many of the GNU command-line tools accept BSD-style or SVr4 style arguments. This is a good thing. Some of them provide output in either format, but only when given an explicit argument to do so. They either emit their own format, or they emit the format requested. They are usually pretty strict once the choice is made. This is a good thing.

    I'm in the middle of switching an ISP from Sendmail to Postfix. Postfix can use many of Sendmail's files and file formats, but it can also use Qmail's. This is a great thing for me. I'll be moving a few different POP servers with names overlapping among the differing boxes onto one killer box, which QPopper can't handle. It's a good thing Teapop can handle the traditional Unix mail spool files even though Qmail's one-file-per-message system is now more accepted. It may keep me from having to try to convert several thousand users' email into another format. I'll also be able to use the system password file, MySQL, PostgreSQL, htaccess files, db files, or flat files for user lists for the mail system. That's a good thing. I can't put overlapping names in the same system password file. I'll probably use htaccess files, one per domain. The flexibility of what Postfix and Teapop accept as input from the server side makes them great tools for this project.

    I'm sure I could give examples of liberal _acceptance_ of input being a good thing all day long. It seems to me that your issue lies more in the realm of liberal _interpretation_ of that input. Even in HTML, there's a difference between a browser ignoring a tag that's not understood and trying to render HTML that's just plain wrong. One allows for the expansion of the standard, and one is nonstandard. The former, though, is being liberal in what it accepts as input, as opposed to throwing an error saying something like 'invalid tag at line xxx'. Even allowing well formed, valid HTML 3.0 and well formed, valid HTML 4.01 Strict to be read and rendered by the same browser is being liberal in what the browser _accepts_. It still could complain if either page is badly formed. Being backwards-compatible is even a big part of some standards. C99 tries to break as little of C89 as possible. C++ standards attempt to allow most C to compile in a C++ compiler. Some ANSI C compilers have a K&R mode, a strict ANSI mode, and an ANSI with extra library functions allowed mode so a programmer can use whatever feels most comfortable. That's not to say that syntactically broken K&R code should work in K&R mode, or that syntactically broken ANSI code should work in the ANSI or ANSI+more modes.
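    The difference between ignoring an unknown tag and swallowing broken markup can be sketched like this. It is a toy in Python built on the standard `html.parser` module; the three-tag vocabulary, the rule that every known tag must be explicitly closed, and the names `TolerantChecker` and `check` are all assumptions of the sketch, not rules of any HTML version or the behaviour of any browser.

```python
from html.parser import HTMLParser

KNOWN_TAGS = {"p", "b", "i"}  # hypothetical tiny vocabulary for the sketch


class TolerantChecker(HTMLParser):
    """Skips tags it does not know, but rejects bad nesting of tags it does know."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open known tags
        self.ignored = []    # unknown tags that were skipped
        self.text = []       # character data, kept regardless of enclosing tag

    def handle_starttag(self, tag, attrs):
        if tag in KNOWN_TAGS:
            self.open_tags.append(tag)
        else:
            # Liberal acceptance: an unknown tag is noted and skipped.
            self.ignored.append(tag)

    def handle_endtag(self, tag):
        if tag not in KNOWN_TAGS:
            return
        if not self.open_tags or self.open_tags[-1] != tag:
            # Strict interpretation: known tags must nest properly.
            raise ValueError(f"badly nested </{tag}>")
        self.open_tags.pop()

    def handle_data(self, data):
        self.text.append(data)


def check(markup: str) -> tuple[str, list[str]]:
    """Return (text, ignored tags) or raise ValueError for malformed markup."""
    checker = TolerantChecker()
    checker.feed(markup)
    checker.close()
    if checker.open_tags:
        raise ValueError(f"unclosed <{checker.open_tags[-1]}>")
    return "".join(checker.text), checker.ignored
```

    Here `check("<p>Hello <blink>new</blink> world</p>")` succeeds and reports `blink` as ignored, whereas `check("<p><b>bold</p></b>")` raises `ValueError` because the markup is badly formed even though every tag is known.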

    I think the main issue here is one of a broad interpretation of 'accept'. I don't try to speak for Larry, but it's my understanding that he meant something less ambiguous than he is being taken to mean. After all, he's talking about being strict in what he emits. I think he's intending a conservative connotation for 'accept'. That's why I distinguish in this node between 'accept' and 'interpret'. All in all, I think you'll find that several of the other nodes in this thread are authored by those who read that quote from Larry the same way I do.

    mr_mischief
Re: Larry vs. Joel vs. Ovid
by perrin (Chancellor) on Nov 21, 2001 at 02:22 UTC
    Here's an example of a nice implementation of liberal input acceptance in perl (no offense, conservatives): the magic <> filehandle, which reads from the files named on the command line, or from STDIN if none are given. It's really handy for quick scripts, and if you don't want ambiguous behavior in a large project, you can just not use it.
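    For comparison outside Perl, Python's standard `fileinput` module follows a similar convention: with no explicit file list it reads the files named in `sys.argv[1:]`, falling back to standard input when there are none. The sketch below is only an illustration; `count_lines` is a hypothetical helper.

```python
import fileinput


def count_lines(files=None):
    """Count lines across the given files.

    With files=None, fileinput falls back to the names in sys.argv[1:],
    and to standard input when no names were given.
    """
    with fileinput.input(files=files) as stream:
        return sum(1 for _ in stream)
```

    A quick script can call `count_lines()` and work with either a file argument or piped input, while a larger program can pass an explicit list and avoid the ambiguity.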
Re: Larry vs. Joel vs. Ovid
by strredwolf (Chaplain) on Nov 21, 2001 at 05:57 UTC
    Not only is the browser the problem, but the web server is. Remember back when PNGs came to town. You didn't have every webserver add image/png automatically. You didn't have every sysadmin find out about it. It took a while, and there are some out there that still don't have it. Right now, BZIP2 format isn't being recognized, and is sent as application/octet-stream or, even worse, text/plain! (Try pulling a kernel, and it spews all over your screen)

    IE's being mistrusting of the webserver, rightfully so. But then it hides that fact, not letting programmers know about it. Naughty IE!

    --
    $Stalag99{"URL"}="http://stalag99.keenspace.com";

Re: Larry vs. Joel vs. Ovid
by andye (Curate) on Nov 21, 2001 at 16:17 UTC
    ++Ovid, interesting post.

    I have to pick you up, though, when you say that W3C standards are, in effect, a variant of Design by Contract. You know what you have to send and, in return, you know exactly what to expect. I don't think that this is true.

    Going back to the first HTML RFC (#1866) (and this is historic-interest-only) we see the usual RFC-style vocabulary:

    Here's part of the glossary:

    should
        If a document or user agent conflicts with this statement, undesirable results may occur in practice even though it conforms to this specification.

    So a browser could adhere to the spec without supporting every part of that spec. It's the same in most of the RFCs - there's shedloads of HTTP that hardly any applications support (e.g. PUT).

    To me this seems fairly widely divorced from the idea of 'Design by Contract'. Far from knowing what to send, and knowing what to expect back, it's more like knowing what you could send and what you might receive.

    The existing (stricter) standards have evolved from this permissive base - I wonder how much this permissiveness drove their widespread adoption?

    andy.

Re: Larry vs. Joel vs. Ovid
by dws (Chancellor) on Nov 22, 2001 at 01:30 UTC
    Here is the full quote, from Wall's Second State of the Onion Address.
    Of course, in Perl culture, almost nothing is prohibited. My feeling is that the rest of the world already has plenty of perfectly good prohibitions, so why invent more? That applies not just to programming, but also to interpersonal relationships, by the way. I have upon more than one occasion been requested to eject someone from the Perl community, generally for being offensive in some fashion or other. So far I have consistently refused. I believe this is the right policy. At least, it's worked so far, on a practical level. Either the offensive person has left eventually of their own accord, or they've settled down and learned to deal with others more constructively. It's odd. People understand instinctively that the best way for computer programs to communicate with each other is for each of the them to be strict in what they emit, and liberal in what they accept. The odd thing is that people themselves are not willing to be strict in how they speak, and liberal in how they listen. You'd think that would also be obvious. Instead, we're taught to express ourselves.

    Bleaoghgh%$%$#@!!!

    You may feel much better afterwards, but consider the poor guy next to you with the ruptured eardrum. Ruptured eardrums should be prohibited.

    On the other hand, we try to encourage certain virtues in the Perl community. As the apostle Paul points out, nobody makes laws against love, joy, peace, patience, kindness, goodness, gentleness, meekness or self-control. So rather than concentrating on forbidding evil, let's concentrate on promoting good.

    It seems to me that Larry is talking about how a then-popular viewpoint applied to interpersonal relationships within the Perl community, and that Joel conveniently took Larry's quote out of context. Perhaps he pulled the quote out of some random quote file.

Re: Larry vs. Joel vs. Ovid
by thraxil (Prior) on Nov 21, 2001 at 06:46 UTC

    i think they're both right.

    larry's point is quite valid when you're dealing with just two interacting components. particularly if one of those components is a human (you're asking for trouble if you give a human a text box for input and say something like "just don't put any semicolons in there or it will break things") or otherwise not under your control.

    joel simply points out that it doesn't scale very well to systems of more than two components.

    anders pearson

Re: Larry vs. Joel vs. Ovid
by mr_mischief (Monsignor) on Nov 28, 2001 at 08:08 UTC
    Let's remember that Larry wasn't the first to say this, which is part of why he says it's understood instinctively.

    Jon Postel writes in Internet RFC 793, defining the TCP protocol:

        2.10.  Robustness Principle
    
        TCP implementations will follow a general principle
        of robustness:  be conservative in what you do, be
        liberal in what you accept from others.
    


    This is a document from September 1981, and it lays one of the cornerstones for the modern Internet. Had this not been laid out in that document, TCP/IP might not have become any more standard than DECnet or AppleTalk.
Re: Larry vs. Joel vs. Ovid
by princepawn (Parson) on Nov 23, 2001 at 00:54 UTC
    It's funny. I was reading Bertrand Meyer's "Object-Oriented Software Construction" recently and he certainly emphasizes design by contract.

    I had worked up a post on this same subject, but did not feel I had it shaped up enough for posting.

    Anyway, do not forget Carp::Datum, as this module also supports design by contract, but in a different way than Class::Contract.
