Programmers are multi-faceted beasts. We write instructions for the dumbest person to follow, but that's only the tip of the iceberg. We listen to users (a group who never knows what they want), we placate managers (a group who wants everything right now), and we exercise a host of other skills. But, most of all, we solve problems. brian_d_foy has posted a great discussion of how to solve problems, as have other monks. Here's mine, using as an example how I found and fixed a bug last night.

The problem: I am using DBIx::MyServer to simulate a MySQL database for a project at work. We need to be able to connect to it from Perl, Java, and C++. Perl and C++ were having no problems. Java, through JDBC, was just giving us fits. It didn't seem to be using the right protocol, which made no sense.

  1. If something doesn't make sense, you're not seeing something.

    I know next to nothing about Java or JDBC, but as the author of our MySQL emulator, I'm the one who has to solve it. I grab a copy of the Java code from SVN and try to run it. Of course, I immediately run into problems because the classpath is completely wrong.

  2. Replicate the bug in your environment.

    A few go-rounds with the Java developer and he finally gives me a line to run on my Mac. Yep, I see the problem.

  3. Find and ask an expert.

    This is where OSS really shines - I can go email the author of the module. He replies pretty quickly. Unfortunately, the reply is "I never tried JDBC. But, if you get me a minimal testcase, I'll take a look."

  4. Build a minimal testcase.

    Ah-ha! The next step is to build a minimal testcase. There are several reasons CPAN authors ask for them. The first is that it gives them the ability to replicate the problem. The second is that (as we're about to see) the process of generating the minimal testcase can itself solve the problem.

    I start building the minimal testcase. The first thing is to remove all of our proprietary code. So, I create a Java class that creates a connection and reports success or failure, then exits. I also create a Perl script that runs a minimal DBIx::MyServer server (taken almost verbatim from the distro). It works.
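    (For reference, the Java class in the real testcase did nothing more than this Perl/DBI sketch does: open a connection, report success or failure, and exit. The DSN here is made up; point it at whatever host and port the emulator is listening on.)

        #!/usr/bin/perl
        use strict;
        use warnings;
        use DBI;

        # Hypothetical DSN; in the real testcase this pointed at the
        # DBIx::MyServer emulator rather than a stock mysqld.
        my $dsn = 'dbi:mysql:database=test;host=127.0.0.1;port=23306';

        my $dbh = eval {
            DBI->connect( $dsn, 'user', 'password',
                          { RaiseError => 1, PrintError => 0 } );
        };

        if ($dbh) {
            print "connect OK\n";
            $dbh->disconnect;
        }
        else {
            print 'connect FAILED: ', ( $@ || $DBI::errstr || 'unknown error' ), "\n";
        }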

  5. Start adding things to the minimal testcase until things break.

    Normally, people talk about removing stuff from the broken code until it starts working. But, in this case, I had a working testcase, so I needed to add stuff until it broke. I replaced the stock DBI call with the call to my emulator, and it quickly broke.

  6. Verify what breaks.

    I took about 20 minutes to verify that the addition of that one line was what actually caused the breakage. Most people are so quick to say "Oh, I saw it once - that must be the problem!" and I'm just as bad as anyone. That time taken to truly verify the breaking piece was critical because it gave me another lead.

  7. Start drilling down to find the breaking bit.

    After some faffing about, I learned more about how MySQL negotiates a connection than I ever wanted to know. Namely, there are two protocols that are completely and utterly incompatible. The initial problem, if you recall, was that JDBC was using the older protocol even though DBIx::MyServer was requesting the newer one. JDBC, it turned out, apparently didn't care what capability flags the server set in its handshake. Instead, it preferred to rely on an arbitrary text string that my emulator wasn't setting properly. A quick hard-code and we're good.
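    (For anyone who hasn't had to stare at this: the server's initial greeting carries both a set of capability flags and a free-form version string. The snippet below is a hand-rolled Perl sketch of the leading fields of a protocol-10 greeting packet, not DBIx::MyServer's actual code, and my reading is that the "arbitrary text string" in question is that server version field, so the connector only switches to the newer protocol when the string looks like a 4.1-or-later server.)

        use strict;
        use warnings;

        # Sketch of the leading fields in MySQL's initial handshake (protocol 10).
        my $CLIENT_PROTOCOL_41 = 0x0200;    # capability bit: "I speak the 4.1+ protocol"

        my $server_version = '5.0.45';      # free-form text; the string the client may parse
        my $thread_id      = 1;
        my $scramble       = 'abcdefgh';    # 8 bytes of auth seed (dummy value here)
        my $capabilities   = $CLIENT_PROTOCOL_41;   # server asks for the newer protocol
        my $charset        = 8;             # latin1_swedish_ci
        my $status         = 2;             # SERVER_STATUS_AUTOCOMMIT

        my $greeting = pack 'C Z* V a8 x v C v',
            10,                  # protocol version byte
            $server_version,     # NUL-terminated version string
            $thread_id,
            $scramble,
                                 # one filler byte comes from the 'x'
            $capabilities,       # lower 16 bits of the capability flags
            $charset,
            $status;

    (The real packet continues with the upper half of the capability flags and the rest of the auth data; the sketch stops where the interesting bits end.)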

  8. Assume that professionals know what they're doing, but don't count it as gospel.

    My JDBC connector came from MySQL itself. I'm sure they had a good reason for doing what they did, but it was certainly annoying. :-)


My criteria for good software:
  1. Does it work?
  2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

Re: Solving problems and fixing bugs
by Your Mother (Archbishop) on Mar 22, 2008 at 23:52 UTC
    If something doesn't make sense, you're not seeing something.

    I came to understand that one only after banging my head against it a dozen times. Usually it's a situation where the code appears to be perfect but won't run, so I spend a few hours tweaking the perfect code into other equally valid versions, trying to force it to make sense, until I snap out of this moronic fog, check the config files or the environment or whatever, and find that someone turned on something crazy somewhere or XYZ got silently upgraded to the wrong version.

    It seems stupid, but I finally realized that if it looks right but runs wrong, I almost never have to fix it; I just have to look elsewhere. The amount of time I've saved lately with this is *slowly* adding up to the amount of time I've wasted in the past fighting against it.

      Amen to that. This is especially true if I'm using a language/framework I'm not super familiar with; I keep second-guessing my syntax or rewriting stuff that was actually functioning in my search to locate the problem. This can be good for learning, but it's a bummer for beauty sleep (and heaven knows I could use more of that!).


      I'm a peripheral visionary... I can see into the future, but just way off to the side.

      Oh, Hubris, thy name is Yer Mamma. Yesterday I couldn't get one of my computers to send a file over wireless to another which had the right printer attached. So I tried the laptop instead. Then someone else's laptop. Then started fiddling with firewalls. Then started passing from A --> B to try to get to C. Then finally went and checked the unreachable computer after an hour of this. Its connection had dropped. L.

Re: Solving problems and fixing bugs
by ack (Deacon) on Mar 25, 2008 at 03:51 UTC
    If something doesn't make sense, you're not seeing something.

    Oh my goodness! I keep telling myself this very thing; then I run into a challenge, forget the lesson, beat myself up for days, and then have an epiphany, relearn the lesson, and finally get down to solving the challenge.

    Build a minimal testcase.
    Start adding things to the minimal testcase until things break.
    Verify what breaks.

    Ah yes! This is the very foundation for how we try to be careful about building up our test suites. Start simple, add complexity little by little, till it 'breaks'. The more we follow this, the better our testing and the lower the cost and time. Overly complex tests with no particular problem in mind are what our management wants, but they prove repeatedly to be excessively time-consuming and costly. But here's the rub (as they...whoever 'they' are...say): Why is it so hard to figure out what the 'next thing to add to try to make it break' should be?
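    (As a concrete illustration, and not anything from the OP's actual suite: this is the shape such an incrementally grown test file tends to take in Test::More, each assertion stacking one small assumption on top of the last. The DSN is made up.)

        use strict;
        use warnings;
        use Test::More tests => 3;
        use DBI;

        # Hypothetical DSN pointing at whatever server/emulator is under test.
        my $dbh = DBI->connect(
            'dbi:mysql:database=test;host=127.0.0.1;port=23306',
            'user', 'password',
            { RaiseError => 0, PrintError => 0 },
        );

        ok( $dbh, 'can connect at all' );    # the simplest possible check

        SKIP: {
            skip 'no connection, nothing more to test', 2 unless $dbh;

            my ($one) = $dbh->selectrow_array('SELECT 1');
            is( $one, 1, 'a trivial query round-trips' );    # one small step up

            my ($ver) = $dbh->selectrow_array('SELECT VERSION()');
            like( $ver, qr/^\d+\.\d+/, 'server reports a plausible version' );
        }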

    Most people are so quick to say "Oh, I saw it once - that must be the problem!" and I'm just as bad as anyone. That time taken to truly verify the breaking piece was critical because it gave me another lead.

    I've spent many a long night troubleshooting this tester tendency instead of troubleshooting what we should be troubleshooting...the actual system problem.

    The OP's wisdom is a nice summary that I, for one, find most satisfying.

    Thanks, dragonchild.

    ack Albuquerque, NM
Re: Solving problems and fixing bugs
by alpha (Scribe) on Mar 24, 2008 at 14:15 UTC
    Building a testcase can sometimes be way too difficult to begin with (go test a gtk+ app or something even more scary). Step 7 seems most important anyway.
      If you can't see your way to building a testcase, then your code was built wrong. GTK+ apps, to use your example, are testable using any number of scriptable GUI testers (such as Rational Rose or, on OSX, Automator). Webapps are similarly testable using Selenium. Come back when your problem deals with threads and random data.

      My criteria for good software:
      1. Does it work?
      2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?