Once when I was six years old I saw a magnificent picture in a book, called True Stories from Nature, about the primeval forest. It was a picture of a boa constrictor in the act of swallowing an animal. ... In the book it said: "Boa constrictors swallow their prey whole, without chewing it. After that they are not able to move, and they sleep through the six months that they need for digestion." I pondered deeply, then, over the adventures of the jungle. And after some work with a colored pencil I succeeded in making my first drawing. My Drawing Number One. It looked something like this:
[Drawing Number One: an outline that the grown-ups take for a hat]
I showed my masterpiece to the grown-ups, and asked them whether the drawing frightened them. But they answered: "Frighten? Why should any one be frightened by a hat?" My drawing was not a picture of a hat. It was a picture of a boa constrictor digesting an elephant. But since the grown-ups were not able to understand it,...

The Little Prince, Chapter 1 by Antoine de Saint-Exupéry

I must be the snake the little prince was thinking of because I seem to be in the habit of swallowing elephants. I often find myself needing to learn a complex system inside and out very quickly. Sometimes, it is because I have a new client. Especially in my early days I would often get invited to lead the design on a half-finished project after a series of the big guys (Accenture, etc.) had successfully botched the project. Until then, companies don't really look beyond the obvious and safe sources of consultants. More recently, I have needed either to evaluate third-party software systems for in-house use or to assess our own code for refurbishment. And then there are volunteer projects.

Over the course of time I've evolved a strategy for working through complex systems rather quickly. I won't say it isn't a lot of work (it is), but it keeps me from going in endless circles, so that at least the work moves me forward.

I'd be interested in knowing more about how other monks go about learning a new system quickly. Not everyone learns the same way. Also if you see any non sequiturs or an obvious omission, I'd appreciate the feedback. Writing something like this up feels a bit like trying to explain how to tie a shoe. It is all too easy to take a crucial step for granted and leave it out. I also have an ulterior motive. pmdevils are also in the position of trying to swallow elephants when they join.

The ten steps can be summarized as follows (they are explained in more depth below):

  1. Gather together whatever documentation there is
  2. Experiment with the front end
  3. Study permanent data streams and persistent data
  4. Explore the class/type systems
  5. Understand where the code lives
  6. Scan and categorize the codebase
  7. Study back end CRUD operations
  8. Study front end CRUD operations
  9. Study permissions/security infrastructure
  10. Explore a well defined functional area to see how the system works in a specific case.

This is, of course, an iterative process and it often involves a lot of backtracking as well. Often the first pass through a step results in more questions than answers. Then I go on to another step, and find an answer to something I couldn't figure out in the previous step. If the answer is important enough, I may go back and redo all or part of the previous step.

Although the first step is "gather together documentation", the process described below will work even if documentation is sketchy and you have to manually go through code and database schemas. Documentation makes the process much easier (if it is correct), but it isn't necessary to the learning process.

Figuring out how much detail to do at any one step of the process is a bit of an art. Usually I try to be as superficial as possible. For each stage I try to learn enough to categorize things in some sensible way that tells a story about data, behavior or the interaction between them. Then I move on to the next step. When the fog to understanding ratio gets out of hand, I backtrack one or two steps and go into more detail until the fog clears.

As I work through the above list I usually keep copious notes and organize them as I go. Writing out my answers to the questions at each step helps me identify unanswered questions. I also find I'm accumulating knowledge much faster than I can absorb it so the notes act as a memory bank, especially if they are well organized. I frequently organize and reorganize the notes as I go. The act of organizing them also helps me remember more. Finally, if the system is poorly documented, the notes become a first cut at improved documentation.

As one skims through the list, the first thing that might stand out is that a third of the steps are data centric, including the first detailed analysis step (study data streams and persistent data). For many programmers starting with data may seem counterintuitive. Programming is about making data do something, and code is where the action is. But data puts an upper bound on what the code can do, so it provides a way of focusing attention and understanding the scope of the system. It is also the easiest discrete nameable thing to get a handle on. It acts like the end of a thread used to unravel a ball of knots.

Another thing that might stand out to the OO fans reading the list above is that there is no mention of roles or behaviors or all the other jargon of the OO world. I find that interesting because I've been doing OO since the late 80's and it is virtually impossible for me to design software that is more than 500 lines without ending up having at least a few classes. Even before C++ became popular I was organizing C functions associated with specific data structures into dedicated files and designing dispatch handlers for the data structures. It just seemed like the right way to do things.

I'm guessing the reason for this is that data is a long hand that reaches throughout a system. Or maybe it would be better described as blood and oxygen. Every aspect of the system needs it regardless of its role, front end or back end.

Classes and objects are marvelous ways to organize code. They can also be a great way to get a user to cough up requirements. Users often have a hard time talking about data apart from the things they do with the data. However, code and end-user requirement gathering are only two of the many ways a system needs to be categorized, sliced and diced in order to understand it. On the front end we need to understand workflows. On the database we need to understand normalization or we won't really get the benefit of our database's SQL engine. Somewhere in between we need to understand aspects: large swaths of functionality that are content independent.

To fully understand a system it is also important to get a handle on how all of these different ways of categorizing data, code, and end user functionality relate to one another. The learning strategy elaborated below tries to help in that process. Whether it succeeds is for you to decide. But if it seems counterintuitive, at least consider trying it the next time you need to learn a system quickly. You might be surprised at the results.

I apologize in advance for the list-like nature of this elaboration. Partly I don't have the time to expand it fully right now. Also, I fear turning it into a narrative with examples would likely stretch this node to book length. Hopefully though I will have at least raised some questions and pointed out things to look for that might be helpful for others.

Step 1: Gather together documentation

You may have documentation. You may not. It may be up to date. It may not.

Even out of date documentation or incorrect documentation can be helpful if it gives you a sense of the design philosophy or a road map through the code. My first step is to skim through the documentation and make some sort of assessment of what is there and how much I trust it. However, unless the documentation is amazingly clear and well written, when first learning a system I usually take the documentation with a grain of salt. I like to get down and dirty into the guts of things and see how it all works with my own eyes.

Even when I trust the documentation and it seems relatively complete, I still use the remaining steps as a road map through the documentation and a checklist for my understanding of the system.

Step 2: Experiment with the front end

From the user's point of view, what does the system do? Do this as a brief survey to get a "feel" for the application or system. The goal for this step is to provide context for the more detailed study of how the system is implemented.

Step 3: Study permanent data streams and persistent data
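
As an illustration of what a first, superficial pass over persistent data might look like, here is a minimal sketch that lists every table and its columns. It assumes the system under study keeps its data in a SQLite database, which the post does not say; the function names `describe_schema` and `demo` and the two example tables are hypothetical. Other databases expose the same information through their own catalog views.

```python
import sqlite3


def describe_schema(conn):
    """Return {table_name: [(column_name, declared_type), ...]} for a SQLite database."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


def demo():
    """Build a throwaway in-memory database (hypothetical tables) and describe it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    return describe_schema(conn)


if __name__ == "__main__":
    for table, columns in demo().items():
        print(table, columns)
```

The resulting table-to-columns map is the kind of raw material that can go straight into the notes and be regrouped later as the story about the data becomes clearer.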

Step 4: Explore the type systems

Step 5: Get to know where the code lives and how it is processed

Step 6: Scan and categorize the codebase

If you are lucky a certain amount of categorization may have been done for you. If not, the only way to do this is to look at each file! Here are some things I do to speed up the process:
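
One possible speed-up, offered as an illustration rather than as the author's own list: let a script do the first rough sort before you open a single file. The sketch below groups every file under a directory by its extension. The function name `categorize_files` and the choice to skip `.git` and `.svn` directories are assumptions made for the example.

```python
import os
from collections import defaultdict


def categorize_files(root):
    """Group files under root by extension: {extension: [paths relative to root]}."""
    groups = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune version-control metadata in place so os.walk never descends into it.
        dirnames[:] = sorted(d for d in dirnames if d not in {".git", ".svn"})
        for name in sorted(filenames):
            # Files without an extension are collected under "(none)".
            ext = os.path.splitext(name)[1].lower() or "(none)"
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            groups[ext].append(rel)
    return dict(groups)


if __name__ == "__main__":
    for ext, paths in sorted(categorize_files(".").items()):
        print(f"{ext}: {len(paths)} file(s)")
```

Extension is only a crude first key. Once the buckets exist, it is easier to see which ones deserve a file-by-file look and which can be labeled and set aside.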

Step 7: Study backend CRUD operations
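
A rough way to start on this step, assuming the back end talks to a SQL database through statements embedded in the source, is to scan the code for SQL verbs and record which tables are created, read, updated and deleted. The sketch below is a heuristic, not a SQL parser: the regular expressions, the `crud_matrix` function and the `SAMPLE` text are all hypothetical, and statements built dynamically or containing joins and subqueries will be missed or only partly captured.

```python
import re
from collections import defaultdict

# Each pattern captures the table name that follows the SQL verb.
CRUD_PATTERNS = {
    "C": re.compile(r"\bINSERT\s+INTO\s+(\w+)", re.IGNORECASE),
    "R": re.compile(r"\bSELECT\b.*?\bFROM\s+(\w+)", re.IGNORECASE | re.DOTALL),
    "U": re.compile(r"\bUPDATE\s+(\w+)", re.IGNORECASE),
    "D": re.compile(r"\bDELETE\s+FROM\s+(\w+)", re.IGNORECASE),
}

# Hypothetical source text standing in for a file from the system under study.
SAMPLE = '''
my $sth = $dbh->prepare("SELECT id, total FROM invoice WHERE customer_id = ?");
$dbh->do("INSERT INTO invoice (customer_id, total) VALUES (?, ?)");
$dbh->do("UPDATE customer SET name = ? WHERE id = ?");
$dbh->do("DELETE FROM invoice WHERE id = ?");
'''


def crud_matrix(source_text):
    """Return {table: operations}, with operations a subset of "CRUD" in that order."""
    ops = defaultdict(set)
    for letter, pattern in CRUD_PATTERNS.items():
        for table in pattern.findall(source_text):
            ops[table.lower()].add(letter)
    return {
        table: "".join(letter for letter in "CRUD" if letter in found)
        for table, found in sorted(ops.items())
    }


if __name__ == "__main__":
    for table, operations in crud_matrix(SAMPLE).items():
        print(f"{table}: {operations}")
```

Gaps in the resulting matrix are often as informative as the entries: a table that is read but never written in the scanned code suggests that some other process or system populates it.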

Step 8: Study front end CRUD operations

Taking the same types as before, study how those types are displayed to the user.

Step 9: Study permissions/security infrastructure

Step 10: Study a well defined data collection/functional area

Nuf said.

Best, beth


In reply to Swallowing an elephant in 10 easy steps by ELISHEVA
