in reply to Parallel structures are NOT maintainable
in thread using references as keys in a hash.

If two data structures are related, make that relationship OBVIOUS.

I agree with that.

Parallel data structures are not obviously related.

It seems obvious to me that if you see them assigned together, they're related. I did say it was a matter of style, however, and I expected some people to have a strong preference for the nested structures. I do use the nested structures in some cases, when what I want to do is a little more complex, or if there are multiple levels of nesting, or some other good reason. And I gave the example of using nested structures first. Don't read more into my statement about parallel structures than is there.

In fact, it's a maintenance nightmare. Let's set up a thought experiment.

Thought experiments can lead you to conclude that a heavier object will always fall faster than a lighter one. (They can also be useful, but you have to take them cum grano salis.)

I am your maintenance programmer.

Oooh, oooh, can I imagine that I named all my variables with single characters and used recursive nested evals wherever possible? ;-)

I come along and am told there is a bug in the fubar() function and I need to fix it in 24 hrs. I go and realize that I need this value to make it right. I don't know that the value is in this fourth data structure. But, I need to fix fubar() right now. So, I add some crazy structure to get that fourth value into fubar(). The code is now worse.

The code will always be worse when someone who is not familiar with the code attempts to fix something right now without understanding how it works. No amount of wonderful data structure will change that. (This is not an argument for bad data structures; I'm merely pointing out that no data structure can prevent the scenario you describe.)

Furthermore, unless I'm missing something, there's nothing magic about the syntax of nesting that will alert the unfamiliar programmer to the existence of more data than is being used in the piece of code he's viewing. A simplistic example...

    sub foobar {
        my ($object, $result);
        foreach $object (@_) {
            $result .= "Title:\t"  . $$object{title}  . "\n"
                     . "Author:\t" . $$object{author} . "\n"
                     . "-------------------------------\n";
        }
        return $result;
    }

Will the programmer know to look in $$object{ISBN} for the piece of data he needs to fulfill the change request? Maybe, but if so it's not any more obvious than (with parallel structures) looking in $isbn{$key}. If he reads through the well-commented code, he'll find it either way.
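For comparison, here is a minimal sketch of the same report written against parallel hashes. The hash names (%title, %author, %isbn), the keys, and the sample values are all hypothetical and chosen only for illustration. Note that nothing inside the sub mentions %isbn, just as nothing inside the nested version mentions the ISBN field.

```perl
use strict;
use warnings;

# Hypothetical sample data: three parallel hashes sharing the same keys.
my %title  = (b1 => 'First Book', b2 => 'Second Book');
my %author = (b1 => 'A. Author',  b2 => 'B. Author');
my %isbn   = (b1 => 'ISBN-1',     b2 => 'ISBN-2');

# Same report as foobar(), but each field lives in its own hash.
sub foobar_parallel {
    my @keys   = @_;
    my $result = '';
    foreach my $key (@keys) {
        $result .= "Title:\t$title{$key}\n"
                 . "Author:\t$author{$key}\n"
                 . "-------------------------------\n";
    }
    return $result;
}

print foobar_parallel(sort keys %title);
```

Adding the ISBN line means reaching for $isbn{$key} here and $$object{ISBN} in the nested version; in both cases the maintainer has to learn from somewhere else that the field exists.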

Of course, if the code is more complex and has a larger number of fields, then the nested structure can be traversed more efficiently, avoiding the bug in the first place...

    sub foobar {
        my ($object, $result, $f);
        foreach $object (@_) {
            foreach $f (sort @fields) {
                $result .= "$f:\t" . $$object{$f} . "\n";
            }
            $result .= "-------------------------------\n";
        }
        return $result;
    }

But the original poster is talking about what is currently a single hash storing a single value for each key, and I was suggesting also storing the unstringified reference used to create the hash key. That's a total of two fields: not complex enough to really need the nested structure, IMO. Yes, the nested structure will solve the problem nicely, but the parallel structure will also work.
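The two-field case can be sketched both ways. This is only an illustration: the variable names (%value, %ref_for, %entry) and the sample referent are assumptions, not the original poster's code. The point it relies on is that a reference used as a hash key is stored as a plain string, so the real reference has to be kept somewhere if it is needed later.

```perl
use strict;
use warnings;

my $thing = { name => 'widget' };   # hypothetical referent used as a key

# Parallel structures: two hashes keyed by the stringified reference.
my (%value, %ref_for);
$value{$thing}   = 42;
$ref_for{$thing} = $thing;   # keep the real reference; the key is only a string

# Nested structure: one hash whose entries hold both fields.
my %entry;
$entry{$thing} = { ref => $thing, value => 42 };

for my $key (keys %value) {
    # $key looks like "HASH(0x...)" and cannot be dereferenced under strict,
    # so both versions go through the stored reference instead.
    print "parallel: $ref_for{$key}{name} => $value{$key}\n";
    print "nested:   $entry{$key}{ref}{name} => $entry{$key}{value}\n";
}
```

Both loops print the same name and value; the difference is whether the two fields travel together in one entry or sit in two hashes that must be kept in step.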

Note that I'm not saying that parallel structures are better, or even that they're as good in every case; I only said that which you use is a matter of style. The program will get the same result either way.


sub H{$_=shift;while($_){$c=0;while(s/^2//){$c++;}s/^4//;$ v.=(' ','|','_',"\n",'\\','/')[$c]}$v}sub A{$_=shift;while ($_){$d=hex chop;for(1..4){$pl.=($d%2)?4:2;$d>>=1}}$pl}$H= "16f6da116f6db14b4b0906c4f324";print H(A($H)) # -- jonadab

Re2: Parallel structures are NOT maintainable
by dragonchild (Archbishop) on Feb 24, 2003 at 16:13 UTC
    A few comments:
    1. I'm merely pointing out that no data structure can prevent the scenario you describe.

      No, but it can mitigate it. I am saying that, all things being equal, nested data structures contain more information intrinsic to their structure than parallel ones do.

    2. Will the programmer know to look in $$object{ISBN} for the piece of data he needs to fulfill the change request?

      Yes, he/she will. Why? Because they can go look at the definition for $object and see that there is an ISBN member. (You do have object definitions, right?) Failing that, Data::Dumper is one of the maintenance programmer's best friends. But, Data::Dumper doesn't know about that parallel data structure, does it? If it doesn't, it takes more time for me to find the right solution.

      (Nit: Use $object->{ISBN} instead ... $$object{ISBN} can lead to subtle bugs. Another maintenance headache, not a style issue.)

    3. The program will get the same result either way.

      This is a horrible statement, especially in this argument. Not only is it a tautology, but I don't care what hoops the computer has to go through to understand what I want it to do. COMPUTER RESOURCES ARE (NEARLY ALWAYS) CHEAPER THAN HUMAN RESOURCES.

    Let me put that last point another way. My programming services cost more per week than a top-of-the-line linux server. People like merlyn and others cost at least twice that. Is it worth it to you for me to spend 40 hours figuring out how to save a meg of RAM?

    It is worth a lot of money to write code that encapsulates as much information as possible in as many ways as possible that will be guaranteed to change as the code changes. Parallel data structures are a matter of style, yes. They are a poor choice of style because they will cost more money in maintenance than nested structures.
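The Data::Dumper remark in point 2 can be sketched as follows. The record contents and the parallel hash names are hypothetical; Dumper and $Data::Dumper::Sortkeys are part of the core Data::Dumper module.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # stable key order in the dump

# Nested: one record carries every field, including ISBN.
my $object = { title => 'First Book', author => 'A. Author', ISBN => 'ISBN-1' };

# Parallel: the ISBN lives in a separate hash that %title knows nothing about.
my %title = (b1 => 'First Book');
my %isbn  = (b1 => 'ISBN-1');

my $nested_dump   = Dumper($object);    # shows ISBN, author and title together
my $parallel_dump = Dumper(\%title);    # shows titles only

print $nested_dump, $parallel_dump;
```

Dumping the record a sub receives reveals the ISBN field in the nested layout. In the parallel layout the dump of %title gives no hint that %isbn exists, so the maintainer has to find it some other way.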

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

      Nit: Use $object->{ISBN} instead ... $$object{ISBN} can lead to subtle bugs.

      I meant to ask before, and forgot: can you elaborate on this? Assuming for the moment that $object is a real reference here, not a "symbolic reference" string (we'll pretend for the moment that I was using strict; if the program were complex enough to span more than about half a dozen subroutines I would be), what subtle bugs would (or could) my syntax lead to? At first I thought you meant that someone might write $object{foo} by mistake instead of $$object{foo}, but then I realised warnings or strict either one would catch that, so you must be talking about something more subtle... but what?


        There are a few reasons.
        1. You're talking about objects. Objects have methods. You cannot use the indirect dereference with methods. So, now you have one syntax for getting values from the object and another to call methods. Much better is to have one syntax for both. (This is why Perl4 uses that indirect dereference and Perl5 uses the arrow. But, Perl5 is backwards compatible ...)
        2. $$object is the syntax for soft references. That is the only way to do soft references. So, whenever I (your doughty maintenance programmer) see that syntax, I start thinking about why you want to have a soft reference there. But, you don't have a soft reference there. You're using cargo-cult programming learned from someone who never made the jump from Perl4 to Perl5. This is confusing. You could have made it clear that there is no soft reference.

          (Soft references are not always bad ... but you shouldn't use them until you know why they are always bad.)

        So, it's not a matter of things will necessarily break. In fact, I work on a 250K line system where that syntax is used everywhere. (Lots of legacy code. *shudders*) However, I wouldn't trust it. It's safer to do things the cleaner way.
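A short sketch of how the two spellings behave with a real (hard) reference under strict. The sample data is hypothetical, and the array case at the end is one possible reading of "subtle bugs", not something the thread spells out.

```perl
use strict;
use warnings;

my $object = { ISBN => 'ISBN-1' };   # hypothetical record

# For a hard reference these three spellings read the same element.
my $plain  = $$object{ISBN};
my $braced = ${$object}{ISBN};
my $arrow  = $object->{ISBN};

# With an array of records, the prefix form needs braces to say what is meant.
my @objects = ({ ISBN => 'ISBN-2' });
my $via_arrow  = $objects[0]->{ISBN};      # element 0 of @objects, then its ISBN
my $via_braces = ${ $objects[0] }{ISBN};   # same element, prefix form with braces

# Writing $$objects[0]{ISBN} instead would mean ${$objects}[0]{ISBN}: it treats
# a scalar $objects as an array reference. No such scalar is declared here, so
# under strict that line would not compile.

print "$plain $braced $arrow $via_arrow $via_braces\n";
```

The arrow form reads left to right and matches method-call syntax ($object->method), which is the consistency argument made in point 1.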

        Update: Fixed mistaken notion re: symbolic references. (Thanks, tye!)


      Is it worth it to you for me to spend 40 hours figuring out how to save a meg of RAM?

      Maybe, maybe not. A meg of RAM may not seem like much, but things tend to add up. If it's a long-running process, a meg of RAM every now and then does add up. And suddenly, you reach the memory limit your OS allows for a process, requiring a restart of the process every two to three days. You might say, so what? But some business models just don't accept that.

      Abigail

      Let me put that last point another way. My programming services cost more per week than a top-of-the-line linux server. People like merlyn and others cost at least twice that. Is it worth it to you for me to spend 40 hours figuring out how to save a meg of RAM?

      Huh? Where did that come from? Forty hours to find and fix a single small issue? If I spent an entire five-hour shift fixing some little thing like that, I'd feel like I didn't get anything done that day. In a thirty-hour work week, I could rewrite the application from scratch and have time left over to unstick printers, bug the APCC tech support people about our ongoing PowerChute issue, teach an Introduction to the Internet class to a group of senior citizens, reboot my coworkers' Windows systems for them as necessary ("I restarted it. It should be better now."), run a couple of custom reports for my boss (and write one-off Perl scripts to turn them into meaningful data), and redo the stylesheets for the cgi scripts in question just because I felt like it.

      Either we're talking about programs so different in size that it's not remotely meaningful to talk about them in the same conversation (as I suspected when I read your previous message upthread), or else one of us is seriously mispaid (since I don't make anything like the kind of wage you are talking about).

      I said that for complex situations the nested structures are better, but it seems to me that deeply complex programs are the only kind you are willing to concede might ever exist. Take a deep breath; some of us do simple stuff sometimes.


        Actually, it's both - I do work primarily on large and very-large systems. And, I am seriously overpaid by my current employer. :-)

        And it came from the fact that I have been told to "save a Meg of RAM, no matter what it took". *shrugs* Reminded me of the Dilbert cartoon where the little bald guy says "I'm going to write me a minivan this week!"
