Re: Code for elegance, code for clarity
by tilly (Archbishop) on Jan 12, 2004 at 16:12 UTC
|
Golf has no value in production code. Your "elegant" line is one that I would fail in a code review, and I think that most reasonable people would likewise.
On the home-grown csv_split. I detect the following bugs and various issues:
- You are way too fond of putting multiple lines on one. Hitting return is cheap, and makes code easier to scan.
- Walking through a string with s/(.)// is inefficient. (Walking the string is O(n*n).) Use m/(.)/g instead - that is what it is for. You can rewrite your check for "" with: /\G"/cg (see perlop). This makes for a O(n) algorithm. (Unless $` or $' were used anywhere in the program. In which case split might turn out faster.)
- A toggle is better written as $quoted = !$quoted; That obviously and unambiguously toggles. To verify what $quoted = 1 - $quoted; did I had to to scan the code to see if $quoted was set anywhere else. Why make the maintainer do that work when a clearer alternative exists?
- Your code will not handle returns embedded in a quoted field. If having the bug is OK in your application, document that limitation with a comment. Else fix.
- Speaking of comments, anyone who doesn't understand the CSV format reasonably well is likely to wonder whether you made some serious mistakes. A brief English description of the format would make sense.
And one overall comment. Coding is about getting the job done efficiently, while leaving yourself able to do so again in the future. It isn't about proving that you are l33t. It isn't about using as few lines as possible. It is about being effective.
Therefore don't think of style as something that you have to squeeze through an external code review. Think of it as something that you enforce with an internal code review because good style lessens how much work you have to do before you can productively work on your old scripts. | [reply] [d/l] [select] |
|
To me, elegant code shows both clarity and inspiration. It can't be a little unorthodox, but the logic must always be transparent.
I would likewise complain in a code review. Fortunately, I am not subjected to them, as I find the concept fairly insulting. I think sometimes it's better to just say "Here is how we want things to be coded" and then talk to the person if you think something can be done better. At work, I hate having to play defense -- "can't we all just get along?". I hate getting assaulted by groups of people at once, like when three people walk into your tiny cube and start looking over your shoulder. Yep, readability and standards are important -- but they can be taken too far.
But yes, one or two short line comments would do wonders in maintainablity. Proof of wizardry does nothing for me. I don't think the "elegant" example was elegant at all. I thought it was quite un-elegant. It did the job, it worked, and it wasn't horribly ugly -- but it was not elegant.
What I want to see is code that is instantly recognizable -- something that doesn't require me to stare at it for one or two minutes to understand. It's ok to require Perl knowledge to understand your code, but I'd rather see more baby-perl rather than golf-perl at work AND on CPAN.
Though, when reading all of this, don't lose the art -- the soul of programming can be lost through coding standards, "readability", and such. Look at how un-fun it can be to write Java when everyone is complaining about how you must always follow convention X-Y-Z or pattern A-B-C. There is some merit, but there is also too-much-of-a-good-thing. In this case, the near-military enforcement of some questionable policies has made the users of this language into a near parody of themselves. A fine example that "too-readable" and "too-orderly" means "way too much code to read and too much work to do".
ex:
Thingy.getThingy().getOtherThingy().GetDocument(EndOfRant(),EndOfRantChecker(false,true,true);
I guess all things can be overdone or underdone. Still though, back to the point, I don't think elegance has been achieved quite yet.
| [reply] |
|
| [reply] |
|
|
|
| [reply] |
|
'delirium' ne 'l33t'
Tilly, I apologize if you were tempted into more than a cursory glance at my sample code. As I told Podmaster, I wasn't intending to debate the merits of this code. If I wanted a thorough attack for CSV files, I'd go download a CPAN module.
First, ++ to you for /\G"/cg. One of the reasons I love posting things here is to increase my understanding of the language and learn more tricks such as this one that I had not stumbled across in the documentation before.
Let me respond to your individual points:
| [reply] [d/l] [select] |
|
| [reply] |
|
I am telling you that your "idea nuggets" would be easier for me to grasp and understand if you broke them out on separate lines. It is further my considered belief that if you tried it consistently for a period, you would find that your own comprehension was improved by that change. Which is why I suggested it.
The fact is that our brains find it easier to scan down than at moving left to right. This is why newspaper columns are 40 characters wide. Scanning left to right we have to parse the code to figure out where the divisions are. Scanning down we don't. Furthermore, scanning down and relying on indentation we can skip surprisingly large chunks without reading to find what we are looking for.
On the other changes, the big-O notation has to do with algorithmic scalability. When you s/(.)// you recopy the whole string. Copying a long string many times gets slow. m/(.)/g doesn't copy the string. This can matter.
On embedded returns, it is not obvious to me how to fix it. If you are handed one line at a time then you can't without having some way of requesting the next line. Also don't forget that the terminating newline of your row is not supposed to be part of the row of data that you are returning. What you should do with \n depends on whether you are in a quoted field.
And finally, m/\G"/cg is one that I found out about when I was sent a patch for Text::xSV to get around a regular expression bug. Yet another reason to expose your code for review from others. :-)
| [reply] [d/l] |
|
| [reply] |
|
Re: Code for elegance, code for clarity
by xenchu (Friar) on Jan 12, 2004 at 14:27 UTC
|
Just an opinion, but I would use the simplist, most straight-forward code I could for Production Code. The more understandable the code, the easier it is to maintain.
Production Code should have two major attributes:
1. It works.
2. It is easy to maintain.
Nobody cares about elegance at three in the morning when trying to fix someone's else's code. Clarity is always good. Make sure your documentation has lots of it.
xenchu
update: I really need to learn to type.
The Needs of the World and my Talents run parallel to infinity.
| [reply] [d/l] |
Re: Code for elegance, code for clarity
by dragonchild (Archbishop) on Jan 12, 2004 at 14:51 UTC
|
To me, elegance is where you have encoded as much information per N characters as I can easily read, plus having used features that strike me as appropriate, especially if I wouldn't have done it that way. A little thinking to comprehend the algorithm is acceptable, so long as I don't have to break out pen and paper to rewrite it. (That is obfu, not elegant. Obfus have their own beauty, but I wouldn't call them elegant.)
Your title implies that elegance and clarity are orthogonal qualities. I would say that clarity is an essential part of elegance. Another integral part of elegance is simplicity. Think about what you would consider elegant in a person. Some items that have classically defined an elegant woman, for example, could be:
- The line and angle of her neck
- The shape of her nose
- The way she moves
Elegance is about understated simplicity. So, to me, your first snippet isn't elegant. It's short, but I have to break out pen and paper to understand it. However, I would consider your second snippet elegant. It is also short, but it discusses one concept and does it in a simple fashion.
Your csv_split() ... It has elegant parts. Let me explain ...
sub csv_split {
# This is very elegant. It handles one concept and does so in a si
+mple and
# clear fashion.
# Two bugs:
# 1) use "return;", not "return undef;". The latter creates a o
+ne element
# list in list context.
# 2) I would use "shift // return;" (if you have the patch). Ot
+herwise,
# your function won't handle "0.0" correctly. If you don't h
+ave the
# patch, I would do a defined-check.
local $_ = shift || return undef;
my @array = ();
# This is not elegant - it's cute and clever. You're using 0 in bo
+th its
# numeric and boolean aspects. Now, you may not have a choice in t
+he matter,
# as Perl may not give you the tools to be elegant. (It might ...)
my $count = my $quoted = 0;
# The while definition is elegant - simple, understated, but clear
+. But, the
# use of $1 throughout the loop is definitely not elegant. It's ki
+nda ugly.
# Again, I'm not sure Perl gives you the tools to do the elegant t
+hing here.
while ( s/(.)// ) {
if ($1 eq ',' && !$quoted) { $count++; next; }
# The use of q// here is more elegante than '"', but it might
+have been
# better to provide a variable that hides the ugliness.
if ($1 eq q/"/) {
# The "$x = 1 - $x" idiom is very elegant. I've always lik
+ed it.
unless ( $quoted && s/^\"// ) { $quoted = 1 - $quoted; nex
+t; }
}
$array[$count] .= $1;
}
# Elegance might demand a wantarray solution here. It's unclear.
@array;
}
My version (with a bit more of what I consider elegant) would be (untested):
sub csv_split {
local $_ = shift // return;
my $is_quoted;
my $count = my @array;
while ( s/(.)// ) {
if ($1 eq ',' and not $is_quoted) {
$count++;
} elsif ($1 eq q/"/ and not ( $is_quoted and s/^\"// )) {
$is_quoted = !$is_quoted;
} else {
$array[$count] .= $1;
}
}
return if $is_quoted;
wantarray ? @array : \@array;
}
I believe my version also better delineates the flow control.
Update: Added tilly's toggle. I was trying to think of it, but couldn't, so left the x=1-x flip. The toggle is definitely more elegant.
------
We are the carpenters and bricklayers of the Information Age.
Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.
| [reply] [d/l] [select] |
Re: Code for elegance, code for clarity
by PodMaster (Abbot) on Jan 12, 2004 at 13:50 UTC
|
If I were you I wouldn't adopt that style.
Don't use next for nothing :) I'd write something like this
sub csv_split {
local $_ = shift || return undef;
my @array;
my $count = my $quoted = 0;
while ( s/(.)// ) {
if ( $1 eq ',' and !$quoted ) {
$count++;
} elsif( $1 eq q/"/ ) {
$quoted = 1 - $quoted
unless
$quoted and s/^\"//;
} else {
$array[$count] .= $1;
}
}
return @array;
}
Yes, I know I like whitespace (look at that cascading unless).
I also wouldn't bother to write && except when needed (nothing wrong with and)
update: also, when all you want is to match, use m//.
MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!" | I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README). | ** The third rule of perl club is a statement of fact: pod is sexy. |
| [reply] [d/l] |
|
Your code is not quite equivalent. Note that if the unless fails, his code will fall through to set $array[$count] while yours won't. I think that's a bug in the OP's code.
And for the OP, I think that reversing a Boolean is better expressed as $quoted = !$quoted rather than $quoted = 1 - $quoted.
There's no need to be destructive with s///; all you need is to walk the pattern with //g, and there's no need to special-case the empty quotes. More efficiency can be gained by reading strings of non-quote, non-comma characters. The code becomes very clear, though the regex is a little complicated.
sub csv_split {
local $_ = shift || return undef;
my @array = ();
my $count = my $quoted = 0;
while ( /([^,"]+|[,"])/g ) {
if ($1 eq ',' and !$quoted) { $count++; }
elsif ($1 eq '"') { $quoted = !$quoted; }
else { $array[$count] .= $1 }
}
@array;
}
The PerlMonk tr/// Advocate
| [reply] [d/l] [select] |
|
"next" makes a nice conceptual pocket (yes, I've long since been reduced to making up arbitrary terms). It trims down on the amount of nested conditionals, takes up less space, and makes (if not easier reading) for easier explanation to a third party.
I'm not intending to debate the merits of the above snippets. Sure, they could be better optimized, use more community-approved idioms, etc. I'm proposing instead that it's OK to code to your audience, and to code with the intention of explaining your algorithms to a group of peers when time is at a premium.
When left to my own devices without a timetable, my code becomes the nested and over-golfed hoo-hah posted at the top...which I enjoy tremendously.
By the way, where are you proposing I use m// as opposed to what is already in place?
| [reply] |
|
It's always interesting to see how different people write code. Except for the weird use of s/// I would have written that routine in much the same way as delirium. It probably has to do with how I "grew up" as a programmer. For instance, being used to C/C++, it's more natural for me to write && than and so I tend to prefer the former over the latter (except, of course, for instances where && has the wrong precedence).
| [reply] [d/l] [select] |
Re: Code for elegance, code for clarity
by ysth (Canon) on Jan 12, 2004 at 17:05 UTC
|
| [reply] [d/l] [select] |
Re: Code for elegance, code for clarity
by mcogan1966 (Monk) on Jan 12, 2004 at 18:17 UTC
|
And as I read these posts, I see above "Code for elegance, code for clarity".
To me, elegant code works correctly, and works efficiently. This is something that I've had to work hard at with my recent project. Taking code that works, and making it code that works faster is how I found my 'elegant' code. But now, I'm leaving it alone since there is little more that can be done without making some of the code unreadable. Someone else may have to look at this down the line, it shouldn't be impossible (just very difficult). | [reply] |
|
I think "elegant" is often a term for code that's clear to good enough programmers, and often rather unclear to less good programmers. However, really elegant code is good code to study to improve your own abilities.
Then again, my degree is in mathematics, where "elegance" is very often not the clearest way to express things.
| [reply] |
|
On the mathematics line, some physicist... (I forget the name Kinku? First name was Michael I believe), had an interview/show on TechTV about his perceptions of finding a Grand Unified Theory.
He said routinely rejected theories and equations because they were too complex, every time saying God's universe would be governed by beautiful (&simple*) equations. Though I think String Theory is off in unprovable never-never-land, there was a lot that I learned from that statement.
* = By simple, I mean short an elegant, without a lot of fudge constants and operators, I'm not saving the Creator would be afraid of a flux integral or something like it :)
| [reply] |
|
Re: Code for elegance, code for clarity
by delirium (Chaplain) on Sep 21, 2019 at 14:47 UTC
|
For the record, I have no idea what I was going on about 15 years ago. This whole meditation seems pretty silly. | [reply] |
|
Silly? It's fantastic! Look at all the replies and well mentioned advice. Some come off as harsh, but that's because we are imperfect beings struggling to express ourself in a language that is very similar, but not quite Perl (and by that, I mean the English language).
I'm writing this while sleepy. Couldn't read your code (well, it is not UNreadable, more like riding in a car on a 15km maximum zone). I guess that is the part of the language what drove people to Python.
Please humor me, and answer me this question: How was your growth process over the last 15 years. Did you loose your perl-golf in lieu of readable code (have you become you a well oiled cog in the machine instead of a perl-fu rebel?).
Personally I've lowered the perl-fu in my code, except for oneliners. There I pull open all registers to keep it short.
And sometimes... sometimes I open NYTProfiler and discover that that other construct I don't use nor like is faster... and then well, it takes a while, but I start using it, begrudgingly, until I forget it was not "mine".
edit: Oh, and perltidy. I code like a madman then use perltidy to get something readable. Madman you say? Yes I mix spaces and tabs... and other things I'm not proud of.
| [reply] |