Re: (OT) Generated Code vs. Libraries

Re^2: (OT) Generated Code vs. Libraries by William G. Davis (Friar) on Oct 22, 2004 at 03:20 UTC
I completely disagree. Code generation is a sign that you've found a way to get the computer to do something tedious and repetitive for you, thus saving valuable programmer time and eliminating possible human error. Some examples:

MS decided not to provide any reliable way to get stub error messages for socket errors. You can use FormatMessage() with GetLastError() (like how strerror(errno) works on Unix), but not with Winsock's WSAGetLastError(), for some reason or another. So I went to the page documenting the error codes returned by WSAGetLastError() and found each code had a short description beside it. I wrote a little Perl script that used LWP to fetch that page from the MSDN website, parse the HTML and extract each error code and error description, then generate C code for a lookup table of error strings. Problem solved.

I have a subsystem in a C library of mine that basically just encapsulates a few common structs, providing New() constructor routines, Get() and Set() accessor routines, and Destroy() destructor routines. The documentation for these routines would be quite predictable, so I wrote a Perl script that parsed the header file (which itself was automatically generated from the source file) and generated POD documentation for each routine. Now I only need to add a few additional, routine-specific bits of information here and there and it's done. Problem solved.

I wanted an XSUB interface to some C code of mine. I wrote a code generator that generated the XSUBs for me along with some special functions and CODE: and PPCODE: sections to do some neat stuff way beyond what h2xs (another code generator) is capable of producing. Time saved, problem solved. (xsubpp itself is a code generator, by the way.)

You'll probably say that I shouldn't have had to generate that lookup table, that MS should have made FormatMessage() work with WSAGetLastError(), or that C should be C++ and make encapsulating abstract data types easier, that XS ought to be more flexible, or whatever. The fact is that we don't live in a perfect world with perfect software. Stuff we use oftentimes doesn't work or doesn't work very well, and we as developers find ourselves having to pick up the pieces. Code generation can help with that, doing the repetitive stuff for you and saving some serious time in the process. See The Pragmatic Programmer, page 102, for a discussion of this.
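For readers who want to picture the first example, here is a minimal sketch of a generator that emits a C lookup table. It is not the script described in the post: the LWP fetch and HTML parsing are left out, three well-known Winsock codes are hard-coded in their place, and the names c_string, generate_table, wsa_error, and wsa_errors are all made up for illustration.

```perl
use strict;
use warnings;

# Assumed input: the post scraped this data from the MSDN page; three
# well-known Winsock codes are hard-coded here to keep the sketch
# self-contained.
my @errors = (
    [ 10004, 'WSAEINTR',        'Interrupted function call' ],
    [ 10013, 'WSAEACCES',       'Permission denied' ],
    [ 10061, 'WSAECONNREFUSED', 'Connection refused' ],
);

# Escape the characters that are special inside a C string literal.
sub c_string {
    my ($text) = @_;
    $text =~ s/(["\\])/\\$1/g;
    return qq{"$text"};
}

# Build C source for a static lookup table (struct and array names are
# illustrative, not taken from the post).
sub generate_table {
    my (@rows) = @_;
    my $c = "/* Generated file: change the generator, not this output. */\n"
          . "struct wsa_error { int code; const char *message; };\n\n"
          . "static const struct wsa_error wsa_errors[] = {\n";
    for my $row (sort { $a->[0] <=> $b->[0] } @rows) {
        my ($code, $name, $desc) = @$row;
        $c .= sprintf "    { %d, %s }, /* %s */\n",
                      $code, c_string($desc), $name;
    }
    $c .= "};\n";
    return $c;
}

print generate_table(@errors);
```

Because the table is rebuilt from its input every time, re-running the script after the source page changes is all the maintenance the generated C needs.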
Re^3: (OT) Generated Code vs. Libraries by perrin (Chancellor) on Oct 22, 2004 at 03:47 UTC
I think you're missing the point. The idea is simply that anything you can do via code generation could be done using subroutines instead, provided you are using a high-level dynamic language like Perl.

In addition, your examples are mostly not what I would call code generation. Generating a lookup table from an HTML page is basically data manipulation and could be done as a config file rather than code. Generating documentation is, well, documentation. It's human-readable text, so you can't handle it as a library call the way you could with other code generation situations. I don't know enough about XSUB to comment on your last example, except to say that the rules are different in static languages like C where you really may not be able to do certain things as a subroutine.

The canonical example for this discussion is generating a set of classes for manipulating database tables. Class::DBI does this in perl, and it uses code generation, but it does the generation on the fly at run-time and doesn't produce an intermediary source code that can be hand-edited and get out of sync. The use is still questionable in my opinion, but not as bad as it could be.
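To make "generation on the fly at run-time" concrete, here is a small sketch of accessors installed from a column list when the module loads. It only illustrates the general technique and is not how Class::DBI itself is implemented; the package name My::Record and the column list are hypothetical.

```perl
use strict;
use warnings;

package My::Record;   # hypothetical class name

# The column list is plain data; a real system might read it from the schema.
my @columns = qw(id name email);

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Install one accessor per column at load time. Nothing is written to disk,
# so there is no intermediate source file that could be hand-edited.
for my $column (@columns) {
    no strict 'refs';
    *{ __PACKAGE__ . "::$column" } = sub {
        my $self = shift;
        $self->{$column} = shift if @_;
        return $self->{$column};
    };
}

package main;

my $row = My::Record->new(id => 1, name => 'alice');
$row->email('alice@example.org');
print join(', ', $row->id, $row->name, $row->email), "\n";
# prints "1, alice, alice@example.org"
```

Adding a column means adding a word to @columns; the accessors can never drift from that list because they exist only in memory.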
Re^4: (OT) Generated Code vs. Libraries by William G. Davis (Friar) on Oct 22, 2004 at 04:51 UTC
"I think you're missing the point. The idea is simply that anything you can do via code generation could be done using subroutines instead, provided you are using a high-level dynamic language like Perl."

Well, you condemned code generation in general, so I gave several examples of where I found it appropriate and beneficial. It's true that the common denominator in each example was C, but there are times I've used generators for Perl too.

"Generating a lookup table from an HTML page is basically data manipulation and could be done as a config file rather than code."

Maybe, but then I'd have to write C code to open the file, parse each line, and build up a data structure for the codes and strings dynamically. Pain in the ass, and much, much less efficient than the static array generated by the Perl script.

"Generating documentation is, well, documentation. It's human-readable text, so you can't handle it as a library call the way you could with other code generation situations."

The Perl script generated POD, which is a form of "code," is it not? I understand what you're saying, though. But there are times you need code that's predictable enough to be automatically generated but not predictable enough to be encapsulated away by a few routines. Yes, this is much rarer in a language like Perl, but it still happens.

"I don't know enough about XSUB to comment on your last example, except to say that the rules are different in static languages like C where you really may not be able to do certain things as a subroutine."

Converting linked lists to arrays on return, making sure everything gets garbage collected correctly, and giving each XSUB a shorter, more Perl-ish name were just a few of the problems it solved quickly for me.

"Class::DBI does this in perl, and it uses code generation, but it does the generation on the fly at run-time and doesn't produce an intermediary source code that can be hand-edited and get out of sync. The use is still questionable in my opinion, but not as bad as it could be."

The generated code getting edited and becoming out-of-sync with the generator is only problematic if you don't know what you're doing. Hunt & Thomas described (The Pragmatic Programmer, pp. 103-05) two types of code generators: active and passive. Passive generators generate something once, then the output is taken and used and modified as needed (e.g., the XSUB generator described above). An active code generator can be used again and again as its sources of input change (e.g., the Winsock error lookup table generator described above). Just know which type of generator you're going to write and what you plan on doing with its output, and everything should work fine.
Re^2: (OT) Generated Code vs. Libraries by hv (Prior) on Oct 22, 2004 at 11:03 UTC
I'm with William G. Davis in disagreeing with you on this. Code generation can mean that you've managed to express things in terms of a higher-level abstraction.

I don't see the relevance of your distinction between code generated at runtime and that generated earlier. It would be perfectly reasonable to invent a new language (whether a generic language or something more domain-specific), and have the implementation of the language compile it to perl code. In fact a C compiler does exactly this: it generates an assembler program from the C code.

This also I think answers the one issue I recognise from perrin's comment, on the danger of files getting out of sync if you hand-edit the intermediate results: it is a matter of expectation (I don't expect to edit the assembler source that the C compiler produces), reinforced by infrastructure (e.g. setting the generated files read-only) and protocol ("this is the procedure to change it").

Hugo
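The "infrastructure" point can be sketched in a few lines: have the generator stamp its output with a warning header and mark the file read-only. This is only one possible arrangement, not something described in the thread; write_generated, gen_errors.pl, and Errors.pm are hypothetical names, and the permission bits assume a Unix-like filesystem.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Write generated output with a warning header, then mark it read-only.
sub write_generated {
    my ($path, $generator_name, $body) = @_;
    # A previous run left the file read-only, so make it writable first.
    chmod 0644, $path if -e $path;
    open my $fh, '>', $path or die "cannot write $path: $!";
    print {$fh} "# GENERATED by $generator_name. Do not edit by hand;\n",
                "# change the generator input and re-run it instead.\n",
                $body;
    close $fh or die "cannot close $path: $!";
    chmod 0444, $path or die "cannot chmod $path: $!";
    return $path;
}

my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'Errors.pm');   # hypothetical output
write_generated($file, 'gen_errors.pl', "package Errors;\n1;\n");
write_generated($file, 'gen_errors.pl', "package Errors;\n1;\n");  # re-run
printf "%s mode is %04o\n", $file, (stat $file)[2] & 07777;
```

The read-only bit does not stop a determined editor, but it turns an accidental hand-edit into a deliberate one, which is where the "protocol" half of the argument takes over.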
Re^3: (OT) Generated Code vs. Libraries by perrin (Chancellor) on Oct 22, 2004 at 15:11 UTC
If you don't see the relevance of the distinction between code generated at runtime (sometimes called "active code generation") and that generated earlier ("passive code generation"), I'm guessing you haven't worked on a large project that used passive code generation. The problem is that there will be an emergency fix and someone will edit one of the generated files because it's much easier than trying to understand and change the code generation code. And then you are in big trouble.

Anyway, both of you seem to be implying that I said something like "generating anything from anything else is always bad." That's not what I'm saying at all. Let me try to re-state it more clearly: In most cases where you could use code generation to solve a problem, you could also use a data structure and some subs to solve it. To quote from the wiki link I posted, "Anything you can do by generating code, I can do by calling data driven subroutines." Using subs is better because it is easier to understand (no need to parse two levels of code at once in your head) and avoids the danger of hand-editing.

I'm not claiming that there are no situations at all where code generation is required. A common reason to use active code generation is the performance gain you can sometimes get with it (e.g. templating systems often generate perl code from templates).

Even if you completely disagree, the wiki link I posted is pretty interesting reading, and makes good points on both sides.
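A small sketch of what "a data structure and some subs" might look like in practice: instead of generating one validation routine per form field, the rules live in a hash and a single hand-written sub interprets them. The field names, the rules, and the validate sub are all invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical field rules: plain data instead of one generated sub per field.
my %RULES = (
    age   => { required => 1, pattern => qr/^\d+$/ },
    email => { required => 1, pattern => qr/^[^\@\s]+\@[^\@\s]+$/ },
    nick  => { required => 0, pattern => qr/^\w{1,16}$/ },
);

# One hand-written sub interprets the table for every field.
sub validate {
    my ($input) = @_;
    my @errors;
    for my $field (sort keys %RULES) {
        my $rule  = $RULES{$field};
        my $value = $input->{$field};
        if (!defined $value || $value eq '') {
            push @errors, "$field is required" if $rule->{required};
            next;
        }
        push @errors, "$field is malformed"
            unless $value =~ $rule->{pattern};
    }
    return @errors;
}

print "$_\n" for validate({ age => 'forty', email => 'a@example.org' });
# prints "age is malformed"
```

Adding a field means adding one entry to %RULES, and there is only one level of code to read, which is the readability argument made above.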
Re^4: (OT) Generated Code vs. Libraries by hv (Prior) on Oct 24, 2004 at 10:34 UTC
"I'm guessing you haven't worked on a large project that used passive code generation."

The application I've been working on for the last 4 years has had a gradually increasing amount of its code generated, some of it at install-time and more at run-time, and I expect that amount to continue to increase. The code is currently a little short of 50k lines.

We have not had problems with hand-editing of the generated code. On the rare occasion that an urgent problem has been fixed by hand-editing, the immediate next step has always been to update the source to apply the equivalent fix, test and reinstall.

Hugo