Literate programming, why not routines instead?

From:	Edward Keith
Date:	04 Jan 1993

Why should I take the time to learn CWEB? What is the advantage of literate programming over extensive commenting and good design in a traditional language? I do most of my work in C. I write short functions, each with a descriptive header explaining what it does and how it does it. I comment each of the parameters and all variables when they are declared. I then run the code through a code formatter and cross reference generator. I still find the code hard to read three months later.

I read the article in the Jan. 1993 Computer Language, and have been following this list for several weeks. I find the listings in CL and posted here even harder to read than most C code. (This is probably because I have been reading C for eight years, and saw CWEB for the first time last month.) In the article Silvio Levy says, "The gain in clarity obtained by using CWEB over C should now be obvious." Maybe I am a little slow. Could someone please explain it to me?

From:	Marty Leisner
Date:	06 Jan 1993

I am not sure about the gain in clarity. I have no problem reading quality C code months after the subject. I generally don't use a formatter. I just follow formatting rules. You really aren't supposed to look at the webbed code (from what I have seen...you look at the formatted comments, and the computer looks at the code). I too am unsure whether it's worth the time to learn and whether it improves readability (although I think texinfo is a "good" thing -- having the documentation on line and printed).

From:	Lee Wittenberg
Date:	07 Jan 1993

Why should I take the time to learn CWEB? What is the advantage of literate programming over extensive commenting and good design in a traditional language?

I can't speak for everyone, but there are 2 reasons why I switched to literate programming from reasonably well commented, fairly clean code.

1. Maintenance. Not only can I explain why I am doing something while I am writing the code (these kinds of explanations are not only awkward in normal programs, but often get in the way and overshadow the code), but I can also make little notes to myself about things that need to be changed, where an algorithm came from, etc. (and put pointers to these notes in the index!). The index of identifiers is invaluable in trying to figure out someone else's code. In fact, my "Road to Damascus" came about when I was trying to get CWEB working on my PC. There was an obscure bug involving pointer arithmetic in CWEAVE. I spent a little over a week (during off-time at an ACM conference) poring over the woven listing, and found the bug. I would have given up on a non-literate program of similar size (and have on several).

2. Structure. Since TANGLE puts all the code sections in proper order, I can concentrate on writing the program in the order that I feel is best for purposes of exposition. I can write the program for human beings, rather than for the compiler (I think this is the most important point of all). I can also modularize code without having to write a procedure (with its attendant overhead) by using a named section. (Perhaps this is why Knuth used the term "module" in early versions.)
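
The reordering described in point 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how CWEB's TANGLE is implemented: it assumes a simplified noweb-like notation in which a line of the form <<name>> refers to another named section, and the names CHUNKS, REF, tangle and the section titles are invented for the example. The sections are stored in the order an author might explain them; the expansion pastes each one inline where it is referenced, so no procedure call is introduced.

```python
import re

# Hypothetical sections, listed in the order chosen for exposition.
# The root section "*" comes last, although its code must come first.
CHUNKS = {
    "validate input": "if n < 0:\n    raise ValueError('n must be non-negative')",
    "accumulate the sum": "total = 0\nfor i in range(n + 1):\n    total += i",
    "*": "def triangle(n):\n    <<validate input>>\n    <<accumulate the sum>>\n    return total",
}

# A reference is a line holding only <<section name>>, possibly indented.
REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")

def tangle(name, chunks, indent=""):
    """Return the lines of section `name` with every reference expanded inline."""
    lines = []
    for line in chunks[name].split("\n"):
        match = REF.match(line)
        if match:
            # Expand the referenced section at the indentation of the reference.
            lines.extend(tangle(match.group(2), chunks, indent + match.group(1)))
        else:
            lines.append(indent + line)
    return lines

source = "\n".join(tangle("*", CHUNKS))
print(source)

# The tangled text is ordinary Python, so it can be executed directly.
namespace = {}
exec(source, namespace)
print(namespace["triangle"](4))  # prints 10
```

The tangled output contains the two sections as plain statements inside triangle, which is the sense in which a named section modularizes code without the overhead of a call.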

In any event, I won't do without my literate programming tools. I have recently had to do some programming in PAL (Paradox Application Language). When I discovered that Spidery WEB wasn't capable of dealing with PAL (through no fault of its own -- PAL is fairly insane), I downloaded noweb, adapted it to work under DOS (a rather painful process), and now use it for all my PAL work. Even the programmers in the office who do not use noweb have no problems reading the woven listings. Hope this goes a bit toward answering your question. The best advice I can give is: "Try it, you'll like it".

From:	Mike Yoder
Date:	08 Jan 1993

My experience with literate programming has given me a different slant on the issue than most people have; I think the principal benefit is not so much that you get good documentation (you don't always) but that you get correct documentation. I don't mean user manuals, which are generally correct but irritatingly ambiguous due to the nature of natural languages: I am referring to documents that purport to describe what a program's data structures and algorithms are.

Without literate programming, my experience says that for programs longer than, say, ten pages of code, the odds that such a document will be correct are zero. I am not being ironic or exaggerating: I have literally never seen documentation in such cases that was useful. In most cases, either the documentation was written before the program was (and never updated), or it was written just afterward and a very incomplete job was done. I particularly remember one document describing a compiler IL that had quite correct detail about the tree portion of the IL, but was sketchy on the leaf nodes and had virtually nothing on the symbol table. It isn't all that helpful to be told that the tree node representing binary plus has two sons; you could have surmised that. But this was the easiest part of the documentation to write, and the writer was probably under heavy time pressure and trying to get as much down as possible.

It is possible, of course, to work with the program directly, but this changes maintenance from a science to an art. This is not intrinsically bad, unless human lives are involved. But in any large program where no "big picture" exists, fixing a bug consists of finding a likely-looking spot and changing it to what feels right in the hopes that it is right. If it doesn't work, you repeat the process. It would be better to know that such-and-such a routine is supposed to deal with all comments, or macro expansions, or whatever, because this can reduce the amount of code you must examine by an order of magnitude or more.

Documents get out of date because they are separated from the source, and so producing them in addition to the program becomes a two-pass process. Besides this, the documentation need not have any obvious 1-1 relationship to the program; so it may be a nontrivial task just to determine what parts of the documentation need to change after the program is modified. No wonder that most programmers take the easy way out and put off fixing the documentation forever.

Why literate programming fixes this is partly obvious and partly due to psychological effects I do not claim to understand. It should be clear, though, that when the documentation you must change is at most an editor screen or so away from your program text, there is much less of a psychological barrier to your changing it at the same time as the program. There is one other obvious reason that literate programming helps: it causes the documentation and the program to be written at the same time. Once the program works, very few managers or programmers are all that keen on spending several weeks producing quality documentation; there are always other ways to spend this time that look more attractive--such as doing firefighting on the project that's behind.

From:	Eric van Ammers
Date:	15 Jan 1993

I have been practicing literate programming for a long time and I feel very happy with it. But very often when I try to explain what literate programming is about, I get confused if people ask me what the advantages of literate programming are compared to working with small independent well documented routines. My experience teaches me that there is a big difference indeed, but so far I have not been able to make it explicit. Note that Knuth in his 1984 paper in The Computer Journal also avoided this point. My request to you, LITPROG netters, is for your opinions and suggestions on the problem above.

From:	Lee Wittenberg
Date:	20 Jan 1993

The FWEB User's manual has a nice discussion of this issue (section 4.11 in the version 1.23 manual). If you can't get hold of it, I am sure John Krommes would give permission to quote it here. Knuth does address the issue somewhat (I recall) in one of the papers in his new Literate Programming book. I am not sure which one it was. Does anyone out there know the reference?

From:	David Kastrup
Date:	20 Jan 1993

Often you cannot really avoid lengthy routines without having to formulate formal parameters, calling conventions etc. In the WEB approach (not in all literate programs, of course), documentation can include readable mathematical formulas, which I consider a boon. The problem is that small, well documented procedures do the job as well. However, you have to formulate calling parameters and other conventions. Not only do these complicate comprehension slightly; you will simply not find any programmer intent on serious work doing that.

The advantage of literate programming over small, well documented procedures is simply psychological: splitting a more complicated thing into sections is easy and done on the fly, whereas splitting a procedure into distinct procedures is a pain in the neck and needs additional consideration, re-editing and restructuring. So it simply isn't done. Suppose you are given two versions of a program, both developed in haste and with only a small amount of documentation beside the code: one literate, one written in the normal procedural way. Chances are that an outsider will, after a reasonable investment of time, understand much more of the literate version than of the normal program. That is because the structure of the program is more obvious, although not necessarily by being split into disjoint procedures.

From:	Mike Yoder
Date:	20 Jan 1993

The reason you are "confused" is that the question is somewhat like being asked, "Are you still beating your wife?" The problem is the presupposition that is behind the question. There is no such thing as "small independent well documented routines" for any program of a significant size. Many people think they write them, but if you try the only empirical test that matters--namely, seeing what happens when someone else tries to use these routines when the author is gone--you will almost certainly find it works badly. Now, if you confront the author with this fact, the last thing that will happen is that he or she will say "Oh, drat. I guess they weren't well enough documented after all." What they will instead say is "that person was just too dumb to understand my code."

Please don't take this example too literally; I am trying to get my point across in one try, and I need vivid imagery. Unfortunately, I suspect this line of argument will be unconvincing, and you will have to find another one. There will probably be responses saying they have seen examples where my claim isn't true, but I am going to disqualify a whole slew of them right off the top. I, too, have had cases where the approach seemed to work when the original author was available for consultation. But this is not documentation; it is documentation plus folklore, and the folklore is usually critical. As far as I am concerned, documentation is not adequate unless it would suffice if the original author fell under a bus and became permanently unavailable to the new programmer. This is rare, and literate programming does not guarantee it, but it makes it much more likely.

I also realize that it is possible to get by without really understanding the code; this approach "works" in roughly the same sense that Communism "worked" in the U.S.S.R. up to the point it collapsed. Good luck with your discussions.

From:	Glyn Normington
Date:	21 Jan 1993

Literate programming lets you structure your program into smaller chunks without the run-time overhead of a subroutine or the effort of writing a macro. It also has the advantage that a literate program is more than a collection of program fragments, as there may be high-level design documentation included which would not fit nicely into a conventional program. The literate programming tools I use allow multiple programs and other files to be generated from a single literate program (which may itself be split into multiple files using an imbed mechanism). This enables better grouping of programs and data which form abstract data types, which our base programming language does not support.
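
Generating several files from one literate source can be sketched as follows. This is a hypothetical illustration in Python, not the tool described above, whose notation is not given in the message: it assumes a noweb-like convention in which a line of the form <<name>> refers to another section, and the names SECTIONS, REF, expand, tangle_all and the stack example are all invented. Two of the sections are roots, and each root becomes one output file, so the declaration and the implementation of a small abstract data type can be kept and explained together.

```python
import re

# A reference is a line holding only <<section name>>, possibly indented.
REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")

# Hypothetical literate source: all sections live in one mapping, and
# two of them ("stack.h" and "stack.c") are roots that become files.
SECTIONS = {
    "stack.h": "<<stack type>>\nvoid push(Stack *s, int v);",
    "stack.c": '#include "stack.h"\n<<push definition>>',
    "stack type": "typedef struct { int items[16]; int top; } Stack;",
    "push definition": "void push(Stack *s, int v)\n{\n    s->items[s->top++] = v;\n}",
}

def expand(name, sections, indent=""):
    """Return the lines of section `name` with all references expanded inline."""
    lines = []
    for line in sections[name].split("\n"):
        match = REF.match(line)
        if match:
            lines.extend(expand(match.group(2), sections, indent + match.group(1)))
        else:
            lines.append(indent + line)
    return lines

def tangle_all(roots, sections):
    """Map each root name to the text of the file it would generate."""
    return {root: "\n".join(expand(root, sections)) + "\n" for root in roots}

files = tangle_all(["stack.h", "stack.c"], SECTIONS)
for filename, text in files.items():
    print("---", filename)
    print(text, end="")
```

Writing each entry of files to disk under its key would yield the header and the implementation from the single source.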

From:	Marcus Speh
Date:	21 Jan 1993

In my tirade, Philonous (Ph) is a friend and user of the WEB environment. He is arguing with Malevolent (Ma) who's finally going to join the literate programming family. Eventually he'll pick up a better name for himself. [Malevolent still has a hard time believing that CWEB++ is the "only True Web", and he will start and stay with FWEB.] This Socratic dialogue could actually have happened like this.

They start as suggested by Eric van Ammers:

Ma: "What are the advantages of literate programming compared to working with small independent well documented routines?"

Ph: "One can still work with small, independent routines. They're just better documented now."

Ma: "I can do well without TeX for documentation."

Malevolent obviously does not believe in DEK. You wonder who's paying him. Philonous does not really know how to answer that. He probably does not like troff. Or maybe Malevolent has got an eye problem?

Ma: "In fact, I hate to spend too much time thinking about how to explain things to others when I haven't even finished the program."

Ph: "Before I saw WEB, I wasted lots of time finding the right balance between doc and code. With WEB, it becomes easier to write doc along with the code. And update it."

I have experience only with FWEB, and I haven't been using it for more than one year. Before that, quite a lot of my time went into trying to improve on the delicate balance between documentation and code. None of my private efforts were really satisfying, though. Maybe also because (like many people outside of CS) I never really learnt how to program. Malevolent knows much more about programming, but he's got other things to do as well:

Ma: "But isn't this a hell of a lot of extra effort?"

Ph: "When I saw FWEB, I wasn't even put off because of the extra effort in learning something new. Though I must confess that I asked people in my field of research how long they had needed to get accustomed to the new tool."

Since then, I freely give away the magic number of "10 days"--- if you know [La]TeX and the language(s) you want to code in.

Ma: "Ok, Ok, Ok. Now, if you compare how much time it costs you and how much you gain?"

(Malevolent is a tough calculator, it seems. He obviously got the message of the zeitgeist.)

Ph: "I cannot speak for you. Literate programming is also a matter of taste. It is a useful tool for me. It definitely increased my level of reflection upon what I was doing. It saves me time because the programs mostly do run in the first place - it costs me time because I now like to treat many otherwise negligible pieces of code like little diamonds - and cannot be sure that this will pay off beyond aesthetics."

Ma: "Of course I have heard about WEB. But I do not know anyone who is practicing it, really."

(Later, Philonous will tell him about the literate programming list.)

Ph: "True. The `evangelization' part sometimes is the most painful. There's no company working for WEB's success. No commercials placed. Thus, it definitely costs time because I am trying to convince my colleagues that they should try *WEB, too. But I am a born missionary anyway and so this meets my needs as well."

(He did not really have to emphasize the last point...)

The time for our key-hole listening is running out. It suffices to say that the two have a lot more to discuss. At the end, Malevolent (overloaded with manuals, introductory texts, FAQs, eager to try WEB) wants some advice on how to evangelize others:

Ma: "Assuming you meet someone who's more benevolent than I am-- how do you proceed?"

Ph: "Upon meeting someone who seems to suffer likewise, and who signals a genuine interest in learning something new, I first show him a HUGE woven output [yes, I am carrying such a volume around mainly for that purpose]. Before putting the word "WEB" into my mouth I want to hear a SIGH when he is confronted with something which looks unlike anything he has seen yet. Even better if I have presented some more or less complicated program in a talk before; then people are lost and WEAVE's output comes in handy to explain: it has got tables, it has got plots, maybe, an index, a table of contents. Fine. Then the victim usually asks: `why did you put in all the extra work?'"

Ma: "That almost sounds like me, before I had seen the light!"

Ph: "Yes, that is the moment of truth indeed." (Timothy Murphy would much better know how to put it, I am sure)

Ph: "I start explaining some things for real [forcing the victim to return to the beautiful output at regular intervals determined by the amount of healthy skepticism he's mobilizing to shield himself]. Eventually I show a not-too-complicated .web file. And I give him the speech which I gave you already, my friend. Of course: If I have a FWEB-FAQ (*) output at hand, I will pass it to him, too."

You have to judge whether this may happen with your colleagues in the same way. Mine are definitely special in that many of them are used to everything coming to them pre-digested. If that is not the case, they'd rather cut back on their needs: for fine documentation, for well-structured code etc. Probably this will not hold for the majority of literate programming's readers.

From:	Michael Koopman
Date:	21 Jan 1993

I am highly underqualified to respond to this question; therefore, I feel it is my responsibility to broadcast my naivety by the widest distribution channel to which I have access, namely, the net. I admit only limited "book knowledge" of literate programming, including information from texts, journals, magazines and coffee houses. Perhaps others, like Eric van Ammers, who have first-hand, experiential knowledge of literate programming can judge the validity of the following benefits which I have presumed.

First, and foremost, literate programming provides associativity or "links" between the comments and the coding. This seems obvious in the name "Web" and a plausible influence on the name choice. This associativity knowledge is used, primitively, by literate programming compilers as I know them. That is, the links are used strictly as handles to the associated information. However, this associativity allows for "meta-compiler" activities not easily supported by well-documented code which is not literate programming. The meta-compiler activities could include such actions as automated commonality detection leading to abstraction via machine reasoning. With limited natural language processing of the comments, in conjunction with the associations identified by the code linkage, elements of the code such as contexts and intention may be derived. A meta-compiler which interprets software with such abstract knowledge makes possible software engineering methods I can not even imagine at this time. Advanced compilers could be developed which perform abstract knowledge interpretation of well-documented modules, but literate programming should make such activities easier. Qualified comments about intentional programming are requested, I merely prattle.

It also seems literate programming can help to bridge the gap between languages. Since it is unlikely that one code paradigm can offer the "right choice" for all programs, literate programming should help in designing and maintaining large programs. Such programs are often composed of large subprogram modules in different languages. This requires a literate programming system which accepts more than one compiled language code component, e.g., C, C++, Pascal and Smalltalk.

From:	Zdenek Wagner
Date:	21 Jan 1993

At the beginning I would extend the postulate about non-existence of small well documented procedures. From my own experience I know that my own small well documented procedures do not work when transferred into another program half a year later. However, I can see how procedures can live together with literate programming. I am now webifying my old C++ programs. During past non-literate times I developed a bunch of general procedures and pure virtual classes which I put into private libraries in order to save compilation and linkage time. My intention for the future is to write such procedures and classes in web, compile them separately and place them into libraries. In this way I would take advantage of both literate programming and independent procedures and moreover I will save disk space since good web files tend to be long.