In my view, documentation for an object-oriented system should consist of: 1. Description of the relationships between the classes/class clusters. Here a suitable modeling technique should be used; e.g. Rumbaugh and/or Booch. 2. How to use the class and each of its methods. 3. A description of the algorithm in each method, but normally there will be no need to split the method into more than one scrap. We can meet these requirements without using the literate programming paradigm.

I have used literate programming in a C++ project and really enjoyed the ability to introduce instance variables when they were first used in a method, but the need and reason for these variables should have been discovered and documented in the analysis phase. My need for scraps was, more or less, limited to containing a method in full. In hindsight, the project could have been documented equally well using good commenting practices and suitable modeling techniques only. I doubt that I will use a literate programming tool in future projects involving OO-languages, but I will most certainly document my work to a much larger extent than before I learned of literate programming and realized just how much was missing from my previous documentation.

Just a few comments. Literate programming is not identical with the use of modules (scraps, chunks, ...). In fact these are just a means towards the goal of providing the highest possible level of documentation to the code. They are however very effective in the situation they are intended for, because they achieve two things at once: 1) move code dealing with details that are not discussed in the commentary of the current section outside of its view, so that the connection between the commentary and the (visible) code is more easily perceived; 2) provide a short abstract of the omitted code through its module name. The second point should not be overlooked; sometimes it is so effective that some sections need no commentary, because the module name already says it all (just like some mail messages need no body after stating their subject line). It is also a point that will be missed if one just uses function calls instead of module abstractions as a means of partitioning the code into small pieces; it is not likely that one will use function names with some 100 characters or so, even if the language allows it and the task of the function cannot be described well with fewer characters.

It seems to me that the distinguishing feature of the object-oriented paradigm might well lie not in the methodology itself, but rather in the class of problems that it can effectively be applied to. From my experience it seems that there are some problem areas, in particular event driven programming such as occurs in programs with an elaborate graphical user interface, where one is led almost automatically into an object-oriented approach. Here most effort is directed towards designing a consistent set of data types, and maintaining the necessary invariants of the data, and a proper correspondence with the visual representation presented to the user; this leads to a plethora of small functions and a natural connection between data types and their methods. However, I have never seen an example of a task of any algorithmic complexity (in the every-day sense of the word complexity) that could be handled effectively by object-oriented techniques; I have really no idea how one could apply these techniques to make programs such as TeX more transparent, or (for a less monolithic example) programs like those in the Stanford GraphBase.

My own experience, which ranges from writing mathematical software (computer algebra) and parts of a compiler to some work on a program with an extensive user interface, suggests that the utility of literate programming techniques increases strongly with algorithmic complexity: for some small functions the code is so obvious that there is nothing to make a sensible comment about, while at the other extreme I have had cases where a full A4 page of commentary was attached to a single assignment statement (it dealt mainly with explaining why nothing more than that was needed to handle all conceivable cases). But regardless of the algorithmic complexity, every large project will need an introduction in pure commentary that explains the task that is being performed, and the global structure that underlies the design; such a description is much more readable on typeset pages than as a huge block comment in the code.

I don't understand what you mean by a modeling technique in point 1. From point 3 it seems to me however that the software you are considering has a very limited algorithmic complexity.

I have been doing quite a lot of work on a system in Dylan lately. (For those who don't know about Dylan, you can think of it as a very fast CLOS with algebraic syntax and optional strong typing.) I don't think that object-orientation does anything to reduce the value of the literate programming approach. In my use of object-oriented languages, I have found that programmers make far too many assumptions about how obvious things will be. Each little tiny element of code makes sense on its own, but its place in the complete system is far more difficult to understand. So what I end up doing in object-oriented literate programming is to provide a lot of documentation about the system, and less about the components. I use a hierarchical structure that matches the basic structure of Dylan programs.

In Dylan, you build code in libraries. Each library consists of a collection of modules, and a declaration that imports modules from other libraries, and exports modules to other libraries. Each module is a private namespace, with imports of identifiers from different modules, and exports of identifiers to different modules. The modules consist of declarations of classes, methods, and variables. So I will start on the top, and provide a description of the design of the whole system that I am building, and how it's divided into libraries. If I have got a system diagram that makes sense at this level, I include it.

Then I provide the library declarations, with documentation about what services the library provides to the system, and how it's divided into modules. (The library declaration provides a root chunk for one code file.) Each library is a chapter in the overall document. Then I write a section for each module. The module descriptions are less ordered and structured than the high levels, depending on what order of presentation will make the most sense to a reader. Then there are the chunks describing the different classes and methods. The methods are not usually broken into more than one chunk, except in the relatively rare cases that they're very complicated.

You can do that without using literate programming. But similarly, I can write a program in C which is well structured and documented without using literate programming. The trick is that it's a lot harder to keep it well documented when the code and the documentation are separate, especially when someone changes something in a hurry and doesn't alter the documentation to match. That's a lot more likely to happen when the documentation is separate.

Marc van Leeuwen writes: Literate programming is not identical with the use of modules (scraps, chunks, ...). In fact these are just a means towards the goal of providing the highest possible level of documentation to the code. They are however very effective in the situation they are intended for, because they achieve two things at once: 1) move code dealing with details that are not discussed in the commentary of the current section outside of its view, so that the connection between the commentary and the (visible) code is more easily perceived; 2) provide a short abstract of the omitted code through its module name. The second point should not be overlooked; sometimes it is so effective that some sections need no commentary, because the module name already says it all (just like some mail messages need no body after stating their subject line). It is also a point that will be missed if one just uses function calls instead of module abstractions as a means of partitioning the code into small pieces; it is not likely that one will use function names with some 100 characters or so, even if the language allows it and the task of the function cannot be described well with fewer characters.

In my view, one should not make functions just to partition other functions into smaller pieces. Use a good editor (emacs springs to mind) or, better still, use the literate programming paradigm.

Just adding: When you need the application to behave in a dynamic manner (e.g. load and execute different code modules on demand), OOP also makes it a lot easier.

However, I have never seen an example of a task of any algorithmic complexity (in the every-day sense of the word complexity) that could be handled effectively by object-oriented techniques;

Modern DTP programs? FEM (Finite Element Method) programs applied to soil mechanics? (many different types of soil with different characteristics)

I have really no idea how one could apply these techniques to make programs such as TeX more transparent, or (for a less monolithic example) programs like those in the Stanford GraphBase. My own experience, which ranges from writing mathematical software (computer algebra) and parts of a compiler to some work on a program with an extensive user interface, suggests that the utility of literate programming techniques increases strongly with algorithmic complexity: [...] But regardless of the algorithmic complexity, every large project will need an introduction in pure commentary that explains the task that is being performed, and the global structure that underlies the design; such a description is much more readable on typeset pages than as a huge block comment in the code.

Agreed! I too prefer typesetting large portions of text so I wouldn't put introductions etc. into a huge comment block.

I don't understand what you mean by a modeling technique in point 1.

The Rumbaugh and Booch modeling techniques use boxes etc. and text to display/reveal the relations between different parts of a system; just like state diagrams, ... The interesting thing about these techniques is that they make use of several different kinds of diagrams and also encourage the use of several diagrams of the same classes/objects/whatever at different levels of abstraction.

From point 3 it seems to me however that the software you are considering has a very limited algorithmic complexity.

You're correct that I thought of small methods (~ scrap size), but just because one uses many small methods/functions (as promoted by OOP) with limited complexity, it doesn't follow that the system in full is of limited complexity.

Absolutely. I find that literate programming is extremely useful for object-oriented work. Regardless of paradigm, most experienced programmers (at least those I have worked with) tend to write programs in little pieces and then put the pieces together to make the compiler happy (which is what the tangling tools do automatically). OOP reduces the amount of juggling you have to do by using objects for encapsulation, but there are still (and will always be) forms in the language that cater to the compiler at the expense of the way in which humans think. The main purpose of literate programming, as I see it, is to enable the programmer to write a program for a human reader, yet still enable a compiler to grab hold of it. Aside from writing programs in natural language (and I don't want to get into that sidebar), the best way seems to be the literate programming chunk technique. As Dijkstra says in "Structured Programming", "I want a program written down as I can understand it. I want it written down as I would like to explain it to someone." That's precisely what literate programming tools provide.
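
To make the tangling step concrete, here is a minimal sketch in Python. It is not the implementation of any real literate programming tool: the `<<name>>` reference syntax, the `tangle` function, the root chunk name `*`, and the sample chunks are all assumptions made for illustration. The chunks are stored in the order a human might explain them, and the tangler puts them into the order the compiler needs.

```python
import re

# Hypothetical chunk table: chunk name -> lines of code. The chunks are
# listed in "explanation order", not in the order the compiler needs.
CHUNKS = {
    "*": [
        "def mean(values):",
        "    <<reject empty input>>",
        "    <<return the sum divided by the count>>",
    ],
    "return the sum divided by the count": [
        "return sum(values) / len(values)",
    ],
    "reject empty input": [
        "if not values:",
        "    raise ValueError('mean of empty sequence')",
    ],
}

# A reference is a line consisting only of <<chunk name>>, possibly indented.
REFERENCE = re.compile(r"^(\s*)<<(.+)>>\s*$")


def tangle(name, chunks, indent=""):
    """Expand chunk `name` recursively into a flat list of source lines."""
    output = []
    for line in chunks[name]:
        match = REFERENCE.match(line)
        if match:
            # Splice in the named chunk, keeping the indentation at
            # which the reference appeared.
            output.extend(tangle(match.group(2), chunks, indent + match.group(1)))
        else:
            output.append(indent + line)
    return output


program = "\n".join(tangle("*", CHUNKS))
```

The sketch does no error handling: an undefined chunk name raises `KeyError`, and a chunk that refers to itself recurses until Python gives up. Note how the chunk names read as a short abstract of the code they stand for, which is the point made earlier in the thread.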

For example, when I develop a class, I always lay out the outline (boilerplate) first:

1. Description of the relationships between the classes/class clusters. Here a suitable modeling technique should be used; e.g. Rumbaugh and/or Booch.

2. How to use the class and each of its methods.

This is user documentation rather than programmer documentation. On the other hand, I have had quite good results with including manual pages as a detachable part of the web (Norman Ramsey does it even better than I do), so the user docs and the program stay in sync. Again, literate programming is helpful.

3. A description of the algorithm in each method, but normally there will be no need to split the method into more than one scrap.

I disagree here. When I am working with an algorithm I write myself, I find that I now use chunk names in place of pseudo code, and the chunk definitions in place of refinements. Any nontrivial algorithm will require a not insignificant amount of stepwise refinement. In addition, the text chunks allow me to reason explicitly about my algorithm and prove (at least informally) that the algorithm is correct. I can then demonstrate that the code is correct by showing that it matches the text description. On the other hand, when I already have an algorithm (from a book, perhaps), I find that literate programming allows me to translate that algorithm almost directly (and therefore correctly). I have an example program that uses Knuth's topological sort algorithm in just this way, but I don't have it readily available.
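
As an illustration of chunk names standing in for pseudocode, here is a sketch in Python of a topological sort that repeatedly outputs items with no remaining predecessors. It is not the example program mentioned above and not a transcription of the algorithm from Knuth's book; the function name and the input format (a list of `(before, after)` pairs) are assumptions. Each `<<...>>` comment plays the role a chunk name would play in a web, with the code beneath it as the refinement.

```python
from collections import deque


def topological_sort(items, relations):
    """Return `items` in an order consistent with `relations`.

    `relations` is a list of (before, after) pairs; this input format
    is an assumption made for the sketch.
    """
    # <<count the predecessors of each item and record its successors>>
    predecessor_count = {item: 0 for item in items}
    successors = {item: [] for item in items}
    for before, after in relations:
        successors[before].append(after)
        predecessor_count[after] += 1

    # <<queue every item that has no predecessors>>
    ready = deque(item for item in items if predecessor_count[item] == 0)

    # <<repeatedly output a ready item and release its successors>>
    ordered = []
    while ready:
        item = ready.popleft()
        ordered.append(item)
        for successor in successors[item]:
            predecessor_count[successor] -= 1
            if predecessor_count[successor] == 0:
                ready.append(successor)

    # <<report a cycle if some items were never output>>
    if len(ordered) != len(items):
        raise ValueError("relations contain a cycle")
    return ordered
```

Read top to bottom, the four comments alone give the informal argument: every item is output only after its predecessor count has dropped to zero, so no item can precede something it depends on.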

We can meet these requirements without using the literate programming paradigm.

True, but we can do any programming without literate programming techniques. Literate programming simply enhances the readability and maintainability of our code, regardless of which paradigm we use.

I disagree with this paragraph entirely. If programmers were perfect, your statement about the analysis phase would probably be correct, but we're not, and we can't always foresee the future with perfect clarity (even with respect to program development). Similarly, I think there is a significant difference between standard commenting practices (however good) and literate programs. A program has 2 audiences, the human reader and the compiler, who have different needs. Standard comments make a compromise that loses the ability to use exceptionally human-related explanations (such as charts, graphs, indexes, etc.) in order to allow the documentation and code to coexist in the same file. Literate programming makes that compromise unnecessary.

I doubt that I will use a literate programming tool in future projects involving OO-languages, but I will most certainly document my work to a much larger extent than before I learned of literate programming and realized just how much was missing from my previous documentation.

I am sorry to hear that, but of course, each of us must work as he (or she) thinks best. I suspect, though, that you will find that, even with the best standard documentation you can do, you will still miss literate programming.

I think the view of many is that literate and OO programming are basically orthogonal -- both have benefits and can be usefully combined. However, if one is to adopt the view of any of several OO design methodologists such as Grady Booch or Jim Rumbaugh, there are aspects of OO design which have no direct implementation in any existing language: concepts such as Booch's class category. Literate programming offers a method of implementing some of these design abstractions (such as the class category) directly -- even without regard to the underlying OO language. I therefore feel that there exists a synergy between the two techniques that is greatly under-utilized.

Hear, hear! I have felt this very strongly. The religious fervor of the OO true believers seems utterly misplaced to me: there are only a few new ideas there, and (as Marc said earlier in his article) they only really shine in a few problem areas, ones which are uninteresting to me. Of course, they do harmonize with the current belief that computers primarily draw pictures.

From:	Jacob Nielsen
Date:	14 May 1995

From:	Marc van Leeuwen
Date:	15 May 1995

From:	Mark Chu-Carroll
Date:	16 May 1995

From:	Jacob Nielsen
Date:	16 May 1995

From:	Lee Wittenberg
Date:	16 May 1995

Is literate programming useful for object-oriented programming?