The room fills with cold, conditioned air; outside the heat hazes, fil- tered through greened glass windows: a new building hardly first- populated. The speaker is wild-eyed, explaining new ideas like a Bible thumper. His hair is a flat-top; his mouth frowns in near gri- mace. He strides to my seat, looks down and says in a Texas drawl, “and the key is simply this: Abstractions. New and better abstrac- tions. With them we can solve all our programming problems.”
(Gabriel and Steele 1990)
This scene, which occurred in the late 1980s, began my off-and-on inquiry into the good and bad points of abstraction. All through my computer science educa- tion, abstraction was held up as the bright, shining monument of computer sci- ence. Yet when discussed in essays on the philosophy of science, abstraction is routinely questioned. Here is what Paul Feyerabend says about Ernst Mach, scien- tist and philosopher:
We have seen that abstraction, according to Mach, “plays an impor- tant role in the discovery of knowledge.” Abstraction seems to be a negative procedure: real physical properties . . . are omitted. Abstrac- tion, as interpreted by Mach, is therefore “a bold intellectual move.” It can misfire, it “is justified by success.” (Feyerabend 1987)
Throughout the book abstraction is discussed in terms of its relation to reality and to the “interests” of those who develop them. Here is what Bas Van Fraassen says:
But in certain cases, no abstraction is possible without losing the very thing we wish to study. . . .Thus [study at a certain level] can only go
so far—then it must give way to less thorough abstraction (that is, a less shallow level of analysis). (Van Fraassen 1980)
Both Van Fraassen and Feyerabend seem to subscribe to the notion that abstrac- tion in the realm of science—I doubt they would include computer science—is about ignorance. Abstraction ignores or omits certain things and operates at a shallow level of analysis.
The following is the definition of abstraction in computer science that I’ve used for years:
Abstraction in programming is the process of identifying common patterns that have systematic variations; an abstraction represents the common pattern and provides a means for specifying which vari- ation to use.
An abstraction facilitates separation of concerns: The implemen- tor of an abstraction can ignore the exact uses or instances of the abstraction, and the user of the abstraction can forget the details of the implementation of the abstraction, so long as the implementation fulfills its intention or specification. (Balzer et al. 1989)
This definition does not directly mention ignorance or omission, though it does imply them. The common pattern is omitted—in most types of abstraction the common pattern is replaced by a name, such as function name, a macro name, or a class name. The structure and behavior of the common pattern is lost except that the name of the abstraction denotes it and the definition of the abstraction (its implementation) contains it. And the foregoing definition does mention the interests of particular people—abstraction implementers and abstraction users.
Although I will return to the themes brought out by Feyerabend and Van Fraassen later, first I will set the scene. In the first two essays in this book I looked at three new concepts for programming: compression, habitability, and piecemeal growth, defined as follows:
Compression is the characteristic of a piece of text that the meaning of any part of it is “larger” than that particular piece has by itself. This characteristic is created by a rich context, with each part of the text drawing on that context—each word draws part of its meaning from its surroundings.
Habitability is the characteristic of source code that enables pro- grammers, coders, bug-fixers, and people coming to the code later in its life to understand its construction and intentions, and to change it comfortably and confidently.
Piecemeal growth is the process of design and implementation in which software is embellished, modified, reduced, enlarged, and improved through a process of repair rather than of replacement.
Habitability can be limited by various things; abstraction being one of them. Many of my comments about abstraction also apply to encapsulation, which is a strong variety of abstraction. But don’t panic: Abstraction and encapsulation are good things; my point is that they should be used only in moderation.
One bad effect of abstraction on programs occurs when it is taken too far. This results in code that is structured like a big pyramid, in which one level of abstrac- tions is built tightly on another with minimal interfacial glue. Wait, that’s not right—abstractionists would have you believe that abstraction guarantees you’ll build geometrically precise programs, but that’s garbage. Even in single inherit- ance object systems (in which hierarchies look like pyramids), there are other use relationships besides inheritance. Let’s try this paragraph again.
One bad effect of abstraction on programs occurs when abstraction is taken too far. This results in code that is structured like a big web, in which groups of abstractions are tightly interwoven with others by means of interfacial glue.
In such large tangles lies one of the problems. If one abstraction is used in many places and that abstraction’s interface is wrong, then repairing it forces repair of all its uses. The more the abstraction is shared, the more repair will be needed. But such repair is dangerous because it is usually made with less delibera- tion than the original design and in the context of unexpected requirements. If the repairs are being made by people other than the original designers and much later than the design and original implementation, the likelihood of mistake or ugliness will be increased.
Such webs are examples of compression: The meaning of an expression written in the context of the web is determined by the contents of the entire web. If you need to change another part of the web, your compressed expression might change its meaning, what it does, or even whether it works at all. So when we build our tight inheritance hierarchy in object-oriented fashion—weblike or pyr- amid style—we might be falling into this trap.
The reason for this failure is the insistence on using abstractions throughout, at all levels. If, instead, there were a level at which the code became direct rather than indirect, this might be less likely to happen. The problem is that people are taught to value abstraction above all else, and object-oriented languages and their philosophy of use emphasizes reuse (compression), which is generally good. However, sometimes the passion for abstraction is so strong that it is used inap- propriately—it is forced in the same way as it is with larger, more complex, and typically ad hoc abstractions.
Abstractions must be carefully and expertly designed, especially when reuse or compression is intended. However, because abstractions are designed in a partic-
ular context and for a particular purpose, it is hard to design them while antici- pating all purposes and forgetting all purposes, which is the hallmark of the well- designed abstractions.
This implies that abstractions are best designed by experts. Worse, average programmers are not well-equipped to design abstractions that have universal usage, even though the programming languages used by average programmers and the programming language texts and courses average programmers read and attend to learn their trade emphasize the importance of doing exactly that. Although the designers of the programming language and the authors of texts and course instructors can probably design abstractions well, the intended audi- ence of the language—average programmers—cannot and are therefore left out. That is, languages that encourage abstraction lead to less habitable software, because its expected inhabitants—average programmers working on code years after the original designers have disappeared—are not easily able to grasp, mod- ify, and grow the abstraction-laden code they must work on.
Not everyone is a poet, but most anybody can write usable documentation for small programs—we don’t expect poets to do this work. Yet we seem to expect that the equivalent of poets will use high-level programming languages, because only program-poets are able to use them. In light of this observation, is it any wonder that abstraction-poor languages like C are by far the most popular and that abstraction-rich ones like Lisp and Smalltalk are niche languages?
Recall that one of the virtues of abstraction is that “the user of the abstraction can forget the details of the implementation of the abstraction.” Nonetheless, the interface of a complex abstraction is likely to expose the implementation because portions of the abstraction’s interface are likely to be dedicated to accessing par- ticular behavior. Inexpertly designed and ad hoc abstractions frequently suffer from this shortcoming. Once the implementation is even partially revealed, it becomes more difficult to change the abstraction’s implementation without caus- ing problems. If abstraction were limited to simple cases, especially those with a fairly universal meaning—like the well-loved and ubiquitous stack, a favorite of almost every paper on abstract types—this problem would be reduced.
Another problem with complex abstraction arises from the observation that
abstractions are about ignorance. The prime idea of encapsulation is that the implementation is hidden, thereby preventing assumptions about the implemen- tation. Some complex abstractions, however, contain information about the implementation that is legitimately required, such as its performance, the algo- rithm, coding tricks, and resource usage—keep in mind that almost all interac- tion issues are about resource conflicts. When the implementation is truly hidden, its user is forced to use real scientific methods to infer or discover the needed information. Rather than preventing assumptions, hard encapsulation tends to guarantee incorrect assumptions.
Furthermore, when the implementation is truly hidden and there is a need to make a change or add behavior, the user is reduced to reinventing; if the imple- mentation is not only hidden but also protected, the user will need to copy and maintain a parallel version. This implies that later changes to the code will be less efficiently made because similar code must be implemented in several places. In fact, the programmer—especially a programmer new to the project—may be unable to find all the occurrences of the similar code.
Notice the dilemma: Complex abstractions sometimes reveal implementation, which limits the opportunity to change their implementations, and they also are intended to hide implementation, which forces programmers (inhabitants) into ignorance and its resulting frustration, ineffectiveness, and feelings of denied responsibility. Strict abstractionists would argue that it is far better to hide the implementation, but consider what Christopher Alexander said:
When people lose the sense of responsibility for the environment they live in, and realize that they are merely cogs in someone else’s machine, how can they feel any sense of identification with the com- munity, or any sense of purpose there? (Alexander 1975)
Why put your programmers through this?
Most of these problems can be eliminated or reduced if the user is encouraged to build small abstractions only. This is easier if the language provides other mechanisms that help the programmer build the larger structures needed in a program, structures that would normally be constructed from abstractions.
When we couple this advice with that of building hierarchies slowly, keeping them shallow as long as possible, we might find that we have paved the way for habitability, piecemeal growth, and healthy, usable compression. In fact, if we put in place an explicit process of piecemeal growth, hierarchies will naturally grow slowly, and they are more likely to be correct, because their use evolves over time. The resulting compression also is natural, not forced, and it is less likely that the compression will backfire.
This same advice is more cynically arrived at by Martin D. Carroll and John F. Isner when discussing the design of the C++ Standard Components developed by UNIX Systems Laboratories:
[W]e take the minimalist approach to inheritance. We use it only when it makes our components more efficient, or when it solves cer- tain problems in the type system.
We do not intend for our components to serve as a collection of base classes that users extend via derivation. It is exceedingly difficult to make a class extensible in abstracto (it is tenfold harder when one is trying to provide classes that are as efficient as possible). Contrary to a common misconception, it is rarely possible for a programmer to
take an arbitrary class and derive a new, useful, and correct subtype from it, unless that subtype is of a very specific kind anticipated by the designer of the base class. Classes can only be made extensible in certain directions, where each of these directions is consciously chosen (and programmed in) by the designer of the class. Class libraries which claim to be “fully extensible” are making an extrava- gant claim which frequently does not hold up in practice. . . . There is absolutely no reason to sacrifice efficiency for an elusive kind of “extensibility.” (Carroll and Isner 1992, emphasis added)
Their argument comes more from efficiency than from the needs of habitabil- ity and piecemeal growth; in fact, their whole essay is about how to achieve what they consider the most important characteristic of a library—its efficiency. But notice that they make a very strong claim that being able to extend a class hierar- chy is nearly impossible, though they don’t say why precisely. One can try to puz- zle out the reason; the best I can come up with is that many possible extensions require access to the internal implementation of a class, and the author of the class either does not know what parts are needed or wishes to retain the opportu- nity to change the implementation and therefore hides the implementation. In
C++ there are other reasons, like the need for speed encourages class authors to make virtual as little as possible.
Regardless of the reasons behind it, if we accept Carroll and Isner’s statement that unanticipated extension is difficult, their statement supports my claim that during the growth of an abstraction web, one must frequently go back to existing abstractions to repair them, such repair being a sort of delayed anticipation of the extensions.
Because abstracting a common pattern and using the abstraction is a process of replacing something directly expressed by a shorthand, adding an abstraction is like adding a new word to a real language: It requires learning a new vocabulary. Real languages rarely permit us to add new words, and such words are reserved for concepts, objects, actions, and qualifications that are basic or newly basic. We don’t invent new words for dogs that can jump through hoops or automobiles modified to carry 300 gallons of fuel. First, there is no need for these words except within a small familylike circle, and even there the need soon dissipates when the name of the dog or a phrase like “Joe’s big fueler” prove worthy and effective.
Second, these words would be meaningless outside the small circle that might need them, and encountering an enclave of odd-language users would negate the advantages of natural language. Similarly, we want to limit the use of abstraction to either common items—common to every programmer—or items inextricably linked to the application domain. Otherwise, unnecessarily abstracted code would be unintelligible to programmers fresh to a project, and hence it would not be habitable.
Van Fraassen hints at this idea: “But in certain cases, no abstraction is possible without losing the very thing we wish to study.” He means that the object of inter- est is captured only by the name of the abstraction rather than by the thing itself. For example, when we speak of redness and go no deeper, redness disappears and all we have left is the name red.
Recall that one of the primary reasons that abstraction is overloved is that a completed program full of the right abstractions is perfectly beautiful—it is “jus- tified by success” as Mach said. Of course, Christopher Alexander would probably not think such a program was beautiful—assuming he would even know how to read a program. Note what he said about completed, planned urban develop- ment:
[P]lanned development is also generally not coherent . . . not in a deep-felt sense. It is supposed to be. But if we ask ourselves whether the final product of current urban design projects actually is coherent in the real, deep sense that we know from traditional towns, then we must say no. The order is superficial, skin deep, only in the plan or in some contrived orderliness of the arrangements. There is no deep inner coherence, which can be felt in every doorway, every step, and every street.
And . . . this modern planned development which we think of as normal, certainly has NO power to evoke deep feeling. It can, at best, ask for some kind of admiration for “design.” But of deep feeling there is no word, not a tremor, not a possibility. (Alexander 1987)
Alexander’s area of study—architecture and urban development—has the advantage of having had a long history and, hence, examples of the sort of habit- able, deeply felt homes and towns and cities with which he can contrast the prod- ucts of the modern design approach. In software there are no large examples that I can point to with which we are all familiar. And so we can ask whether Alex- ander’s words—which sound so nice when apply them to the field of software— pertain in any real sense to computer science. But that is a question for another essay.
Remember, piecemeal growth is the norm, and programs grow, change, and are repaired. Therefore, the perfectly beautiful program is possible only for very small programs, like the ones in books on programming or in programming courses.
Now let’s return to my definition of abstraction. One thing that strikes me about this definition, now that I have used it for years and years, is the degree to