As Brian Kernighan once observed, “Controlling complexity
is the essence of computer programming” [Kernighan-Plauger]. Debugging dominates
development time, and getting a working system out the door is
usually less a result of brilliant design than it is of managing not
to trip over your own feet too many times.
Assemblers, compilers, flowcharting, procedural programming,
structured programming, “artificial intelligence”,
fourth-generation languages, object
orientation, and software-development
methodologies without number have been touted and sold as a cure for
this problem. All have failed as cures, if only because they
‘succeeded’ by escalating the normal level of program
complexity to the point where (once again) human brains could barely
cope. As Fred Brooks famously observed [Brooks], there is no silver bullet.
The only way to write complex software that won't fall on its
face is to hold its global complexity down — to build it out of
simple parts connected by well-defined interfaces, so that most
problems are local and you can have some hope of upgrading
a part without breaking the whole.
It's hard to avoid programming overcomplicated monoliths if none
of your programs can talk to each other.
Unix tradition strongly encourages writing programs that
read and write simple, textual, stream-oriented, device-independent
formats. Under classic Unix, as many programs as possible are written
as simple filters, which take a simple text
stream on input and process it into another simple text stream on
output.
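A minimal sketch of such a filter, written in Python for brevity. The names `number_lines` and `run_filter` and the sample text are invented for illustration. The point is only that the filter consumes a stream of text lines and produces a stream of text lines, without knowing what is attached to either end.

```python
import io
import sys
from typing import Iterable, Iterator


def number_lines(lines: Iterable[str]) -> Iterator[str]:
    """Hypothetical filter: prefix each input line with its line number."""
    for count, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        yield f"{count}\t{text}\n"


def run_filter(source=sys.stdin, sink=sys.stdout) -> None:
    """Connect the filter to any pair of text streams (stdin/stdout by default)."""
    for out_line in number_lines(source):
        sink.write(out_line)


# Demonstration on in-memory streams; a command-line tool would call run_filter().
demo_out = io.StringIO()
run_filter(io.StringIO("alpha\nbeta\n"), demo_out)
print(demo_out.getvalue(), end="")
```

Because `run_filter` takes its streams as parameters, the same code serves a pipe, a file, or a test harness.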
Despite popular mythology, this practice is favored not because
Unix programmers hate graphical user interfaces. It's because if you
don't write programs that accept and emit simple text streams, it's
much more difficult to hook the programs together.
Text streams are to Unix tools as messages are to objects in an
object-oriented setting. The simplicity of the text-stream interface
enforces the encapsulation of the tools. More elaborate forms of
inter-process communication, such as remote procedure calls, show a
tendency to involve programs with each other's internals too
much.
To make programs composable, make them independent. A program
on one end of a text stream should care as little as possible about
the program on the other end. It should be made easy to replace one
end with a completely different implementation without disturbing the
other.
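One way to picture this independence is a chain of stages that share nothing but a stream of lines. The sketch below is an in-process analogy to a shell pipeline, not a description of how pipes are implemented; the stage names and the `pipeline` helper are hypothetical.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator

Filter = Callable[[Iterable[str]], Iterator[str]]


def strip_comments(lines: Iterable[str]) -> Iterator[str]:
    """Drop lines whose first non-blank character is '#'."""
    return (line for line in lines if not line.lstrip().startswith("#"))


def upper_case(lines: Iterable[str]) -> Iterator[str]:
    """Convert every line to upper case."""
    return (line.upper() for line in lines)


def sort_lines(lines: Iterable[str]) -> Iterator[str]:
    """Emit the lines in sorted order."""
    return iter(sorted(lines))


def pipeline(source: Iterable[str], *stages: Filter) -> Iterator[str]:
    """Feed each stage's output stream into the next, like a shell pipe."""
    return reduce(lambda stream, stage: stage(stream), stages, iter(source))


text = ["# header\n", "pear\n", "apple\n"]
shouted = list(pipeline(text, strip_comments, upper_case))
ordered = list(pipeline(text, strip_comments, sort_lines))
print(shouted)
print(ordered)
```

Swapping `upper_case` for `sort_lines` requires no change to `strip_comments`, because neither stage knows anything about the other beyond the line stream between them.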
GUIs can be a very good thing. Complex binary data formats are
sometimes unavoidable by any reasonable means. But before writing a
GUI, it's wise to ask if the tricky interactive parts of your
program can be segregated into one piece and the workhorse
algorithms into another, with a simple command stream or
application protocol connecting the two. Before devising a tricky
binary format to pass data around, it's worth experimenting to see
if you can make a simple textual format work and accept a little
parsing overhead in return for being able to hack the data stream
with general-purpose tools.
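As an illustration of how little a textual format can cost, here is a sketch of a colon-separated record layout, one record per line. The field names in `FIELDS` and the sample records are made up, and the sketch assumes that no field value contains a colon or a newline.

```python
from typing import Dict, List

FIELDS = ("name", "uid", "shell")  # hypothetical record layout


def dump_records(records: List[Dict[str, str]]) -> str:
    """Serialize records as one colon-separated line each."""
    return "".join(":".join(rec[f] for f in FIELDS) + "\n" for rec in records)


def load_records(text: str) -> List[Dict[str, str]]:
    """Parse the same format back; the overhead is one split per line."""
    return [dict(zip(FIELDS, line.split(":")))
            for line in text.splitlines() if line]


users = [
    {"name": "ken", "uid": "101", "shell": "/bin/sh"},
    {"name": "dmr", "uid": "102", "shell": "/bin/sh"},
]
dumped = dump_records(users)
print(dumped, end="")
```

A stream in this shape can be inspected, filtered, and repaired with an ordinary editor or line-oriented tools, which is the payoff bought by the parsing overhead.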
When a serialized, protocol-like interface is not natural for the
application, proper Unix design is to at least organize as many of the
application primitives as possible into a library with a well-defined
API. This opens up the possibility that the application can be called
by linkage, or that multiple interfaces can be glued on it for different tasks.
(We discuss these issues in detail in Chapter 7.)
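A small sketch of the library-first arrangement, with invented names throughout: `word_frequencies` stands in for an application primitive that does no I/O, and `as_report` and `as_json` stand in for two interfaces glued onto it.

```python
import json
from collections import Counter
from typing import Dict


def word_frequencies(text: str) -> Dict[str, int]:
    """Library primitive: no I/O and no formatting, just the computation."""
    return dict(Counter(text.lower().split()))


def as_report(text: str) -> str:
    """One interface: a line-oriented report for people and text tools."""
    freqs = word_frequencies(text)
    return "".join(f"{word} {count}\n" for word, count in sorted(freqs.items()))


def as_json(text: str) -> str:
    """Another interface over the same primitive, for programs that want JSON."""
    return json.dumps(word_frequencies(text), sort_keys=True)


print(as_report("the cat the hat"), end="")
print(as_json("the cat the hat"))
```

Either interface can be replaced, or a third added, without touching the primitive.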
In our discussion of what Unix gets wrong, we observed that the
designers of X made a basic decision to implement
“mechanism, not policy”—to make X a generic
graphics engine and leave decisions about user-interface style to
toolkits and other levels of the system. We justified this by
pointing out that policy and mechanism tend to mutate on different
timescales, with policy changing much faster than mechanism. Fashions
in the look and feel of GUI toolkits may come and go, but raster
operations and compositing are forever.
Thus, hardwiring policy and mechanism together has two bad
effects: It makes policy rigid and harder to change in response to user
requirements, and it means that trying to change policy has a strong
tendency to destabilize the mechanisms.
On the other hand, by separating the two we make it possible to
experiment with new policy without breaking mechanisms. We also make
it much easier to write good tests for the mechanism (policy,
because it ages so quickly, often does not justify the investment).
This design rule has wide application outside the GUI
context. In general, it implies that we should look for ways to
separate interfaces from engines.
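A toy illustration of the split, with every name hypothetical: `draw_box` is the engine and knows only how to frame a label, while the `Style` values carry the look-and-feel decisions that are expected to change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Style:
    """Policy: look-and-feel choices, expected to change often."""
    corner: str
    horizontal: str
    vertical: str


def draw_box(label: str, style: Style) -> str:
    """Mechanism: frames a label without deciding which characters look good."""
    bar = style.corner + style.horizontal * (len(label) + 2) + style.corner
    middle = f"{style.vertical} {label} {style.vertical}"
    return "\n".join([bar, middle, bar])


PLAIN = Style(corner="+", horizontal="-", vertical="|")
HEAVY = Style(corner="#", horizontal="=", vertical="#")

print(draw_box("OK", PLAIN))
print(draw_box("OK", HEAVY))
```

Adding a third style touches no mechanism code, and tests written against `draw_box` stay valid when styles come and go.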
One way to effect that separation is, for example, to write
your application as a library of C service routines that are driven by an
embedded scripting language, with the application flow of control
written in the scripting language rather than C. A classic example of
this pattern is the Emacs editor, which uses
an embedded Lisp interpreter to control editing
primitives written in C. We discuss this style of design in Chapter 11.
Another way is to separate your application into cooperating
front-end and back-end processes communicating through a specialized
application protocol over sockets; we discuss this kind of design in Chapter 5 and Chapter 7. The front end implements policy; the
back end, mechanism. The global complexity of the pair will often be
far lower than that of a single-process monolith implementing the same
functions, reducing your vulnerability to bugs and lowering life-cycle
costs.
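The front-end/back-end split can be sketched in a few lines. Two assumptions keep the example self-contained: the two halves run as threads joined by `socket.socketpair()` rather than as separate processes, and the one-verb `ADD a b` line protocol is invented for the occasion.

```python
import socket
import threading


def back_end(conn: socket.socket) -> None:
    """Mechanism: answer 'ADD a b' requests, one text line each, until EOF."""
    with conn, conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:
        for line in reader:
            words = line.split()
            try:
                if len(words) != 3 or words[0] != "ADD":
                    raise ValueError("unknown request")
                reply = f"OK {int(words[1]) + int(words[2])}"
            except ValueError as err:
                reply = f"ERR {err}"
            writer.write(reply + "\n")
            writer.flush()


def front_end(conn: socket.socket, a: int, b: int) -> str:
    """Policy: decides what to ask for and how to present the answer."""
    with conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:
        writer.write(f"ADD {a} {b}\n")
        writer.flush()
        status, _, value = reader.readline().strip().partition(" ")
    return f"{a} + {b} = {value}" if status == "OK" else f"request failed: {value}"


client, server = socket.socketpair()
worker = threading.Thread(target=back_end, args=(server,))
worker.start()
answer = front_end(client, 2, 3)
client.close()   # end-of-file tells the back end to finish
worker.join()
print(answer)
```

Everything the two halves know about each other is in the text lines they exchange, so either side could be rewritten, or moved to another process or machine, without disturbing the other.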
Many pressures tend to make programs more complicated (and
therefore more expensive and buggy). One such pressure is technical
machismo. Programmers are bright people who are (often justly) proud
of their ability to handle complexity and juggle abstractions. Often
they compete with their peers to see who can build the most intricate
and beautiful complexities. Just as often, their ability to design
outstrips their ability to implement and debug, and the result is
expensive failure.
    The notion of “intricate and beautiful complexities” is
    almost an oxymoron. Unix programmers vie with each other
    for “simple and beautiful” honors, a point that's
    implicit in these rules, but is well worth making overt.
        -- Doug McIlroy
Even more often (at least in the commercial software world)
excessive complexity comes from project requirements that are based on
the marketing fad of the month rather than the reality of what
customers want or software can actually deliver. Many a good design
has been smothered under marketing's pile of “checklist
features” — features that, often, no customer will ever
use. And a vicious circle operates; the competition thinks it has to
compete with chrome by adding more chrome. Pretty soon, massive bloat
is the industry standard and everyone is using huge, buggy programs
not even their developers can love.
Either way, everybody loses in the end.
The only way to avoid these traps is to encourage a software
culture that knows that small is beautiful, that actively resists
bloat and complexity: an engineering tradition that puts a high
value on simple solutions, that looks for ways to break program systems up
into small cooperating pieces, and that reflexively fights attempts to
gussy up programs with a lot of chrome (or, even worse, to design
programs around the chrome).
That would be a culture a lot like Unix's.
Software is said to be robust when
it performs well under unexpected conditions which stress the designer's
assumptions, as well as under normal conditions.
Most software is fragile and buggy because most programs are too
complicated for a human brain to understand all at once. When you
can't reason correctly about the guts of a program, you can't be sure
it's correct, and you can't fix it if it's broken.
It follows that the way to make robust programs is to make their
internals easy for human beings to reason about. There are two main
ways to do that:
transparency and
simplicity.
    For robustness, designing in tolerance for unusual or extremely
    bulky inputs is also important. Bearing in mind the Rule of
    Composition helps; input generated by other programs is notorious for
    stress-testing software (e.g., the original Unix C compiler reportedly
    needed small upgrades to cope well with Yacc output). The forms
    involved often seem useless to humans. For example, accepting empty
    lists/strings/etc., even in places where a human would seldom or never
    supply an empty string, avoids having to special-case such situations when
    generating the input mechanically.
        -- Henry Spencer
One very important tactic for being robust under odd inputs is
to avoid having special cases in your code. Bugs often lurk in the
code for handling special cases, and in the interactions among parts
of the code intended to handle different special cases.
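A small sketch of code with no special cases to hide bugs in. The functions `render_call` and `total_length` are hypothetical; the point is that the empty list travels down the same path as every other input, so mechanically generated input cannot stumble into a forgotten branch.

```python
from typing import Iterable, Sequence


def render_call(name: str, args: Sequence[str]) -> str:
    """Format a call expression; zero arguments need no separate branch."""
    joined = ", ".join(args)
    return f"{name}({joined})"


def total_length(strings: Iterable[str]) -> int:
    """Sum the lengths; an empty input simply sums to zero."""
    return sum(len(s) for s in strings)


print(render_call("f", ["x", "y"]))
print(render_call("f", []))
print(total_length([]))
```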
We observed above that software is
transparent when you can look at it and
immediately see what is going on. It is simple
when what is going on is uncomplicated enough for a human brain to
reason about all the potential cases without strain. The more
your programs have both of these qualities, the more robust they will
be.
Modularity (simple parts, clean interfaces) is a way to organize
programs to make them simpler. There are other ways to fight for
simplicity. Here's another one.
Software should be
transparent in the way that it fails, as well
as in normal operation. It's best when software can cope with
unexpected conditions by adapting to them, but the worst kinds of bugs
are those in which the repair doesn't succeed and the problem quietly
causes corruption that doesn't show up until much later.
Therefore, write your software to cope with incorrect inputs and
its own execution errors as gracefully as possible. But when it
cannot, make it fail in a way that makes diagnosis of the problem as
easy as possible.
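A sketch of what failing in a diagnosable way can look like. The function `parse_port`, its `source` parameter, and the `server.conf:12` location are invented for illustration. Harmless noise such as surrounding whitespace is tolerated; anything else stops immediately with a message that says where the bad value came from and what it was.

```python
def parse_port(text: str, source: str = "<input>") -> int:
    """Accept surrounding whitespace; reject anything that is not a valid port."""
    cleaned = text.strip()
    if not (cleaned.isascii() and cleaned.isdigit()):
        raise ValueError(f"{source}: port must be a decimal integer, got {text!r}")
    port = int(cleaned)
    if not 1 <= port <= 65535:
        raise ValueError(f"{source}: port {port} is outside 1..65535")
    return port


print(parse_port(" 8080\n"))

try:
    parse_port("80a", source="server.conf:12")
except ValueError as err:
    message = str(err)
    print(message)
```

The alternative, quietly substituting a default port, is the kind of repair that fails silently and surfaces much later as a problem somewhere else.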
Consider also Postel's Prescription:[10]
“Be liberal in what you accept, and conservative in what you
send”. Postel was speaking of network service programs, but the
underlying idea is more general. Well-designed programs cooperate
with other programs by making as much sense as they can from
ill-formed inputs; they either fail noisily or pass strictly clean and
correct data to the next program in the chain.
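A minimal sketch of the prescription applied to a single boolean setting. The accepted spellings in `TRUE_WORDS` and `FALSE_WORDS` are an assumption chosen for the example, not a standard: the reader accepts several of them, the writer emits exactly one canonical form, and input that cannot be interpreted fails noisily.

```python
TRUE_WORDS = {"1", "true", "yes", "on"}     # hypothetical accepted spellings
FALSE_WORDS = {"0", "false", "no", "off"}


def read_flag(text: str) -> bool:
    """Liberal in what we accept: case, padding, and several spellings."""
    word = text.strip().lower()
    if word in TRUE_WORDS:
        return True
    if word in FALSE_WORDS:
        return False
    raise ValueError(f"cannot interpret {text!r} as a flag")


def write_flag(value: bool) -> str:
    """Conservative in what we send: exactly one spelling per value."""
    return "true" if value else "false"


for raw in (" Yes ", "OFF", "1"):
    print(write_flag(read_flag(raw)))
```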
However, heed also this warning:
    The original HTML documents recommended “be generous in
    what you accept”, and it has bedeviled us ever since because
    each browser accepts a different superset of the specifications. It
    is the specifications that should be generous,
    not their interpretation.
        -- Doug McIlroy
McIlroy adjures us to design for
generosity rather than compensating for inadequate standards with
permissive implementations. Otherwise, as he rightly points
out, it's all too easy to end up in tag soup.
In the early minicomputer days of Unix, valuing programmer time over machine time was still a fairly
radical idea (machines were a great deal slower and more expensive
then). Nowadays, with every development shop and most users (apart
from the few modeling nuclear explosions or doing 3D movie
animation) awash in cheap machine cycles, it may seem too obvious
to need saying.
Somehow, though, practice doesn't seem to have quite caught up
with reality. If we took this maxim really seriously throughout
software development, most applications would be written in
higher-level languages like Perl, Tcl, Python, Java,
Lisp and
even shell — languages that ease the programmer's burden by
doing their own memory management (see [Ravenbrook]).
And indeed this is happening within the Unix world, though
outside it most applications shops still seem stuck with the
old-school Unix strategy of coding in C (or C++). Later in this book
we'll discuss this strategy and its tradeoffs in detail.
One other obvious way to conserve programmer time is to teach
machines how to do more of the low-level work of programming. This
leads to...
The most basic argument for prototyping first is Kernighan &
Plauger's: “90% of the functionality delivered now
is better than 100% of it delivered never”. Prototyping first
may help keep you from investing far too much time for marginal
gains.
For slightly different reasons, Donald
Knuth
(author of The Art of Computer Programming, one
of the field's few true classics) popularized the observation that
“Premature optimization is the root of all
evil”.[11]
And he was right.
Rushing to optimize before the bottlenecks are known may be the
only error to have ruined more designs than feature creep. From
tortured code to incomprehensible data layouts, the results of
obsessing about speed or memory or disk usage at the expense of
transparency and simplicity are everywhere. They spawn innumerable
bugs and cost millions of man-hours — often, just to get marginal
gains in the use of some resource much less expensive than
debugging time.
Disturbingly often, premature local optimization actually
hinders global optimization (and hence reduces overall performance).
A prematurely optimized portion of a design frequently interferes with
changes that would have much higher payoffs across the whole design,
so you end up with both inferior performance and excessively complex
code.
In the Unix world there is a long-established and very explicit
tradition (exemplified by Rob Pike's comments above and Ken
Thompson's
maxim about brute force) that says: Prototype, then
polish. Get it working before you optimize it. Or: Make it
work first, then make it work fast. ‘Extreme programming’ guru
Kent Beck,
operating in a different culture, has usefully amplified this to:
“Make it run, then make it right, then make it
fast”.
The thrust of all these quotes is the same: get your design
right with an un-optimized, slow, memory-intensive implementation
before you try to tune. Then, tune systematically, looking for the
places where you can buy big performance wins with the smallest
possible increases in local complexity.
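Tuning systematically starts with measurement. The sketch below uses Python's standard `cProfile` and `pstats` modules; the workload functions (`slow_total`, `cheap_step`, `workload`) are contrived stand-ins, and the timings reported will differ from machine to machine.

```python
import cProfile
import io
import pstats
from typing import Callable


def slow_total(n: int) -> int:
    """Deliberately naive: builds a whole list just to sum it."""
    return sum([i * i for i in range(n)])


def cheap_step(n: int) -> int:
    """A step that costs almost nothing and is not worth tuning."""
    return n + 1


def workload() -> int:
    return slow_total(200_000) + cheap_step(10)


def profile_report(func: Callable[[], object], limit: int = 5) -> str:
    """Run func under the profiler and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


report = profile_report(workload)
print(report)
```

Only after a report like this points at `slow_total` is there a reason to spend any local complexity on it; `cheap_step` can stay as plain as it is.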
    Prototyping is important for system design as well as
    optimization; it is much easier to judge whether a prototype
    does what you want than it is to read a long specification. I
    remember one development manager at Bellcore who fought against the
    “requirements” culture years before anybody talked about
    “rapid prototyping” or “agile development”. He
    wouldn't issue long specifications; he'd lash together some combination of
    shell scripts and awk code that did roughly what was needed, tell the
    customers to send him some clerks for a few days, and then have the
    customers come in and look at their clerks using the prototype and
    tell him whether or not they liked it. If they did, he would say
    “you can have it industrial strength so-many-months from now at
    such-and-such cost”. His estimates tended to be accurate, but
    he lost out in the culture to managers who believed that requirements
    writers should be in control of everything.
        -- Mike Lesk
Using prototyping to learn which features you don't have to
implement helps optimization for performance; you don't have to
optimize what you don't write. The most powerful optimization tool in
existence may be the delete key.
    One of my most productive days was throwing away 1000 lines of
    code.
        -- Ken Thompson
(We'll go into a bit more depth about related ideas in Chapter 12.)
If it is unwise to trust other people's claims for “one
true way”, it's even more foolish to believe them about your
own designs. Never assume you have the final answer. Therefore,
leave room for your data formats and code to grow; otherwise, you
will often find that you are locked into unwise early choices because
you cannot change them while maintaining backward compatibility.
When you design protocols or file formats, make them sufficiently
self-describing to be extensible. Always, always
either include a version number, or compose the format from
self-contained, self-describing clauses in such a way that new clauses
can be readily added and old ones dropped without confusing
format-reading code. Unix experience tells us that the marginal
extra overhead of making data layouts self-describing is paid back
a thousandfold by the ability to evolve them forward without
breaking things.
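A sketch of a reader for a format built from self-describing `key: value` clauses plus a version number. The clause names, the `SUPPORTED_VERSION` constant, and the sample record are invented for illustration. Unknown clauses are carried along rather than rejected, so a newer writer can add clauses without confusing this reader.

```python
from typing import Dict

SUPPORTED_VERSION = 1  # hypothetical version this reader understands


def parse_record(text: str) -> Dict[str, str]:
    """Parse 'key: value' clauses; unknown keys are kept, not rejected."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed clause: {line!r}")
        fields[key.strip().lower()] = value.strip()
    if "version" not in fields:
        raise ValueError("record has no version clause")
    version = int(fields["version"])
    if version > SUPPORTED_VERSION:
        raise ValueError(
            f"format version {version} is newer than {SUPPORTED_VERSION}")
    return fields


# 'Color' is a clause this reader has never heard of; it passes through.
record = parse_record("Version: 1\nName: widget\nColor: red\n")
print(record["name"])
```

Code that only needs `name` keeps working when later writers add clauses, and the version check gives a clean failure when the format changes incompatibly.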
When you design code, organize it so future developers
will be able to plug new functions into the architecture without
having to scrap and rebuild the
architecture.
This rule is not a license to add features you don't yet need; it's
advice to write your code so that adding features later when you
do need them is easy.
Make the joints flexible, and put “If you ever need
to...” comments in your code. You owe this grace to people who
will use and maintain your code after you.
You'll be there in the future too, maintaining code you may have
half forgotten under the press of more recent projects. When you
design for the future, the sanity you save may be your own.
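One common way to keep a joint flexible is a dispatch table, sketched here with invented names (`COMMANDS`, `command`, `dispatch`). New commands plug in by registration, and the “If you ever need to...” comment marks the one place a foreseeable extension would touch. Note that the extension itself is left unwritten.

```python
from typing import Callable, Dict, List

Handler = Callable[[List[str]], str]
COMMANDS: Dict[str, Handler] = {}


def command(name: str) -> Callable[[Handler], Handler]:
    """Register a handler under `name`; no other code needs to change."""
    def register(func: Handler) -> Handler:
        COMMANDS[name] = func
        return func
    return register


@command("echo")
def echo(args: List[str]) -> str:
    return " ".join(args)


@command("count")
def count(args: List[str]) -> str:
    return str(len(args))


def dispatch(line: str) -> str:
    # If you ever need per-command help text, store (handler, help) pairs
    # in COMMANDS; this function is the only place that reads the table.
    words = line.split()
    if not words:
        raise ValueError("empty command line")
    handler = COMMANDS.get(words[0])
    if handler is None:
        raise ValueError(f"unknown command: {words[0]!r}")
    return handler(words[1:])


print(dispatch("echo hello world"))
print(dispatch("count a b c"))
```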