Overview
Murphy's revenge: The more reliable you make a system, the longer it will take you to figure out what's wrong when it breaks.
Sean Donelan, on the NANOG mailing list
Sean Donelan's comment was part of a long thread discussing a massive failure of British Telecom's IP backbone on November 21, 2001. This was neither the first nor the last large network outage. Much of the discussion made reference to other "great" outages that have occurred; a continuing theme was the problems of adequately testing new software in the lab. No matter how many things you stress, the actual deployed environment is always worse. That's when the bugs show up.
System Is Not a Dirty Word
Or at least it needn't be. We grow up in the educational system, and we may think that is an unpromising beginning. Yet, at the end of our formal education, depending on the system in our childhood home, we really are rather well prepared for adulthood. Of course, we could be better prepared; we can easily see that, now—in 20/20 hindsight.
Some educational systems prepare children with a great store of knowledge, imparted with great rigor from a well−versed instructor. Others do less imparting and adopt a more free−wheeling approach, resulting in (perhaps) a less detailed knowledge set but a more flexible thinking process. Systems are incomplete; they include trade−offs.
Perhaps a brief step back is in order—in Chapter 2, "Network Threats," I brought up the idea that a system has independent parts that operate interdependently to make a product or process that is more than just the collection of independent results. It is the interoperation that makes the difference. As the great science fiction author Robert A. Heinlein pointed out frequently, there ain't no such thing as a free lunch (often abbreviated as TANSTAAFL); that is the trade−off part. When NASA introduced the idea of making new planetary probes that were "better, faster, cheaper," wags immediately (and some would say, realistically) added "... pick any two."
Not all components can be maximized or even individually optimized; interference occurs (Carl von Clausewitz called this "friction" in the context of war fighting). When it does, you must choose which performance criteria are more important to you at this time, knowing that this set is fungible. Which criteria you choose will depend on which problems have a higher priority to your business—which is not necessarily the same as those you would choose solely for network performance optimization. Before we can apply those principles to our network security, there is one more topic to address, reflected in the existence of systemic behavior in networks as well as individual computers. Bruce Schneier points out four important characteristics of systems: complexity, interaction, emergent properties, and bugs. We will meet these characteristics over and over again in the process of creating networks that can survive to perform their function. Individually, the properties are all partial reflections of systemic behavior; like the blind men and the elephant, each describes one aspect of a whole that is very much more.
Complexity
Systems are always complex. They have multiple parts, many or most of which have been characterized only by their independent behavior. The parts are put together in some sort of designed structure or structures, which, in turn, have intended input and output relationships. But because of a system's many parts, its final behavior is not always obvious; in fact, it is often not only obscure, but surprising (or even surprisingly obscure).
Interaction
Interaction is the norm when dealing with systems. Part of the reason for the unexpected behaviors cited previously is that the system's components interact in surprising ways. In a fractal−like behavior, the systems themselves also interact in surprising ways, creating a higher order of system with unexpected behaviors. This level of system then becomes a component in an even higher−order system (at some point, one is tempted to call it a supersystem, but next week that could be overcome by yet another overlay of a yet higher level of system).
As an example, we can move from a single computer to a LAN; the single computer as a LAN component has different characteristics (such as file sharing and the ability to cause another component to fail) from those it has as a standalone single unit. The LAN becomes one segment of a building−wide network, which in turn is but one section of a campus network, which is a unit (connected by a WAN) of a continental corporate network, which is a portion of the global network.... We could also scale the connection to a service provider's network, which connects to the Internet. And that will someday be a component of the Interplanetary Internet (planning is already underway; see http://www.ipnsig.org/).
Emergent Properties
Emergent properties does not mean that systems emerge full−blown from something else (like a moth from a cocoon). Rather, it means their behaviors emerge in unanticipated (which is to say, unplanned) directions. Inexpensive automobiles led to a change in the character of American cities—an emergent property that was certainly unexpected by early pioneers such as Henry Ford and Charles Olds. Competitive long distance telephone rates have led to a drastic drop in letter writing, which has significantly reduced the volume of first−class mail, necessitating an increase in postal rates to make up for the lost revenue. And, of course, air conditioning made Washington, D.C. tolerable enough year round to allow continuous governance at the federal level. Emergent properties are surprises; not all of them are pleasant. They result from the interactions of components operating normally but with unexpected interactive effects.
Bugs
Bugs, unfortunately, are the fourth property of systems. In a sense, a bug is an emergent property, but it is one that occurs from a design failure, rather than one that emerges when all components are operating as planned. To make problem solving more interesting, the appearance of the bug may depend on a complex set of conditions that cannot be exactly replicated. Without replication, it will most likely remain a mystery, casting its reporter in a dubious light.
In mathematics, this kind of behavior is often described as "sensitive dependence on initial conditions" and is a description of what has been popularized as chaos. Perfect predictability is possible if a complete description of the initial conditions is given. Unfortunately, that is generally either impossible or more expensive than the value gained from predictability.
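Sensitive dependence on initial conditions is easy to see numerically. The sketch below uses the logistic map, a standard textbook illustration that is an assumption here and not something drawn from any particular network; the starting value 0.2, the offset of one part in ten billion, and the function names are all chosen purely for illustration.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return every value visited."""
    values = [x0]
    for _ in range(steps):
        values.append(r * values[-1] * (1.0 - values[-1]))
    return values


def divergence(x0, delta, steps):
    """Step-by-step gap between two runs whose starting points differ by delta."""
    first = logistic_trajectory(x0, steps)
    second = logistic_trajectory(x0 + delta, steps)
    return [abs(a - b) for a, b in zip(first, second)]


if __name__ == "__main__":
    # Illustrative starting point and offset; any similar pair behaves alike.
    gaps = divergence(0.2, 1e-10, 60)
    for step in (0, 10, 20, 30, 40, 50, 60):
        print(step, gaps[step])
```

The two runs are indistinguishable for the first few steps, then the gap grows until the runs bear no resemblance to each other. A bug whose appearance depends on conditions that cannot be measured that precisely will be just as hard to reproduce.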
Bugs, then, are inevitable. And known bugs (or defects, or deficiencies, or weaknesses) are targeted by your attackers.
Where Opportunity Knocks
Network administrators are often admonished by management to "fix the holes in the network." Unfortunately, there are thousands of possible holes to fix and limited time and other resources with which to fix them. The Common Vulnerabilities and Exposures (CVE) project at Mitre (http://cve.mitre.org/cve/) lists 1,604 accepted CVE and another 1,796 candidates for CVE status (in its update of September 18, 2001). The U.S. National Institute of Standards and Technology (NIST) maintains a metabase of CVE. As of its October 30, 2001, version, its searchable engine had records available on 3,095 CVE.
In the interest of putting the most effort where it will do the most good, the SANS Institute (http://www.sans.org/), in cooperation with the FBI's National Infrastructure Protection Center (NIPC), developed a list of what it called the "Top Twenty" (originally the Top Ten, but more were needed)—the Twenty Most Critical Internet Security Vulnerabilities. These vulnerabilities are the target of a large majority of network attacks.
I prefer not to get in the middle of the UNIX−Windows religious wars; I find advantages and disadvantages to both networking systems. There are also significant security problems with both, as you shall see. SANS groups its Top Twenty vulnerabilities as General (of which there are 7 on the November 15, 2001, list), Windows (6), and UNIX (7). General vulnerabilities are not specific to a particular operating system, or OS. Another way to look at these is that they affect all systems. Even if your company has a pure Windows environment, you have at least 13 serious vulnerabilities to address. If you're pure UNIX, you have 14. And if, like most large businesses, you have some of each, you have 20 you must consider (at a minimum—remember, there are literally thousands more). An automated scanner is available (via hyperlink from SANS or at http://www.cisecurity.org/) to check for these; once you have covered your existing systems, you may wish to set up a recheck schedule for any system that has access to the Internet (which is probably most, if not all, of your systems).
Top General Vulnerabilities
It makes no difference what operating system or combination of operating systems you run; you are vulnerable to these seven general vulnerabilities.
Warning These are far more easily addressed when systems are installed for the first time than by patching afterward. If you choose to work at closing these openings after the fact, it will be expensive in system downtime. Of course, rebuilding a violated and crashed system is expensive, too. And the legal costs of sensitive information having been exposed when it should never have been accessible can be considerable as well.
#1: Default Installations
Default installations are designed to be easy and quick; an easy installation makes it less likely that a user or an inexperienced administrator will leave out anything significant or anything that might be useful later. You could call it the kitchen−sink approach: Install everything they're ever likely to need or want, and throw in the kitchen sink, too, just in case. That way, when the customer needs another feature, it does not matter that he or she has lost (or misplaced, to be generous) the installation CD. With a custom installation, on the other hand, the user selects which features or feature packages to install.
When others are needed, the user/admin must find the original CD and perform another custom installation. A standard or default installation is intended to protect these people from their own ignorance, but at a price.
The problem with this installation is that neither the user nor the administrator has any idea what is installed—and therefore what might need to be patched when a security update is announced. Default installations almost always include services that are extraneous to many users; those services often have open ports through which an attacker can enter.
Especially problematical are Web server installation scripts. Along with a large suite of services, sample scripts are often installed as well; these scripts are virtually never designed with the degree of care that the rest of the program receives. They are especially prone to buffer overflow attacks. To protect against being vulnerable, even through a program designer's noble intentions, install only the minimum services required, and think carefully about what those really are. The Center for Internet Security (the same people who provide the joint survey with the FBI of the damage done by poor security) has compiled a consensus benchmark for a minimum security installation of Windows 2000 and Solaris. This is based on the real experiences of more than 170 organizations, from several countries around the world.
BUFFER OVERFLOWS MEET MURPHY'S LAW
A buffer overflow is self−descriptive: A buffer has received more information than it can handle. Buffer overflows are commonly used to attack systems and gain privileged access (this is a software design problem that should not exist, but does, in all operating systems).
Murphy's Law, of course, tells us that whatever can go wrong, will. A corollary is that it will go wrong in the worst possible way. When an input buffer for a process overflows (for instance, it receives 536 bytes of data for a 512−byte storage space), it generally allows the excess data to overwrite the next space in memory. An overflow of 24 bytes is probably not too dangerous, but the buffer overflow can be far larger than our example. In fact, creating buffer overflows takes careful programming based on detailed knowledge of the OS in order to get the right information to flow over into the right spots in memory. Implementing a buffer overflow, by contrast, requires only running a readily available script.
The result of a buffer overflow attack is often a completely compromised host: the attacker has gained the most powerful set of privileges available to this host. The highly skilled script creator rarely wastes time creating a tool that does not return the best possible value for his or her investment of time and knowledge.
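The 536−into−512 example can be modeled in a few lines. This is only a simulation under stated assumptions: Python checks its own memory accesses, so a bytearray stands in for raw memory, and the 32−byte neighboring region and all function names are hypothetical. It shows the 24 excess bytes landing in the adjacent region when the copy is unchecked, and the same input being refused when the length is checked first.

```python
BUFFER_SIZE = 512    # storage space from the example in the text
ADJACENT_SIZE = 32   # hypothetical neighboring region, chosen for illustration


def make_memory():
    """Simulated memory: an empty buffer followed directly by other data."""
    return bytearray(BUFFER_SIZE) + bytearray(b"S" * ADJACENT_SIZE)


def unchecked_copy(memory, data):
    """Write data at the start of memory without checking the buffer size.

    Returns the number of bytes that spilled past the buffer. The simulation
    assumes data is no longer than the whole simulated memory.
    """
    memory[0:len(data)] = data
    return max(0, len(data) - BUFFER_SIZE)


def checked_copy(memory, data):
    """Refuse any input that does not fit inside the buffer."""
    if len(data) > BUFFER_SIZE:
        raise ValueError(
            f"input of {len(data)} bytes exceeds {BUFFER_SIZE}-byte buffer"
        )
    memory[0:len(data)] = data


if __name__ == "__main__":
    memory = make_memory()
    spilled = unchecked_copy(memory, b"A" * 536)
    print(spilled, "bytes overwrote the adjacent region")
```

The length check in checked_copy is the whole defense; the design problem described above is simply its absence.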
If you are going to start revising your concept of how to install software, this is a good place to begin.
#2: Accounts with Weak/No Passwords
Yes, Virginia, such accounts still exist. The most common account password in North America is "password" (as previously mentioned). Many programs come with a default password on an administrator−level/root−level account. Hackers know these accounts and passwords, and those are the first accounts they test.
Note that I did not say active, only available; when someone leaves the company, voluntarily or otherwise, is that person's account deactivated? Everywhere? Can you be sure, because you have good records of every account created on every host on your entire network? What about the default accounts created with the default installation script (which was run before you knew better)? Look in places you might not immediately think of as a holder of an account: Anything that can be configured remotely can be accessed remotely and normally has a standard account for maintenance purposes. Check routers, printers, copiers, and so on. Printers are well known for providing superuser−level access to UNIX networks. You may wish to run a password−cracking program against your network: Several programs are readily available via reputable sources (SANS lists several in its discussion of this weakness). Use these with care, and with management's official support.
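One way to approach the record−keeping question is to compare the accounts found on each host with a roster of people and services that should still have access. The sketch below assumes you have already gathered both lists by some other means; the host names, account names, and the function name are hypothetical.

```python
def find_orphaned_accounts(accounts_by_host, authorized_users):
    """Map each host to the sorted accounts that no one on the roster owns."""
    authorized = {name.lower() for name in authorized_users}
    orphans = {}
    for host, accounts in accounts_by_host.items():
        extra = sorted({name.lower() for name in accounts} - authorized)
        if extra:
            orphans[host] = extra
    return orphans


if __name__ == "__main__":
    # Hypothetical inventory: note the printer and its maintenance account.
    inventory = {
        "fileserver": ["alice", "bob", "guest"],
        "printer-3": ["admin"],
        "router-1": ["alice"],
    }
    roster = ["Alice", "Bob"]
    for host, names in find_orphaned_accounts(inventory, roster).items():
        print(host, names)
```

Anything the comparison reports is either a departed user, a default account, or a gap in the roster, and each of those deserves a look.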
In fact, plan your password policy with management participation. I once interviewed with a company for a networking position; it was revisiting its password policy at that time. The company had had concerns regarding easy and old passwords and therefore implemented a 30−day life on passwords, made them more rigorous, and tolerated no exceptions. Unfortunately, the new policy was implemented for everyone on the same day; 30 days later the CEO ordered the policy scrapped when he couldn't get into his computer.
I accepted another company's offer.
#3: Nonexistent or Incomplete Backups
We all know we should make them. We especially remember, with a sick feeling in the pit of our stomach, when something goes wrong and we face the prospect of having lost some of our work. The only way to limit the damage from an attack is to be able to return to a status quo ante bellum (literally, for those who can parse a bit of Latin to go with their binary).
It is not enough to merely make backups; you must know how to restore from them (when you desperately need the data is not a good time to be looking at the manual for the first time—if you can find it). Remember the properties of systems: They have emergent (unexpected) behaviors. You need to validate that you can indeed restore usable files, both program and data, from your backups before a crisis occurs. Otherwise, you have just increased the cost of that crisis by several factors, starting with recreating the data as best you collectively can after reloading all the software. The backups should be off−site (that means, at the least, not in the same building or the building next door); it may be inconvenient, but it is safer if the data is at least 10 kilometers away, and this is a case where more is better. The backups, of course, contain the same valuable information as the originals: Protect them physically at least as well as the originals.
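Validating a restore can be as simple as restoring to a scratch location and comparing checksums against the originals. The following is a minimal sketch, assuming both trees are reachable as ordinary directories; the function names are hypothetical, and a real check would also need to confirm that restored programs actually run.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original_dir, restored_dir):
    """List (relative path, problem) pairs for files missing or changed."""
    original_dir, restored_dir = Path(original_dir), Path(restored_dir)
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems.append((relative.as_posix(), "missing"))
        elif sha256_of(source) != sha256_of(restored):
            problems.append((relative.as_posix(), "content differs"))
    return problems
```

An empty list means every original file came back byte for byte; anything else is far cheaper to discover during a drill than during a crisis.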
#4: Large Numbers of Open Ports
When I run the netstat command on a workstation on a large corporate network, I am not surprised to see a list of 12 to 15 open ports, usually listening. Identifying which ports are really necessary is more difficult. Ports 0 through 1023 are the well−known ports, reserved for privileged and other processes important to a network. Ports 1024 through 49151 are the registered ports, which vendors register for their processes. Ports 49152 through 65535 are dynamic and/or private ports.
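The three ranges translate directly into a lookup that can label each line of a port listing. This is a small sketch; the function name and the label strings are illustrative choices.

```python
def classify_port(port):
    """Return the range name for a TCP or UDP port number."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port {port} is outside 0-65535")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"


if __name__ == "__main__":
    for port in (80, 1433, 50000):
        print(port, classify_port(port))
```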
Because hackers violate your network's privacy, it is not unexpected that they do not necessarily restrict their activities to dynamic ports. In fact, several attacks use signature ports in the registered ports range (for the latest on these ports, see www.iana.org/assignments/port-numbers). You should periodically scan the open ports on your network hosts. An interesting comparison occurs when you run netstat and then a port scanner against the same host; the results ought to be the same, but a port scanner will often find more ports open than netstat reports. Several port scanners are available via security Web sites; you must scan the entire range (0–65535) for both TCP and UDP.
Before scanning ports, again, have explicit permission (in writing is safest). Some implementations of the TCP/IP stack do not respond well to a scan. Likewise, if you have an intrusion detection system (IDS) on your network, it will notice and trigger an alarm. Depending on your firewall configuration, it, too, may signal an alarm.
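With permission in hand, the netstat−versus−scanner comparison can be sketched as follows. This is a minimal illustration, not a replacement for a real port scanner: it covers TCP only (UDP has no connection handshake and needs different handling), the function names are hypothetical, and the port lists passed to the comparison are assumed to have been collected separately.

```python
import socket


def tcp_port_is_open(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def unreported_ports(scanned_open, netstat_listening):
    """Ports the scan found open that the host's own listing omitted."""
    return sorted(set(scanned_open) - set(netstat_listening))


if __name__ == "__main__":
    # Probe a handful of ports on the local machine only.
    for port in (22, 80, 443):
        print(port, tcp_port_is_open("127.0.0.1", port))
```

Any port returned by unreported_ports is one that answers on the wire but that the host does not admit to, which is exactly the discrepancy worth investigating.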
Once you realize the number of open ports in your system, you will probably become very interested in closing as many of them as possible. That starts, just as the software installation thought did, with the idea of deciding what you need on this host and installing only that (or allowing only that port to be opened).
#5: Not Filtering for Correct Ingress/Egress Addresses
Ingress traffic is entering your network; egress traffic is leaving it. Why would traffic entering your network have an IP source address inside your network? It shouldn't—there is no reason for your internal traffic to loop outside the network and reenter. Likewise, there is no reason for traffic originating inside your network to have any IP source address other than one associated with your network.
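The two rules reduce to a single test on the source address. The sketch below uses Python's standard ipaddress module; the function name is hypothetical, and the address blocks in the example are reserved documentation ranges standing in for your own network.

```python
import ipaddress


def should_drop(direction, source_ip, internal_networks):
    """Apply the ingress/egress source-address rule to one packet.

    Ingress traffic must not claim an internal source address;
    egress traffic must not claim anything else.
    """
    source = ipaddress.ip_address(source_ip)
    is_internal = any(
        source in ipaddress.ip_network(network) for network in internal_networks
    )
    if direction == "ingress":
        return is_internal
    if direction == "egress":
        return not is_internal
    raise ValueError(f"unknown direction: {direction!r}")


if __name__ == "__main__":
    internal = ["192.0.2.0/24"]  # documentation range used as a stand-in
    print(should_drop("ingress", "192.0.2.10", internal))   # spoofed inbound
    print(should_drop("egress", "198.51.100.7", internal))  # spoofed outbound
```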
If either of these events is occurring, someone is spoofing and using your network as part of the attack (you may be the honored target or just have one or more zombies—another term for daemons or slaves—on your network). Firewalls and access filters on edge routers (routers at the