We need a centralized architecture to aggregate all of the information we collect. For organizations with only one or two deployed honeypots, this is not a challenge: data can simply be logged on the local system and retrieved from there. However, some organizations may deploy multiple honeypots across a variety of networks, many of them in different geographic locations. For such deployments, we need a way to centrally manage the honeypots and collect all the captured data. One reason to centralize is that data management becomes much easier. You go to one point to retrieve the data, one point for backups and archiving, and one point for data maintenance. This simplifies the entire data capture process. Another reason is that combining the data from various honeypots can increase the data's value. The collected information can be combined for data mining, statistical modeling, and trend analysis. For example, an attacker may penetrate an organization's internal networks. Numerous honeypots may detect this activity, but it would be difficult to combine and use this information if the captured data sits in different locations and potentially in different formats. If all the data is in a central location, the attacker can be identified quickly and potentially tracked down.
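To make the value of combined data concrete, here is a minimal sketch of correlating events from two honeypots that log in different formats. The record layouts, field names, and function names are assumptions made for illustration; they do not come from any particular honeypot product.

```python
from collections import defaultdict

def normalize_a(rec):
    # Hypothetical Honeypot A logs dictionaries with short keys.
    return {"honeypot": "A", "src_ip": rec["src"], "ts": rec["ts"], "event": rec["event"]}

def normalize_b(rec):
    # Hypothetical Honeypot B logs (source IP, timestamp, event) tuples.
    src_ip, ts, event = rec
    return {"honeypot": "B", "src_ip": src_ip, "ts": ts, "event": event}

def correlate(events):
    """Group normalized events by source IP, ordered by timestamp."""
    by_source = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts"]):
        by_source[event["src_ip"]].append(event)
    return dict(by_source)

def seen_by_multiple(by_source):
    """Return source IPs that were observed by more than one honeypot."""
    return sorted(
        ip for ip, events in by_source.items()
        if len({e["honeypot"] for e in events}) > 1
    )

# Illustrative data only (addresses are from documentation ranges).
logs_a = [{"src": "203.0.113.7", "ts": 1000, "event": "http probe"}]
logs_b = [("203.0.113.7", 1060, "port scan"), ("198.51.100.2", 1100, "port scan")]

events = [normalize_a(r) for r in logs_a] + [normalize_b(r) for r in logs_b]
by_source = correlate(events)
suspects = seen_by_multiple(by_source)
```

Once every record shares one shape and lives in one place, a question such as "which source touched more than one honeypot?" becomes a single pass over the data.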
One method for a data management architecture is to create a separate honeypot management and logging network, especially for low-interaction honeypots. All the deployed honeypots have a second interface used exclusively for management and logging purposes. This ensures that the data is logged over a secure network to a central location. This same network is then used to also remotely manage all the honeypots. The challenge with adding a management network is that you have to ensure that no security mechanisms are bypassed. If you have honeypots on different networks with different levels of trust, such as an untrusted DMZ network and a trusted internal network, you have to ensure that the management network does not bypass any network access control devices, such as firewalls. If an attacker were to compromise a honeypot on the DMZ network, you have to ensure that the attacker cannot use the honeypot management network to attack internal honeypots, potentially gaining access to the internal network. This would allow the attacker to bypass the production firewall. Figure 12-4 shows an example of a logging architecture deployment using a second firewall on the management network for access control.
Figure 12-4. Dedicated honeypot management network, separated by a firewall. The dotted lines are the separate management network.
In Figure 12-4, we see two honeypots, Honeypot A and Honeypot B. Honeypot A is a high-interaction production honeypot that mirrors the Web server. Honeypot B is a low-interaction production honeypot, used to detect attackers who have penetrated the firewall. Both honeypots log locally and to the remote log server. The remote logging happens over a second interface on each honeypot, connected to the management network. A firewall separates the honeypots on the management network, so if the high-interaction honeypot is compromised, the attacker has to go through that firewall to attack either the logging server or other honeypots. This prevents an attacker from gaining access to the DMZ and then bypassing the Internet firewall via the management network. The management firewall allows the honeypots to send information to the log server, but it does not allow the DMZ honeypots to communicate with the honeypots on the internal network. We also have the sniffer system on the same management network, so we can remotely manage it and collect data from the system. Both the low-interaction honeypot and the sniffer are on the same management network, since they are both low-risk systems.
Whatever architecture you choose, make sure that honeypots on different networks of trust are also segmented on the management network.
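The access rule described above can be sketched as a default-deny policy check. This is only an illustration of the logic, not a firewall configuration; the zone names and the `is_allowed` function are hypothetical.

```python
# Flows permitted across the management firewall. Anything not listed is denied.
# Zone names are assumptions chosen to match the roles described in the text.
ALLOWED_FLOWS = {
    ("dmz_honeypot", "log_server"),
    ("internal_honeypot", "log_server"),
}

def is_allowed(src_zone, dst_zone):
    """Return True only if the flow is explicitly permitted (default deny)."""
    return (src_zone, dst_zone) in ALLOWED_FLOWS

# Honeypots may send logs to the log server.
dmz_can_log = is_allowed("dmz_honeypot", "log_server")

# A compromised DMZ honeypot must not reach honeypots on the internal network.
dmz_can_reach_internal = is_allowed("dmz_honeypot", "internal_honeypot")
```

The point of the default-deny design is that a new segment added to the management network reaches nothing until a flow is explicitly granted.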
For the log server, you will want to have some type of functionality where all the honeypot logs can be centrally stored and retrieved. The challenge here is that you may have different honeypots with different data capture capabilities in different formats. You need some method of collecting divergent data types. One of the best ways to approach this is a database system.
The database has a variety of tables that can handle different data and logging types, such as logs generated by syslogd, commercial honeypots, and firewalls. How you implement this depends on your organization, requirements, and data types. There are several open source solutions that give you this functionality; two examples are ACID [2] and Demarc [3]. There are also commercial solutions, such as NetForensics [4]. Whatever solution you choose, make sure it has the flexibility to work with different data types and can be used to query the collected data.
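As a rough illustration of one table per log type with a combined query, here is a small sketch using Python's built-in sqlite3 module. The table and column names are assumptions made for this example and do not reflect the schema of ACID, Demarc, or any other product.

```python
import sqlite3

def build_log_store():
    """Create an in-memory database with one table per log type."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE syslog_events (
            honeypot TEXT, ts INTEGER, src_ip TEXT, message TEXT
        );
        CREATE TABLE firewall_events (
            honeypot TEXT, ts INTEGER, src_ip TEXT, dst_port INTEGER, action TEXT
        );
    """)
    return conn

def activity_for(conn, src_ip):
    """Return all events for one source IP across both tables, oldest first."""
    return conn.execute("""
        SELECT ts, honeypot, 'syslog' AS kind, message AS detail
          FROM syslog_events WHERE src_ip = ?
        UNION ALL
        SELECT ts, honeypot, 'firewall', action || ' port ' || dst_port
          FROM firewall_events WHERE src_ip = ?
        ORDER BY ts
    """, (src_ip, src_ip)).fetchall()

conn = build_log_store()
# Illustrative rows only.
conn.execute("INSERT INTO firewall_events VALUES ('B', 1000, '203.0.113.7', 22, 'deny')")
conn.execute("INSERT INTO syslog_events VALUES ('A', 1060, '203.0.113.7', 'failed login')")
conn.execute("INSERT INTO syslog_events VALUES ('A', 1100, '198.51.100.2', 'failed login')")
timeline = activity_for(conn, "203.0.113.7")
```

Keeping each log type in its own table preserves format-specific fields, while a shared source address and timestamp let one query build a single timeline.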
Using NAT
Network Address Translation is a tool we can use when deploying our honeypots. NAT, as it is commonly called, is functionality usually implemented on network routers or firewalls. The purpose of NAT is to translate one IP address or port to another. Before we look at the role of NAT in optimizing honeypots, you need to understand a bit about why NAT exists and how it works.
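At its simplest, the translation step can be pictured as a lookup table from one address and port pair to another. The sketch below shows only that idea; the table contents and the `translate` function are hypothetical, and a real NAT device also tracks connection state and rewrites reply traffic.

```python
# Static mapping from an external (IP, port) pair to an internal (IP, port) pair.
# Addresses are illustrative only.
NAT_TABLE = {
    ("192.0.2.10", 80): ("10.0.0.5", 8080),
}

def translate(dst_ip, dst_port, table=NAT_TABLE):
    """Return the translated destination, or the original if no rule matches."""
    return table.get((dst_ip, dst_port), (dst_ip, dst_port))

translated = translate("192.0.2.10", 80)
untouched = translate("192.0.2.10", 443)
```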