Following a review of the three principles of adaptive honeynets, this section identifies the requirements for adaptive honeynets for attribution. Most of these requirements are unique to the design of adaptive honeypots and differ from those typical of static honeypots. Where the two overlap, the adaptive case usually makes the requirement more demanding.

Observation and adaptation transparency

The observation technique should be transparent to both the adversary and the system being monitored. This reduces the risk of the honeypot being detected, subverted or disabled by an adversary. It is best achieved by making no source code changes to the underlying system (e.g. the kernel) or to programs (e.g. software binaries). Selecting an appropriate observation technique underpins this requirement. Adaptation should be equally transparent, which in turn depends on selecting an appropriate adaptation technique.

Two of the observation techniques reviewed earlier in this chapter best meet this requirement: VMI and network monitoring. Network monitoring, an external technique, does not offer the granularity of VMI, since it can only capture network traffic, which may be encrypted. VMI is therefore the most suitable single candidate. The best outcome, however, is to use network monitoring in combination with VMI.

Observation integrity

The observation technique should offer high integrity in the data that is collected. If an observation technique can be detected, subverted or disabled, then an adversary could falsify observations, and adaptations could be based on those false observations. An adversary could ultimately game the system. This requirement is best met by selecting the observation techniques that are least likely to be detected, subverted or disabled. VMI is the best single candidate, and combining VMI with network monitoring is better still.

Observation and adaptation diversity

An observation technique must be able to collect a wide variety of data that meets operator requirements, for example adversary keystrokes, running processes and active network connections. The observation technique should also extend to a variety of operating systems and distributions; a technique that can only observe Linux keystrokes is inherently limited. Similarly, an adaptation technique should be able to adapt various operating system environments.

Observation and adaptation performance

The observation technique must not be detrimental to system performance. Ideally its overhead should be minimal, so that an adversary using timing-based detection techniques cannot detect it. The same applies to adaptation techniques.

Accessible interpretation

The framework should provide hook points for interpretation techniques. Observation data should be easily accessible, e.g. stored in a common format such as a database. There should also be an accessible hook point to queue adaptations. Together these offer a full feedback loop for interpretation techniques, for example machine learning techniques such as reinforcement learning (RL), supervised learning (SL) or case-based reasoning (CBR). Figure 4.7 shows typical reinforcement learning components and interactions (pybrain.org, 2014).

An agent collects observations of the environment, and then performs actions on the environment, which are determined by rewards. When applied to an adaptive honeynet framework, observations are collected, the agent is the interpretation engine and the actions are adaptations that take place on the environment, i.e. the honeynet. Hook points provide access to observations and a mechanism to implement actions, while rewards are calculated by the machine learning technique.
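The feedback loop described above can be sketched in Python. This is only an illustration under stated assumptions, not the framework's implementation: the `ObservationStore` and `AdaptationQueue` hook points, the action names and the reward (the number of new observations seen after an action) are all hypothetical, and a simple epsilon-greedy value estimate stands in for a full reinforcement learning method.

```python
import random
from collections import defaultdict, deque


class ObservationStore:
    """Hypothetical hook point: read access to collected observations."""

    def __init__(self):
        self._rows = []

    def record(self, observation):
        self._rows.append(observation)

    def fetch_since(self, index):
        return self._rows[index:]


class AdaptationQueue:
    """Hypothetical hook point: adaptations waiting to be executed."""

    def __init__(self):
        self.pending = deque()

    def enqueue(self, action):
        self.pending.append(action)


class InterpretationAgent:
    """Epsilon-greedy agent that prefers the action with the best mean reward."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = defaultdict(float)  # running mean reward per action
        self.count = defaultdict(int)    # times each action was rewarded

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.value[a])

    def learn(self, action, reward):
        # Incremental update of the mean reward for this action.
        self.count[action] += 1
        self.value[action] += (reward - self.value[action]) / self.count[action]


def step(agent, store, queue, cursor, last_action=None):
    """One pass of the loop: observe, reward the last action, queue the next."""
    new_observations = store.fetch_since(cursor)
    if last_action is not None:
        # Assumed reward: how much adversary activity followed the last action.
        agent.learn(last_action, reward=len(new_observations))
    action = agent.choose()
    queue.enqueue(action)
    return cursor + len(new_observations), action
```

Calling `step` repeatedly, passing back the returned cursor and action, closes the loop: the store supplies observations, the agent interprets them, and the queue receives the adaptation to apply to the honeynet.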

Similarly, the framework should be written in an accessible programming language that allows interpretation techniques to connect freely. Closed-source approaches are less useful in this respect, while open-source, high-level languages such as Python are preferred.

Actor-dependent adaptations

The immersive principles discussed earlier are achieved in honeypots by adaptations that are triggered by the actor, rather than by the system or the environment. The technique should learn how and why the adversary wishes to compromise the environment, understand their skill level and then adapt accordingly.

Figure 4.7: Reinforcement learning components (pybrain.org, 2014)

Available adaptations

A framework should provide adequate adaptations that are appropriate to the environment space. The relevant parts of the environment must be exposed so that adaptation techniques can change them. Similarly, adaptations should be applicable to multiple operating systems. Previous research has experimented with five adaptations to interactive command-line input (Wagener et al., 2009):

• Success; no adaptation is made

• Failure; simulated failure

• Substitute; a response is substituted with another valid response

• Lag; the result is returned to the adversary in a different time period than is usual

• Insult; the adversary is insulted to attempt to determine their dialect and/or locale

This research proposes and focuses on an additional adaptation:

• Modify; the environment is modified. Modifications can be persistent or volatile. Persistent modifications include creating users, folders, directories, decoys, etc. Volatile modifications cover all items that may be classed as volatile data, including processes, sessions, etc.

Combinations of the above are also possible, although not every combination is valid: success, failure and substitute are mutually exclusive, but each may be combined with lag and modify. This research also recognises that adaptations need not take place immediately, as they did in previous research. Instead, the environment can be modified slowly over time.
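The combination rule can be expressed as a small validity check. The sketch below is illustrative only: the `Adaptation` flag type and `is_valid_combination` are hypothetical names, and because the text does not say how insult combines with the other adaptations, the check leaves it unconstrained (an assumption).

```python
from enum import Flag, auto


class Adaptation(Flag):
    """The five earlier adaptations plus the proposed modify adaptation."""
    SUCCESS = auto()
    FAILURE = auto()
    SUBSTITUTE = auto()
    LAG = auto()
    INSULT = auto()
    MODIFY = auto()


# Outcomes that must be kept separate from one another.
EXCLUSIVE = (Adaptation.SUCCESS, Adaptation.FAILURE, Adaptation.SUBSTITUTE)


def is_valid_combination(combo):
    """Return True if at most one of success, failure and substitute is set."""
    chosen = [outcome for outcome in EXCLUSIVE if outcome in combo]
    return len(chosen) <= 1
```

For example, `Adaptation.FAILURE | Adaptation.LAG | Adaptation.MODIFY` passes the check, while `Adaptation.SUCCESS | Adaptation.FAILURE` does not.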

Queue, priority, discard and modify

It should be assumed that a honeynet containing multiple honeypots will generate many adaptations. A queuing system should allow adaptations to be invoked sequentially. It must also be assumed that, in real time, some adaptations will be considered more important than others. A priority system should therefore allow urgent adaptations to join the front of the queue. For example, suppose the adaptation queue currently holds five adaptations awaiting execution in the environment. The interpretation phase created these adaptations from observations that are loosely linked to an implied goal with some confidence. Seconds later, the interpretation phase processes a new set of observations, and this time there is a much stronger link between the observations and a different implied goal. The adaptations arising from the stronger link should be able to move ahead of those already waiting. It must also be assumed that adaptations can be discarded if they achieve the same goal as another adaptation or render other adaptations implausible or redundant. New adaptations must therefore be compared with those currently in the queue for similarities and contradictions.
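A queue with priority and discard behaviour could be sketched as follows. This is a minimal illustration, not a prescribed design: the class and field names are hypothetical, interpretation confidence is assumed to serve as the priority, and "similar" is reduced to an exact match on goal and action.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedAdaptation:
    priority: float                    # lower value is executed sooner
    sequence: int                      # tie-breaker: first in, first out
    goal: str = field(compare=False)   # implied adversary goal
    action: str = field(compare=False)


class PriorityAdaptationQueue:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def push(self, goal, action, confidence):
        """Queue an adaptation; discard it if an identical one is waiting."""
        if any(q.goal == goal and q.action == action for q in self._heap):
            return False
        # Negate confidence so that higher confidence sorts to the front.
        item = QueuedAdaptation(-confidence, next(self._counter), goal, action)
        heapq.heappush(self._heap, item)
        return True

    def discard_goal(self, goal):
        """Drop every waiting adaptation tied to a goal that is now redundant."""
        self._heap = [q for q in self._heap if q.goal != goal]
        heapq.heapify(self._heap)

    def pop(self):
        """Remove and return the most urgent adaptation."""
        return heapq.heappop(self._heap)

    def __len__(self):
        return len(self._heap)
```

In the scenario above, the five low-confidence adaptations would sit in the heap until a higher-confidence adaptation for the new goal is pushed, at which point `pop` returns the new one first; `discard_goal` can then remove the older entries if they have become implausible.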

Measurement

The success or failure of the adaptive cycle must be measurable. This is trivially achieved by recording the duration that the adversary stayed on the system, the quantity of interactions and the diversity of interactions. The quantity of sessions could also be measured if it is possible to accurately identify when an adversary returns.
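These three measures can be computed from a session log, as in the sketch below. The `Interaction` record and the choice to measure diversity as the number of distinct command names are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Interaction:
    timestamp: datetime
    command: str


def session_metrics(interactions):
    """Return duration, quantity and diversity for one adversary session."""
    if not interactions:
        return {"duration_seconds": 0.0, "quantity": 0, "diversity": 0}
    times = [i.timestamp for i in interactions]
    # Assumed diversity measure: distinct command names, ignoring arguments.
    names = {i.command.split()[0] for i in interactions if i.command.strip()}
    return {
        "duration_seconds": (max(times) - min(times)).total_seconds(),
        "quantity": len(interactions),
        "diversity": len(names),
    }
```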

There are other ways to measure success. For example, the possible paths of the honeypot could be plotted and updated as adaptations are executed. Success is determined by the particular paths that the adversary explores, i.e. ideally the adversary would explore all new paths created by adaptive features. This could be, for example, visualised as an overhead perspective of a maze or a heat map, with explored paths highlighted.
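Path-based success can be reduced to a coverage figure: the fraction of adaptively created paths that the adversary actually explored. The sketch below assumes paths are represented as plain strings and uses the hypothetical function name `path_coverage`.

```python
def path_coverage(adaptive_paths, explored_paths):
    """Return (fraction of adaptive paths explored, set of unexplored paths)."""
    adaptive = set(adaptive_paths)
    if not adaptive:
        return 0.0, set()
    explored = adaptive & set(explored_paths)
    return len(explored) / len(adaptive), adaptive - explored
```

The unexplored set is the part that a maze or heat map view would leave unhighlighted.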

Reset to initial state

It must be assumed that the system will be visited by multiple adversaries over a period of time. The system must be able to reset to a default or given state so that it can be re-used. This process should be as simple as possible and preferably automated to reduce operator burden.

Legal and ethical implications

Legal and ethical implications are a typical consideration in the design of static honeypots and, by inheritance, apply to the design of adaptive honeynets for attribution. The properties of adaptive honeynets magnify this requirement, which warrants further consideration. This is particularly important where active defence has been proposed as part of an adaptive honeypot, i.e. honeypots that can automatically respond by attacking back. The active response continuum offers guidance here (Dittrich and Himma, 2005). Honeypots that attack back, known as offensive honeypots, rank highly on the continuum. For this reason they should not be used in the design of adaptive honeynets.
