Chapter 4. Conclusions and recommendations
4.2. General conclusions and recommendations
Ghosh et al. [66] formulate verification of functional requirements of distributed automotive control systems as a planning problem. The verification problems can be thought of as examples of pseudo-adversarial problems, in which the planning agent is the environment and the adversary is the control system, acting according to its specification. The overall purpose of controller verification is to find bugs in the controller.
Figure 5.3: Sokoban. From https://en.wikipedia.org/wiki/Sokoban#/media/File:Sokoban_ani.gif
5.4. PSEUDO-ADVERSARIAL DOMAINS 105
The environment disturbs the state of the system, while the controller returns the system to one of many safe states. If the control is distributed, then the controller’s move typically consists of an orchestrated set of actions across the system components. The components’ actions may execute in different sequences due to non-determinism arising from task scheduling and communication latencies between the components. Ghosh et al. note that it is important to guarantee that the control is correct in spite of this non-determinism. For this reason, they allow the environment to exploit this non-determinism in the controller by choosing the order of applicable control actions. The environment may only take actions when the controller is in a control stable state, meaning that no control action is applicable (an environment action may change the state to one that is not control stable). This reflects the fact that control actions are applied faster than the pace at which events in the environment occur. Whenever the environment changes the state of the system, the controller executes one or more applicable control actions to return the system to a (possibly different) control stable state. A control fail is a state in which some of the safety requirements are violated. If a control fail state is reached, it demonstrates the failure of the controller specification (i.e. the specification contains a bug and needs to be changed).
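The interaction described above can be sketched as a small explicit-state search. This is only an illustrative sketch, not the method of Ghosh et al.: the `Action` representation, the function names and the toy pressure/valve system are all hypothetical. The environment moves only in control stable states, and after each environment move every possible ordering of applicable control actions is explored, which is how the environment "chooses" the order.

```python
from collections import deque
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    pre_true: frozenset   # atoms that must hold
    pre_false: frozenset  # atoms that must not hold
    add: frozenset        # atoms made true
    delete: frozenset     # atoms made false


def act(name, pre_true=(), pre_false=(), add=(), delete=()):
    return Action(name, frozenset(pre_true), frozenset(pre_false),
                  frozenset(add), frozenset(delete))


def applicable(a, state):
    return a.pre_true <= state and not (a.pre_false & state)


def apply_action(a, state):
    return (state - a.delete) | a.add


def stabilise(state, ctl):
    """Return (stable_state, control_sequence) pairs reachable from `state`
    by applying control actions in any order until none is applicable."""
    results, stack, seen = [], [(state, ())], {state}
    while stack:
        s, seq = stack.pop()
        enabled = [a for a in ctl if applicable(a, s)]
        if not enabled:
            results.append((s, seq))  # control stable state
        for a in enabled:
            t = apply_action(a, s)
            if t not in seen:
                seen.add(t)
                stack.append((t, seq + (a.name,)))
    return results


def find_counterexample(init, env, ctl, is_fail):
    """Breadth-first search over control stable states.
    Returns a list of action names reaching a control fail state, or None."""
    queue = deque(stabilise(init, ctl))
    seen = {s for s, _ in queue}
    while queue:
        s, plan = queue.popleft()
        if is_fail(s):
            return list(plan)
        for e in env:
            if applicable(e, s):
                for t, seq in stabilise(apply_action(e, s), ctl):
                    if t not in seen:
                        seen.add(t)
                        queue.append((t, plan + (e.name,) + seq))
    return None


# Hypothetical toy system: after a pressure surge the valve must be opened
# before the controller resets, but the specification does not enforce that order.
env = [act("surge", pre_false={"pressure"}, add={"pressure"})]
ctl = [
    act("open_valve", pre_true={"pressure"}, pre_false={"valve"}, add={"valve"}),
    act("reset", pre_true={"pressure"}, add={"handled"}, delete={"pressure"}),
]
fixed_ctl = [
    ctl[0],
    act("reset", pre_true={"pressure", "valve"}, add={"handled"}, delete={"pressure"}),
]


def is_fail(state):
    return "handled" in state and "valve" not in state


plan = find_counterexample(frozenset(), env, ctl, is_fail)
print(plan)  # ['surge', 'reset']
print(find_counterexample(frozenset(), env, fixed_ctl, is_fail))  # None
```

In the toy system the counterexample exists only because the environment may pick `reset` before `open_valve`; strengthening the precondition of `reset` removes it, which mirrors the kind of specification fix the verification is meant to suggest.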
The problem can be expressed as a classical planning problem, using STRIPS. The goal of the problem is to reach any of the control fail states; in other words, the existence of a plan for the problem is a counter-example for the safety requirement which is violated in that control fail state. The actions are divided into two disjoint sets, environment actions $A_{env}$ and controller actions $A_{ctl}$. A Boolean variable $en_a$ is added for each action $a$, which indicates that $a$ may be applicable. A set of conditions $\{en_a = \mathit{false} \mid a \in A_{ctl}\}$ is added to the precondition of every environment action (that is, the environment takes actions only when no controller action is applicable) and to the goal (no controller action is applicable in the control fail state). A “disabling” action $d_l$ for each literal $l$ that appears in the precondition of some control action $a$, with $\mathit{pre}(d_l) = \{l = \mathit{false}\}$ and $\mathit{eff}(d_l) = \{en_a = \mathit{false} \mid l \in \mathit{pre}(a)\}$, is used to mark control actions inapplicable.
Each (environment and control) action that potentially contributes to making $\mathit{pre}(a)$ true (i.e. that sets any variable to a value required by $\mathit{pre}(a)$) sets $en_a = \mathit{true}$, so the plan must include disabling actions before each environment action to verify its applicability. Because the compiled problems are hard for the planners they tried, Ghosh et al. also propose an incremental, partial compilation coupled with a plan repair approach.
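The compilation steps described above can be sketched in a few lines. This is a hedged illustration of the stated rules only, not the actual encoding of Ghosh et al.: the `Action` class, the `en_<name>` and `disable_<var>_<val>` naming scheme and the pressure/valve example are assumptions, and details that the text does not state (such as the initial values of the enabling variables) are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    pre: dict  # variable -> required Boolean value
    eff: dict  # variable -> assigned Boolean value


def en(a):
    """Name of the Boolean variable en_a for control action a (hypothetical naming)."""
    return f"en_{a.name}"


def contributes(act, a):
    """True if `act` sets some variable to a value required by pre(a)."""
    return any(a.pre.get(var) == val for var, val in act.eff.items())


def compile_problem(env, ctl, fail_condition):
    none_enabled = {en(a): False for a in ctl}
    compiled = []
    tagged = [(a, True) for a in env] + [(a, False) for a in ctl]
    for act, is_env in tagged:
        pre = dict(act.pre)
        if is_env:
            # Environment actions require that no control action is enabled.
            pre.update(none_enabled)
        eff = dict(act.eff)
        # Any action that may make pre(a) true sets en_a = true.
        eff.update({en(a): True for a in ctl if contributes(act, a)})
        compiled.append(Action(act.name, pre, eff))
    # One disabling action d_l per literal l in some control precondition.
    literals = sorted({(var, val) for a in ctl for var, val in a.pre.items()})
    for var, val in literals:
        compiled.append(Action(
            name=f"disable_{var}_{val}",
            pre={var: not val},  # the literal l is false
            eff={en(a): False for a in ctl if a.pre.get(var) == val},
        ))
    # The goal is a control fail condition in which no control action is enabled.
    goal = {**fail_condition, **none_enabled}
    return compiled, goal


# Hypothetical toy input.
env = [Action("surge", {"pressure": False}, {"pressure": True})]
ctl = [
    Action("open_valve", {"pressure": True, "valve": False}, {"valve": True}),
    Action("reset", {"pressure": True}, {"pressure": False, "handled": True}),
]
actions, goal = compile_problem(env, ctl, {"handled": True, "valve": False})
by_name = {a.name: a for a in actions}
print(by_name["surge"].pre)
print(by_name["disable_pressure_True"].eff)
```

Note how the planner is left to decide when to apply `disable_pressure_True` and `disable_valve_False`; these extra choices are part of what makes the compiled problems larger.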
However, the requirement that no control action is applicable in a control stable state can be easily formulated using axioms:
$\mathit{stable} \leftarrow \{en_a = \mathit{false} \mid a \in A_{ctl}\}$
$en_a \leftarrow \mathit{pre}(a)$
Although these rules mirror almost exactly the actions in the compilation by Ghosh et al., making them axioms instead of disabling actions removes the choice of when and which disabling actions to apply from the planner, resulting in a smaller state space and shorter plans. This is similar to the effect that using axioms has in the Sokoban domain – removing unnecessary choices makes the problem easier. The effects of this will be discussed in Section 5.7.2.
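To illustrate why axioms remove choices, the following minimal sketch evaluates the two rules directly from a state. It is an assumption-laden illustration: the dictionary-based state, the function names and the toy preconditions are hypothetical, and variables missing from the state are treated as false.

```python
def holds(pre, state):
    """True if every literal of the precondition is satisfied in `state`.
    Variables missing from `state` are treated as false (an assumption)."""
    return all(state.get(var, False) == val for var, val in pre.items())


def derive(state, ctl_pre):
    """Evaluate the axioms en_a <- pre(a) and stable <- (all en_a are false)."""
    derived = {f"en_{name}": holds(pre, state) for name, pre in ctl_pre.items()}
    derived["stable"] = not any(derived.values())
    return derived


# Hypothetical control preconditions.
ctl_pre = {
    "open_valve": {"pressure": True, "valve": False},
    "reset": {"pressure": True},
}
print(derive({"pressure": True, "valve": True}, ctl_pre))
# {'en_open_valve': False, 'en_reset': True, 'stable': False}
print(derive({"pressure": False, "valve": False}, ctl_pre))
# {'en_open_valve': False, 'en_reset': False, 'stable': True}
```

The derived values are a function of the state alone, so a planner never has to decide whether or when to compute them, in contrast to explicit disabling actions.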
Examples
Ghosh et al. present two examples from the automotive industry: a car door lock and an adaptive cruise control (ACC) system. We will use their PDDL encodings of these two domains in our experiments (see Section 5.7 for the experiments and Appendix B for the domains). We will describe the door lock system as a simple example. The system is supposed to ensure that all doors are locked when a car attains a pre-calibrated speed. In the example problem, the control system’s actions are: (i) if the car switches from the low-speed state to the high-speed state, the system arms the auto-lock procedure; (ii) if the auto-lock procedure is armed and all doors are closed, the system automatically locks all doors and disarms; (iii) if the remote unlock command is detected and the doors are locked, the system arms the auto-unlock procedure; and (iv) if the doors are locked and the auto-unlock procedure is armed, the system unlocks all locked doors and disarms the auto-unlock. The environment’s (i.e. the driver’s) actions include opening and closing the doors, putting the key into the ignition, running the engine, accelerating the car and issuing a remote unlock command. The environment can achieve the goal (i.e. violate the safety requirements) by getting the car into the high-speed state and arming the auto-unlock. This shows that the system’s action (iii) should have not-moving-at-high-speed added to its preconditions. With the preconditions rewritten this way, the environment’s goal is not achievable.
The ACC system is a driver assistance feature designed to automatically adjust the vehicle’s speed. It operates in two modes: speed control mode, in which it maintains the vehicle’s speed $v_{car}$ at some chosen $v_{ref}$, and time gap
mode, in which it maintains a safe distance (or time gap) between the vehicle and any other nearby vehicles. In the speed control mode the controller’s actions are: (i) accelerate if $v_{car} < v_{ref}$; (ii) decelerate if $v_{car} > v_{ref}$; (iii) maintain constant speed if $v_{car} = v_{ref}$; and (iv) switch to the time gap mode if there are other vehicles within the predetermined time gap $t_2$. In the time gap mode the controller’s actions are: (i) accelerate if there is a vehicle directly in front of the car within the time gap $t_2$ that is going faster than $v_{car}$ and $v_{car} < v_{ref}$; (ii) decelerate if there are vehicles within the time gap $t_2$ that are slower than $v_{car}$; (iii) maintain speed if the vehicle directly in front is going at the same speed as $v_{car}$; (iv) switch to speed control mode if there are no vehicles within $t_2$; and (v) if there are any vehicles within the pre-fixed time gap $t_1$, warn the driver with an audible signal. The environment actions include the driver’s actions and the actions of other vehicles (such as switching lanes, accelerating or decelerating). The safety requirements are: (i) if the ACC is engaged and the warning signal is on, the control is also applying negative acceleration and (ii) if the ACC is engaged and there is a vehicle within the $t_2$ time gap whose speed is slower than $v_{car}$, then $v_{car}$ must not be increasing.
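The speed control mode rules and the two safety requirements above can be written as simple predicates. This is a minimal sketch under stated assumptions and not the PDDL encoding of Ghosh et al.: the `AccState` fields, the use of a signed numeric acceleration and the action labels are hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccState:
    engaged: bool
    warning_on: bool
    acceleration: float             # signed; negative means decelerating (assumption)
    slower_vehicle_within_t2: bool


def violated_requirements(s):
    """Return the labels of the safety requirements violated in state `s`."""
    violated = []
    # (i) engaged and warning on => negative acceleration is being applied
    if s.engaged and s.warning_on and not s.acceleration < 0:
        violated.append("i")
    # (ii) engaged and slower vehicle within t2 => v_car must not be increasing
    if s.engaged and s.slower_vehicle_within_t2 and s.acceleration > 0:
        violated.append("ii")
    return violated


def speed_control_actions(v_car, v_ref, vehicle_within_t2):
    """Applicable controller actions in speed control mode."""
    actions = []
    if v_car < v_ref:
        actions.append("accelerate")
    if v_car > v_ref:
        actions.append("decelerate")
    if v_car == v_ref:
        actions.append("maintain")
    if vehicle_within_t2:
        actions.append("switch_to_time_gap_mode")
    return actions


print(violated_requirements(AccState(True, True, 0.5, True)))   # ['i', 'ii']
print(violated_requirements(AccState(True, True, -1.0, True)))  # []
print(speed_control_actions(20.0, 25.0, True))
# ['accelerate', 'switch_to_time_gap_mode']
```

The last call returns two simultaneously applicable control actions, which is the kind of situation in which the environment is allowed to pick the order of execution.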
The authors performed the experiments on the ACC domain problems using a number of model checkers (NuSMV [26] and SPIN [86]), SAT planners and heuristic state-space planners with and without support for derived predicates. They report that, among the tested approaches, the model checker SPIN and the planners Mp [122], Fast Downward [77] and LAMA [121] show the most promise of scaling to larger problems. The planners, however, generate much shorter plans than SPIN, which makes debugging easier (unlike us, they do not perform optimal planning, so the solutions can be of varying lengths).