CHAPTER 5: IMPROVEMENT PROPOSAL
5.2. Proposal for the Treatment of Disergonomic Risks
So far, we have shown that if there exists a solution $(T, D)$ of $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{MinMax}}(\Gamma_{\mathrm{BR}})$ such that $T$ is regionally simple and $D$ is regionally constant, then $(\widetilde{T}, \widetilde{D})$ is a solution of $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{MinMax}}(\Gamma)$. In this section, using a strategy improvement algorithm, we give a constructive proof that for every boundary region automaton $\Gamma_{\mathrm{BR}}$ there exists a solution $(T, D) \models \mathrm{OE}^{\mathrm{RT}}_{\mathrm{MinMax}}(\Gamma_{\mathrm{BR}})$ such that $T$ is regionally simple and $D$ is regionally constant.
We begin by defining a special class of strategies in boundary region automata, which we call regionally constant positional strategies. These strategies are of interest because all strategies appearing in our strategy improvement algorithm are regionally constant positional strategies. In Subsection 5.4.2 we define optimality equations $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{Max}}(\Gamma_{\mathrm{BR}})$ and $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{Min}}(\Gamma_{\mathrm{BR}})$ characterising the maximum and minimum reachability-price, respectively, in a boundary region automaton. In Subsection 5.4.3 we present a strategy improvement algorithm to solve $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{Max}}(\Gamma_{\mathrm{BR}})$, which is called as a subroutine in the strategy improvement algorithm presented in Subsection 5.4.4 to solve $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{MinMax}}(\Gamma_{\mathrm{BR}})$.
5.4.1. Positional Strategies in Boundary Region Automata
A positional strategy for player Max in a boundary region automaton $\Gamma_{\mathrm{BR}}$ is a function $\chi : S_{\mathrm{Max}} \to \mathcal{M}$ such that for every $s \in S_{\mathrm{Max}}$ we have $\chi(s) = ([s], \alpha, R)$, for some $\alpha \in A$ and $R \in \mathcal{R}$. A strategy $\chi : S_{\mathrm{Max}} \to \mathcal{M}$ is regionally constant if for all $s, s' \in S_{\mathrm{Max}}$, we have that $[s] = [s']$ implies $\chi(s) = \chi(s')$; we can then write $\chi([s])$ for $\chi(s)$. Positional strategies for player Min are defined analogously. We write $\Delta_{\mathrm{Max}}$ and $\Delta_{\mathrm{Min}}$ for the sets of positional strategies for players Max and Min, respectively.
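If $\chi \in \Delta_{\mathrm{Max}}$ is regionally constant then we define the strategy subgraph $\Gamma_{\mathrm{BR}}{\restriction}\chi$ to be the subgraph $(\mathcal{R}, \mathcal{M}_{\chi})$, where $\mathcal{M}_{\chi} \subseteq \mathcal{M}$ consists of: all moves $(R, \alpha, R') \in \mathcal{M}$ such that $R \in \mathcal{R}_{\mathrm{Min}}$; and all moves $m = (R, \alpha, R')$ such that $R \in \mathcal{R}_{\mathrm{Max}}$ and $\chi(R) = m$. The strategy subgraph $\Gamma_{\mathrm{BR}}{\restriction}\mu$ for a regionally constant positional strategy $\mu \in \Delta_{\mathrm{Min}}$ for player Min is defined analogously.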
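We say that $R \in \mathcal{R}$ is choiceless in a boundary region automaton $\Gamma_{\mathrm{BR}}$ if $R$ has a unique successor in $\Gamma_{\mathrm{BR}}$. We say that $\Gamma_{\mathrm{BR}}$ is 0-player if all $R \in \mathcal{R}$ are choiceless in $\Gamma_{\mathrm{BR}}$; we say that $\Gamma_{\mathrm{BR}}$ is 1-player if either all $R \in \mathcal{R}_{\mathrm{Min}}$ or all $R \in \mathcal{R}_{\mathrm{Max}}$ are choiceless in $\Gamma_{\mathrm{BR}}$; every boundary region automaton $\Gamma_{\mathrm{BR}}$ is 2-player. Note that if $\chi$ and $\mu$ are positional strategies in $\Gamma_{\mathrm{BR}}$ for players Max and Min, respectively, then $\Gamma_{\mathrm{BR}}{\restriction}\chi$ and $\Gamma_{\mathrm{BR}}{\restriction}\mu$ are 1-player and $(\Gamma_{\mathrm{BR}}{\restriction}\chi){\restriction}\mu$ is 0-player.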
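The definitions above can be illustrated with a minimal Python sketch. It assumes a finite automaton given as a set of move triples `(R, alpha, R')` over region names, and a regionally constant strategy given as a dictionary from regions to moves. The function names, the toy automaton, and the reading of "unique successor" as "exactly one outgoing move" are assumptions made for illustration, not part of the text.

```python
def strategy_subgraph(moves, owned_regions, strategy):
    """Keep every move leaving a region of the other player, and for each
    region in `owned_regions` keep only the move picked by `strategy`."""
    return {
        m for m in moves
        if m[0] not in owned_regions or strategy[m[0]] == m
    }


def is_choiceless(moves, region):
    """Assumption: 'unique successor' is read as exactly one outgoing move."""
    return sum(1 for m in moves if m[0] == region) == 1


def player_count(moves, max_regions, min_regions):
    """Smallest k in {0, 1, 2} such that the graph is k-player."""
    max_choiceless = all(is_choiceless(moves, r) for r in max_regions)
    min_choiceless = all(is_choiceless(moves, r) for r in min_regions)
    if max_choiceless and min_choiceless:
        return 0
    if max_choiceless or min_choiceless:
        return 1
    return 2


# Hypothetical toy automaton: moves are triples (R, alpha, R').
moves = {("A", "a", "B"), ("A", "b", "C"),
         ("B", "a", "C"), ("B", "b", "A"),
         ("C", "a", "C")}
max_regions, min_regions = {"A"}, {"B", "C"}
chi = {"A": ("A", "a", "B")}                        # strategy for Max
mu = {"B": ("B", "a", "C"), "C": ("C", "a", "C")}   # strategy for Min

after_chi = strategy_subgraph(moves, max_regions, chi)
after_both = strategy_subgraph(after_chi, min_regions, mu)
```

In this toy example, fixing `chi` leaves only Min with a choice, and fixing `mu` as well removes all remaining choices, which mirrors the closing remark of the paragraph above.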
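For functions $T : \mathcal{R} \to [S \to \mathbb{R}]$ and $D : \mathcal{R} \to [S \to \mathbb{N}]$, and a state $s \in S$, we define the sets $M^{*}(s, (T, D))$ and $M_{*}(s, (T, D))$ of moves enabled in $s$ which are (lexicographically) $(T, D)$-optimal for player Max and player Min, respectively:
\[
M^{*}(s, (T, D)) = \operatorname{argmax}^{\mathrm{lex}}_{m \in \mathcal{M}} \bigl\{ \bigl(T(R')^{\oplus}_{\alpha}(s),\, D(R')_{\alpha}(s)\bigr) : m = ([s], \alpha, R') \bigr\},
\]
and
\[
M_{*}(s, (T, D)) = \operatorname{argmin}^{\mathrm{lex}}_{m \in \mathcal{M}} \bigl\{ \bigl(T(R')^{\oplus}_{\alpha}(s),\, D(R')_{\alpha}(s)\bigr) : m = ([s], \alpha, R') \bigr\}.
\]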
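Let $\mathrm{Choose} : 2^{\mathcal{M}} \to \mathcal{M}$ be a function such that for every non-empty set of moves $M \subseteq \mathcal{M}$, we have $\mathrm{Choose}(M) \in M$. For regional functions $T : \mathcal{R} \to [S \to \mathbb{R}]$ and $D : \mathcal{R} \to [S \to \mathbb{N}]$, the canonical $(T, D)$-optimal strategies $\chi_{(T,D)}$ and $\mu_{(T,D)}$ for player Max and player Min, respectively, are defined by: $\chi_{(T,D)}(s) = \mathrm{Choose}(M^{*}(s, (T, D)))$, for every $s \in S_{\mathrm{Max}}$; and $\mu_{(T,D)}(s) = \mathrm{Choose}(M_{*}(s, (T, D)))$, for every $s \in S_{\mathrm{Min}}$.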
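As a small illustration of lexicographic optimality and of the role of Choose, the following Python sketch assumes that, for a fixed state, each enabled move has already been evaluated to a pair (time value, distance value). The dictionary layout, the function names, and the tie-breaking rule used for `choose` are hypothetical. Python compares tuples lexicographically, which is what the sketch relies on.

```python
def optimal_moves(candidates, maximise=True):
    """Return the set of moves whose (time, distance) pair is
    lexicographically maximal (or minimal if maximise is False).

    `candidates` maps each enabled move to its precomputed pair.
    """
    pick = max if maximise else min
    best = pick(candidates.values())  # tuples compare lexicographically
    return {move for move, value in candidates.items() if value == best}


def choose(moves):
    """One possible fixed selection rule: the smallest move in tuple order.
    Any rule that always returns an element of the given set would do."""
    return min(moves)


# Hypothetical evaluation of the moves enabled in one state.
candidates = {
    ("R", "a", "R1"): (2.5, 3),
    ("R", "b", "R2"): (2.5, 4),
    ("R", "c", "R3"): (1.0, 9),
    ("R", "d", "R4"): (2.5, 4),
}

best_for_max = optimal_moves(candidates, maximise=True)
best_for_min = optimal_moves(candidates, maximise=False)
canonical_max_choice = choose(best_for_max)
```

Here the time component decides first, and the distance component only breaks ties, so two moves are optimal for Max and the selection rule picks one of them deterministically.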
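5.4.2. Optimality Equations $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{Max}}(\Gamma_{\mathrm{BR}})$, $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{Min}}(\Gamma_{\mathrm{BR}})$, and $\mathrm{OE}^{\mathrm{RT}}(\Gamma_{\mathrm{BR}})$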
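In this subsection we introduce the sets of optimality equations $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{Max}}(\Gamma_{\mathrm{BR}})$, $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{Min}}(\Gamma_{\mathrm{BR}})$, and $\mathrm{OE}^{\mathrm{RT}}(\Gamma_{\mathrm{BR}})$. Intuitively, $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{Max}}(\Gamma_{\mathrm{BR}})$ ($\mathrm{OE}^{\mathrm{RT}}_{\mathrm{Min}}(\Gamma_{\mathrm{BR}})$) characterises the optimal reachability-price when player Min (Max) is choiceless, while $\mathrm{OE}^{\mathrm{RT}}(\Gamma_{\mathrm{BR}})$ characterises the optimal reachability-price when both players are choiceless. We also define the sets $\mathrm{OE}^{\mathrm{RT}}_{\geq}(\Gamma_{\mathrm{BR}})$ and $\mathrm{OE}^{\mathrm{RT}}_{\leq}(\Gamma_{\mathrm{BR}})$, which are useful technical tools for showing strict improvement in every iteration of the strategy improvement algorithm.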
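Let $T : \mathcal{R} \to [S \to \mathbb{R}]$ and $D : \mathcal{R} \to [S \to \mathbb{N}]$. We write $(T, D) \models \mathrm{OE}^{\mathrm{RT}}_{\mathrm{Max}}(\Gamma_{\mathrm{BR}})$ if for all $s \in S$ we have the following:
\[
\bigl(\widetilde{T}(s), \widetilde{D}(s)\bigr) =
\begin{cases}
(0, 0) & \text{if } s \in F, \\
\max^{\mathrm{lex}}_{m \in \mathcal{M}} \bigl\{ \bigl(T(R')^{\oplus}_{\alpha}(s),\, D(R')_{\alpha}(s)\bigr) : m = ([s], \alpha, R') \bigr\} & \text{otherwise.}
\end{cases}
\]
Similarly, we write $(T, D) \models \mathrm{OE}^{\mathrm{RT}}_{\mathrm{Min}}(\Gamma_{\mathrm{BR}})$ if for all $s \in S$ we have the following:
\[
\bigl(\widetilde{T}(s), \widetilde{D}(s)\bigr) =
\begin{cases}
(0, 0) & \text{if } s \in F, \\
\min^{\mathrm{lex}}_{m \in \mathcal{M}} \bigl\{ \bigl(T(R')^{\oplus}_{\alpha}(s),\, D(R')_{\alpha}(s)\bigr) : m = ([s], \alpha, R') \bigr\} & \text{otherwise.}
\end{cases}
\]
If $\Gamma_{\mathrm{BR}}$ is 0-player then $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{Max}}(\Gamma_{\mathrm{BR}})$ and $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{Min}}(\Gamma_{\mathrm{BR}})$ are equivalent to each other and are denoted by $\mathrm{OE}^{\mathrm{RT}}(\Gamma_{\mathrm{BR}})$.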
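We write $(T, D) \models \mathrm{OE}^{\mathrm{RT}}_{\geq}(\Gamma_{\mathrm{BR}})$ if for all $s \in S$, we have:
\[
\bigl(\widetilde{T}(s), \widetilde{D}(s)\bigr) \geq_{\mathrm{lex}}
\begin{cases}
(0, 0) & \text{if } s \in F, \\
\max^{\mathrm{lex}}_{m \in \mathcal{M}} \bigl\{ \bigl(T(R')^{\oplus}_{\alpha}(s),\, D(R')_{\alpha}(s)\bigr) : m = ([s], \alpha, R') \bigr\} & \text{otherwise.}
\end{cases}
\]
Similarly, we write $(T, D) \models \mathrm{OE}^{\mathrm{RT}}_{\leq}(\Gamma_{\mathrm{BR}})$ if for all $s \in S$, we have:
\[
\bigl(\widetilde{T}(s), \widetilde{D}(s)\bigr) \leq_{\mathrm{lex}}
\begin{cases}
(0, 0) & \text{if } s \in F, \\
\min^{\mathrm{lex}}_{m \in \mathcal{M}} \bigl\{ \bigl(T(R')^{\oplus}_{\alpha}(s),\, D(R')_{\alpha}(s)\bigr) : m = ([s], \alpha, R') \bigr\} & \text{otherwise.}
\end{cases}
\]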
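PROPOSITION 5.4.1 (Relaxations of optimality equations). If $(T, D) \models \mathrm{OE}^{\mathrm{RT}}_{\mathrm{Max}}(\Gamma_{\mathrm{BR}})$ then $(T, D) \models \mathrm{OE}^{\mathrm{RT}}_{\geq}(\Gamma_{\mathrm{BR}})$, and if $(T, D) \models \mathrm{OE}^{\mathrm{RT}}_{\mathrm{Min}}(\Gamma_{\mathrm{BR}})$ then $(T, D) \models \mathrm{OE}^{\mathrm{RT}}_{\leq}(\Gamma_{\mathrm{BR}})$.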
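LEMMA 5.4.2 (Solution of $\mathrm{OE}^{\mathrm{RT}}(\Gamma_{\mathrm{BR}})$ is regionally simple). Let $\Gamma_{\mathrm{BR}}$ be a 0-player boundary region automaton. If $(T, D) \models \mathrm{OE}^{\mathrm{RT}}(\Gamma_{\mathrm{BR}})$ then $T$ is regionally simple and $D$ is regionally constant.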
PROOF. In a 0-player boundary region automaton $\Gamma_{\mathrm{BR}}$, for every region $R$ there is at most one outgoing labelled edge $(R, \alpha, R') \in \mathcal{M}$, and hence for every region $R$ there is a unique $\mathcal{M}$-path from $R$ in $\Gamma_{\mathrm{BR}}$. For every region $R \in \mathcal{R}$, we define the distance $d(R) \in \mathbb{N}$ to be the smallest number of edges on the unique $\mathcal{M}$-path from $R$ that one needs to traverse to reach a final region. It is easy to show that for every state $s \in S$, we have $D([s])(s) = d([s])$, and hence $D$ is regionally constant.
We prove that for every region $R \in \mathcal{R}$, the function $T(R) : R \to \mathbb{R}$ is simple, by induction on $d(R)$. If $d(R) = 0$ then $T(R)(s) = 0$ for all $s \in R$, and hence $T(R)$ is simple on $R$.
Let $d(R) = n + 1$ and let $(R, \alpha, R') \in \mathcal{M}$ be the unique edge going out of $R$ in $\Gamma_{\mathrm{BR}}$. Observe that $T(R) = T(R')^{\oplus}_{\alpha}$ because for every $s \in R$, we have $T(R)(s) = T([s])(s) = T(R')^{\oplus}_{\alpha}(s)$, where the second equality follows from $(T, D) \models \mathrm{OE}^{\mathrm{RT}}(\Gamma_{\mathrm{BR}})$. Moreover, by the induction hypothesis the function $T(R') : R' \to \mathbb{R}$ is simple, and hence by Proposition 5.3.1 we get that $T(R')^{\oplus}_{\alpha} = T(R)$ is simple.
If $d(R) = \infty$, i.e., if the unique $\mathcal{M}$-path from $R$ in $\Gamma_{\mathrm{BR}}$ never reaches a final region, then we set $T(R)(s) = \infty$ for all $s \in R$. Therefore $T(R) : R \to \mathbb{R}$ is a constant function, and hence it is simple.
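The distance $d(R)$ used in the proof can be computed by following the unique path from a region. The Python sketch below assumes a 0-player graph given as a hypothetical dictionary `succ` from each region to its unique successor, and a set `final` of final regions; neither name comes from the text. It returns infinity when the path enters a cycle or stops without reaching a final region, which corresponds to the case $d(R) = \infty$.

```python
import math


def distance_to_final(succ, final, region):
    """Number of edges on the unique path from `region` to the first
    final region, or math.inf if no final region is ever reached."""
    seen = set()
    steps = 0
    while region not in final:
        # A repeated region means the unique path loops forever;
        # a missing successor means the path stops short of `final`.
        if region in seen or region not in succ:
            return math.inf
        seen.add(region)
        region = succ[region]
        steps += 1
    return steps


# Hypothetical 0-player graph: A -> B -> F (final), and C <-> D loop.
succ = {"A": "B", "B": "F", "C": "D", "D": "C"}
final = {"F"}
distances = {r: distance_to_final(succ, final, r) for r in "ABCDF"}
```

The induction in the proof proceeds along exactly these finite distances: regions at distance 0 are final, and a region at distance $n + 1$ has its unique successor at distance $n$.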
5.4.3. Solving 1-Player Reachability-Time Optimality Equations $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{Max}}(\Gamma_{\mathrm{BR}})$
In this section we give a strategy improvement algorithm for solving the maximum reachability-time optimality equations $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{Max}}(\Gamma_{\mathrm{BR}})$ for a 1-player boundary region automaton $\Gamma_{\mathrm{BR}}$.
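We define the following strategy improvement operator $\mathrm{Improve}_{\mathrm{Max}}$:
\[
\mathrm{Improve}_{\mathrm{Max}}(\chi, (T, D))(s) =
\begin{cases}
\chi(s) & \text{if } \chi(s) \in M^{*}(s, (T, D)), \\
\mathrm{Choose}(M^{*}(s, (T, D))) & \text{if } \chi(s) \notin M^{*}(s, (T, D)).
\end{cases}
\]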
Note that $\mathrm{Improve}_{\mathrm{Max}}(\chi, (T, D))(s)$ may differ from the canonical $(T, D)$-optimal choice $\chi_{(T,D)}(s)$ only if $\chi(s)$ is itself $(T, D)$-optimal in state $s$, i.e., if $\chi(s) \in M^{*}(s, (T, D))$.
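The behaviour described in the note can be sketched in Python. The sketch assumes that the sets of $(T, D)$-optimal moves have already been computed per state, and that a non-optimal choice is replaced by the canonical choice obtained from a fixed selection rule; this second case is an assumption inferred from the note above. The names `improve_max`, `optimal`, and `choose` are hypothetical.

```python
def improve_max(chi, optimal, choose=min):
    """One improvement step for player Max.

    chi:     dict mapping each state to the move currently chosen there
    optimal: dict mapping each state to its non-empty set of optimal moves
    choose:  fixed rule selecting one element of a non-empty set
    """
    return {
        state: move if move in optimal[state] else choose(optimal[state])
        for state, move in chi.items()
    }


# Hypothetical data: moves are abbreviated to single letters.
chi = {"s1": "c", "s2": "a"}
optimal = {"s1": {"b", "c"}, "s2": {"b", "c"}}

improved = improve_max(chi, optimal)
canonical = {state: min(moves) for state, moves in optimal.items()}
```

In state `s1` the current choice is already optimal and is kept even though the canonical choice would be a different move, whereas in state `s2` the current choice is not optimal and is switched to the canonical one. Keeping an already optimal choice means the strategy only changes where the change is needed.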
LEMMA 5.4.3 (Improvement preserves regional constancy of strategies). If $\chi \in \Delta_{\mathrm{Max}}$ is regionally constant, $T : \mathcal{R} \to [S \to \mathbb{R}]$ is regionally simple, and $D : \mathcal{R} \to [S \to \mathbb{N}]$ is regionally constant, then $\mathrm{Improve}_{\mathrm{Max}}(\chi, (T, D))$ is regionally constant.
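PROOF. We need to prove that for $s, s' \in S$, if $[s] = [s']$ then $\chi'(s) = \chi'(s')$, where $\chi' = \mathrm{Improve}_{\mathrm{Max}}(\chi, (T, D))$. By regional constancy of $\chi$ it is sufficient to prove that $M^{*}(s, (T, D)) = M^{*}(s', (T, D))$. By regional simplicity of $T$, and by Proposition 5.3.1, the functions $T(R)^{\oplus}_{\alpha} : [s] \to \mathbb{R}$, for all $m = ([s], \alpha, R) \in \mathcal{M}$, are simple. Then we have
\begin{align*}
M^{*}(s, (T, D)) &= \operatorname{argmax}^{\mathrm{lex}}_{m \in \mathcal{M}} \bigl\{ \bigl(T(R)^{\oplus}_{\alpha}(s),\, D(R)_{\alpha}(s)\bigr) : m = ([s], \alpha, R) \bigr\} \\
&= \operatorname{argmax}^{\mathrm{lex}}_{m \in \mathcal{M}} \bigl\{ \bigl(T(R)^{\oplus}_{\alpha}(s'),\, D(R)_{\alpha}(s')\bigr) : m = ([s'], \alpha, R) \bigr\} \\
&= M^{*}(s', (T, D)),
\end{align*}
where the second equality follows from $[s] = [s']$, from regional constancy of $D$, and from the application of Lemma 5.2.2 to the (finite) set of functions $\{ T(R)^{\oplus}_{\alpha} : ([s], \alpha, R) \in \mathcal{M} \}$.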
Input: Boundary region automaton $\Gamma_{\mathrm{BR}}$
Output: A solution of $\mathrm{OE}^{\mathrm{RT}}_{\mathrm{Max}}(\Gamma_{\mathrm{BR}})$
begin 1
(Initialisation). Choose a regionally constant positional strategy $\chi_0 \in \Delta_{\mathrm{Max}}$ for
2