Results on the synthetic data sets are presented in Figures 3.10–3.13, and results on the real-world data sets are shown in Figures 3.16–3.17. The dashed line indicates the result for model M1, black solid circles represent results for model M2, and other symbols correspond to results for model M3; the common key of all plots is shown in Fig. 3.9. In total, the results represent 32805 learning runs. Error bars are omitted in favor of readability. Instead, we provide the empirical worst-case variances in Table 3.4. While the variances of the mean squared error (MSE) and of the runtime per training iteration are small, the variance of the ratio of non-zero parameters (NNZ-ratio, i.e., the parameter density) stands out. This is not surprising, since the achievable sparsity depends on the actual parameters, and we compute the variances over random model parameters. The MSE and the runtime, in contrast, do not depend on the actual model parameters and hence have much lower variances.

We explain and discuss the results with respect to the questions stated at the beginning of this section.

Q1 Can we observe an increase in sparsity?

Converting redundancy into sparsity is the main motivation for our regularized reparametrization approach. First, we investigate the response of M1–M3 to an increase in the regularization weight λ. Corresponding results can be found in the NNZ-ratio plots of Figures 3.10–3.17. Indeed, increasing λ leads to a lower proportion of non-zero parameters, and thus to an increased sparsity. In all cases, M2 and M3 models exhibit a lower NNZ-ratio than the unregularized baseline MRF M1. Moreover, the sparsity of regularized models increases at the same rate.
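The NNZ-ratio discussed here can be illustrated with a minimal sketch. The function name `nnz_ratio` and the tolerance argument `tol` are assumptions made for this example; the text does not state whether a threshold is used when counting non-zero parameters.

```python
def nnz_ratio(params, tol=0.0):
    """Fraction of parameters whose magnitude exceeds tol.

    tol is a hypothetical threshold; tol=0.0 counts exact non-zeros only.
    """
    if not params:
        raise ValueError("params must be non-empty")
    nonzero = sum(1 for p in params if abs(p) > tol)
    return nonzero / len(params)


# A dense parameter vector versus one where regularization zeroed two entries.
dense = [0.5, -1.2, 0.3, 0.8]
sparse = [0.5, 0.0, 0.0, 0.8]

print(nnz_ratio(dense))   # every entry is non-zero
print(nnz_ratio(sparse))  # half of the entries are non-zero
```

A lower ratio means a sparser model, so the trend reported for increasing λ corresponds to this value decreasing.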

Inverse decay types take more information from previous time steps into account. One could hence expect them to yield an increased sparsity compared to their regular counterparts. But a closer look at the different decay types reveals that there is no strict order between inverse and non-inverse decay types. All sparsity levels are close to each other, with the regular exponential decay leading to the highest sparsity, followed by the regular rational decay. In most cases, the M2 models deliver a lower sparsity than our proposed reparametrized models. This behavior is stable over all graphical structures that we considered in our experiments.

Similar observations can be made when we consider sparsity as a function of the model’s redundancy (Fig. 3.11). The rate at which the sparsity of M2 and M3 models increases is now different. M2 models are almost insensitive to an increased redundancy. At very high redundancy, their NNZ-ratio drops at least for star and full structures, and stays at the same level on chain and grid structures. In contrast, the average sparsity of M3 models increases with increasing redundancy and exhibits a large increase in regimes of high redundancy on all graphical structures.

Lastly, we investigate how sparsity behaves as a function of the model’s depth T (Fig. 3.12). For a low number of time steps (T < 64), the sparsity of models with regular linear, quadratic, and cubic decay is lower than that of the M2 model and of models with inverse decay. However, this changes when the depth is increased. In all experiments, the sparsity of M2 models and of M3 models with regular decay diverged with increasing depth. This is strong evidence for the intuition that deep spatio-temporal models of type M2 become more complex with increasing depth, while M3 models tend to become less complex (easier) when reparametrizations with regular decay types are used.

Altogether, we can conclude that an increase in sparsity can be observed for both M2 and M3 models. Regular rational and exponential decays deliver the highest sparsity. But being sparse alone is not enough: sparse models should still provide a reasonable quality.

Q2 Do compressed models still provide a reasonable quality?

We now investigate the quality of sparse models. To this end, we compare the mean squared error $\lVert \tilde{\mu} - \hat{\mu} \rVert_2^2 / d$ between the empirical marginals $\tilde{\mu}$ and the model’s estimate $\hat{\mu}$ for M1, M2 and M3 models, estimated via 1000 iterations of loopy belief propagation. Results for M2 and M3 models can be found in the MSE plots of Figures 3.10–3.17. Treating the MSE as a function of λ (Figs. 3.10, 3.15–3.17), it becomes evident that sparse M2 models exhibit a considerably larger MSE than almost all M3 models. The only exceptions are models with regular exponential decay. This confirms the intuition that we used to design the reparametrization: strong regularization leads to very sparse M2 models but destroys the underlying conditional independence structure, while sparse M3 models keep this structure intact. The plot also reveals that the decay types which provided the highest sparsity, namely rational and exponential decay, also have the largest MSE. However, the MSE of regular rational models is still below the MSE of models with plain $l_1$-regularization. It is also interesting that the uncompressed M1 models provide a very low MSE of ≈ $10^{-10}$ on synthetic data (the dashed line is so close to 0 that it cannot be seen in the plots), but on all real-world data sets, their error has the same order of magnitude as that of M2 and M3 models. This could indicate that the parameters of the model that generated the real-world data follow a substantially different distribution than the one that we used to generate the synthetic data.
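As a minimal sketch of the quality measure used above, the squared Euclidean distance between the two marginal vectors is divided by their dimension d. The function name `marginal_mse` and the toy vectors are assumptions made for illustration; they are not values from the experiments.

```python
def marginal_mse(empirical, estimated):
    """Squared Euclidean distance between two marginal vectors, divided by d."""
    if len(empirical) != len(estimated):
        raise ValueError("both vectors must have the same dimension d")
    if not empirical:
        raise ValueError("vectors must be non-empty")
    d = len(empirical)
    return sum((a - b) ** 2 for a, b in zip(empirical, estimated)) / d


# Toy marginals (hypothetical): a perfect estimate and a poor one.
mu_tilde = [1.0, 0.0]
print(marginal_mse(mu_tilde, [1.0, 0.0]))  # identical vectors give zero error
print(marginal_mse(mu_tilde, [0.0, 0.0]))  # one entry off by 1, averaged over d = 2
```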

Regarding the redundancy that we injected into our synthetic data, we can see from Fig. 3.11 that the MSE is almost constant with respect to the redundancy. If anything, a slight decreasing trend can be observed for all model types. The same applies to the MSE as a function of the model’s depth (Fig. 3.12). Again, models with exponential decay and rational decay show a larger error than the other decay types.

All in all, M2 models trade sparsity against quality, while our proposed M3 models, especially those with rational decay, are capable of being sparse and achieving a reasonably small error at the same time.

Q3 How does the spatio-temporal reparametrization influence the computational complexity?

Since sparsity and quality rarely come for free, we expect some kind of tradeoff to exist. We hence investigate the empirical runtime of the models. More precisely, we consider the average runtime (in milliseconds) per training iteration as shown in Figures 3.13–3.17. Results on MSE and NNZ-ratio are very close for different decay types. This changes when we consider the runtime. The experimental results on the synthetic data (Figures 3.13 and 3.14) reveal an ordering of decay and model types: M2 models have the shortest runtime per iteration, followed by the M1 model, followed by M3 models with regular decay, followed by M3 models with inverse decay. In some cases, the runtime difference between sparse M3 models with rational or exponential decay and M1 is below one millisecond.
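The runtime measure above, average milliseconds per training iteration, could be obtained along the following lines. This is only a sketch under stated assumptions: the helper `mean_ms_per_iteration` and the placeholder `step` callable are hypothetical, and the text does not describe how the reported timings were actually collected.

```python
import time


def mean_ms_per_iteration(step, iterations):
    """Run step() the given number of times; return mean wall-clock ms per call."""
    if iterations <= 0:
        raise ValueError("iterations must be positive")
    start = time.perf_counter()
    for _ in range(iterations):
        step()
    elapsed_seconds = time.perf_counter() - start
    return 1000.0 * elapsed_seconds / iterations


def dummy_training_step():
    # Hypothetical stand-in for one training iteration of a model.
    sum(i * i for i in range(1000))


print(mean_ms_per_iteration(dummy_training_step, 100))
```

Averaging over many iterations smooths out timer resolution and scheduling noise, which matters when differences are below one millisecond.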

This result is intuitive if one recalls what kind of computation has to be performed for each model. The asymptotic complexity of M1 and M2 models is equivalent, and since arithmetic that involves many zeros can be carried out faster, sparse M2 models are faster than their dense M1 counterparts. On the other hand, M3 models have to “decode” the natural exponential family parameters at runtime and hence suffer from a computational overhead. Regular decay types assign coefficients ≈ 0 to parameters that are far from the parameter at time t. Hence, decoding their natural parameters benefits from the same effect that makes M2 models faster than M1 models. Inverse decay types assign non-zero weights to almost all preceding parameters and are hence inherently slower than reparametrizations with regular decay. This behavior can be observed in all experiments on synthetic data, regardless of whether we treat the runtime as a function of the regularization weight, of the redundancy, or of the model’s depth.
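The decoding argument can be made concrete with a small sketch. It assumes that the natural parameter at time t is a weighted sum of the parameters at preceding time steps, with weights given by a decay function of the temporal distance. The two weight functions below, `fast_decay` and `flat_weight`, are illustrative stand-ins and not the decay types defined in this work; `tol` is a hypothetical cut-off for negligible coefficients.

```python
import math


def decode(deltas, t, decay, tol=1e-6):
    """Return (theta_t, terms_used) for theta_t = sum_{i<=t} decay(t - i) * deltas[i].

    Coefficients with magnitude <= tol are skipped, mimicking the savings
    obtained when far-away parameters receive weights close to zero.
    """
    theta = 0.0
    terms_used = 0
    for i in range(t + 1):
        weight = decay(t - i)
        if abs(weight) <= tol:
            continue  # negligible coefficient: no arithmetic needed
        theta += weight * deltas[i]
        terms_used += 1
    return theta, terms_used


def fast_decay(k):
    # Illustrative quickly vanishing weight (assumption, not the thesis definition).
    return math.exp(-k)


def flat_weight(k):
    # Illustrative weight that stays non-zero for all predecessors (assumption).
    return 1.0


deltas = [1.0] * 64  # hypothetical parameters for T = 64 time steps
_, used_fast = decode(deltas, 63, fast_decay)
_, used_flat = decode(deltas, 63, flat_weight)
print(used_fast, used_flat)  # far fewer terms are needed with the fast decay
```

Under these assumptions, the work per decoded parameter grows with the number of non-negligible coefficients, which is consistent with the runtime ordering described above.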

On the real-world data, however, the order is not always preserved; e.g., models with regular decay can sometimes be slower than models with inverse decay. This can be explained by the rather shallow depth of our real-world models (T = 12), since a higher depth implies a larger performance penalty for inverse decay types.

More surprisingly, in some cases M3 models are actually faster than M1 and even M2 models on the real-world data. This suggests that the parameters learned by M3 models may ease the computation of loopy belief propagation in some as yet undiscovered way, at least on our real-world data. However, we cannot rule out the possibility that this behavior is triggered by imperfections in the real-world data which are not present in the synthetic data.

To sum up, the theoretical runtime penalty of M3 models can be observed in our experiments. In many cases, this penalty amounts to a few milliseconds, and in some other cases, there is no penalty at all. Our reparametrized models are thus a practical alternative to classic undirected models.
