The analysis and the numerical results presented in this thesis refer to a single dataset that consists of several different groups of problems. This dataset is made up of 341 instances, divided into 180 instances for the training set and 161 for the testing set. These instances have been collected from the following datasets: miplib2010 [47], fc [48], lotSizing [49], mik [50], nexp [51], and region [52]; pmedcapv, airland, genAssignment, scp, and SSCFLP were originally downloaded from [53]. Each of the selected instances can be solved by at least one of the available pure heuristics within the time-out of 1,800 seconds. Furthermore, the dataset does not include instances that can be solved too easily; therefore:
• no instance is solved completely in the preprocessing phase;
• the solving time of each instance is greater than 1 s for at least one of the available heuristics.
            1   2   3   4   5   6   7   8   9  10  11  12
Cluster 1  20   -   -  25   -  25   -  30   -   -   -   -
Cluster 2   -  45   -   -   -  14  41   -   -   -   -   -
Cluster 3   -   -   -   -   1   5   -   -  18  62  14   -
Cluster 4   -   -  49   -   -   7   -   -   -  44
Cluster 5   -   -   -   -  98   2   -   -   -   -
Table 5.1: Distribution (percentage) of the instances across the clusters, given the clusterisation obtained using g-means on the whole instances dataset. The problem types are: 1: airland, 2: fc, 3: GenAssignment, 4: LotSizing, 5: mik, 6: miplib2010, 7: nexp, 8: pmedcap, 9: region100, 10: region200, 11: scp, 12: SSCFLP.
Figure 5.1 shows a two-dimensional projection of the training set. The latter is clustered using g-means, and the resulting distribution of instances per cluster is shown in Table 5.1. Each row is normalised to sum to 100%. For example, 20% of the instances in Cluster 1 come from the airland dataset. A first observation from this table is that there are not enough clusters to perfectly separate the different datasets into unique groups. However, this is not a problem, as it is not the desired result. Instead, the focus is on capturing similarities between instances, not on splitting benchmarks. Looking at Table 5.1, the region100 and region200 instances are grouped together. Furthermore, Cluster 4 logically groups the LotSizing and the SSCFLP instances together. Finally, the miplib2010 instances, which are supposed to give an overview of all problem types, are spread across all clusters. This clustering therefore demonstrates both that the dataset contains a diverse set of instances and that the employed features are representative enough to automatically detect interesting groupings.
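The row normalisation behind Table 5.1 can be illustrated with a short sketch. It is not the code used in this thesis: the function name `cluster_composition` and the toy assignments are hypothetical, and the values are left as unrounded percentages.

```python
from collections import Counter, defaultdict


def cluster_composition(assignments):
    """Return {cluster: {problem_type: percentage}}, one row per cluster.

    `assignments` is an iterable of (cluster_id, problem_type) pairs.
    Each row is normalised so that its percentages add up to 100.
    """
    counts = defaultdict(Counter)
    for cluster_id, problem_type in assignments:
        counts[cluster_id][problem_type] += 1

    table = {}
    for cluster_id, row in counts.items():
        total = sum(row.values())
        table[cluster_id] = {ptype: 100.0 * n / total for ptype, n in row.items()}
    return table


# Hypothetical toy data: cluster 1 holds 1 airland and 3 LotSizing instances,
# cluster 2 holds 2 fc instances.
toy = [(1, "airland")] + [(1, "LotSizing")] * 3 + [(2, "fc")] * 2
composition = cluster_composition(toy)
```

Because the normalisation is per row, a percentage describes the make-up of a cluster, not the share of a problem type that falls into it.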
CHAPTER 5. EXPERIMENTAL SETUP
(a) Dataset: extDatasetC (b) Dataset: extDatasetS (c) Dataset: extDatasetSr
Figure 5.2: Clustering obtained using g-means on, respectively, extDatasetC, extDatasetS, and extDatasetSr. These datasets contain all the training instances and a collection of their subinstances, obtained using CPLEX or SCIP, as described in Section 5.4. The figures also show the cluster centres and the cluster identification numbers.
Using the procedure described in Section 4.3, given a set of branching heuristics and the instances dataset, it is possible to collect a sample of subinstances and to build an extended dataset. In particular, this thesis employs three extended datasets:
• extDatasetC : this extended dataset consists of all the original training instances, together with a subsample of subinstances collected using CPLEX and all the 8 branching heuristics (6 pure heuristics and 2 random switched) implemented in the branch callback, for a total of about 16,000 feature vectors.
• extDatasetS : it contains the original training instances and a subsample of subinstances collected using SCIP and all the 8 available branching rules, for a total of about 45,000 feature vectors.
• extDatasetSr : similar to the previous one, with the only difference that just 5 heuristics were used for collecting the subinstances. In particular, the heuristics removed are the 2 that performed best on the training set, using the geometric mean as performance measure: reliability branching (RP) and pseudo cost branching (P). Moreover, the random variable branching rule has also been removed. The resulting dataset contains about 30,000 feature vectors.
Finally, the three datasets have been reduced to 10,000 feature vectors each, because of memory limitations of the chosen implementation of g-means. The subsampling is essentially a random sampling operation with one constraint: for each instance, the feature vectors at depth 0 and 1 are kept and, if possible, at least one other element is kept for each branching rule applied.
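A constrained subsampling of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the procedure implemented for the thesis: the record keys `instance`, `depth`, and `rule`, the function name `subsample`, and the order in which the constraints are applied are all hypothetical.

```python
import random


def subsample(vectors, budget, seed=0):
    """Reduce `vectors` to roughly `budget` records.

    Each record is a dict with the (hypothetical) keys 'instance', 'depth'
    and 'rule'. Records at depth 0 and 1 are always kept, even if they
    alone exceed the budget.
    """
    rng = random.Random(seed)

    # 1. Always keep the feature vectors at depth 0 and 1.
    kept = [i for i, v in enumerate(vectors) if v["depth"] <= 1]
    kept_set = set(kept)

    # 2. Group the deeper vectors by (instance, branching rule) and keep
    #    one per group while the budget allows it ("if possible").
    groups = {}
    for i, v in enumerate(vectors):
        if i not in kept_set:
            groups.setdefault((v["instance"], v["rule"]), []).append(i)
    keys = list(groups)
    rng.shuffle(keys)
    for key in keys:
        if len(kept) >= budget:
            break
        choice = rng.choice(groups[key])
        kept.append(choice)
        kept_set.add(choice)

    # 3. Fill the remaining budget uniformly at random.
    rest = [i for i in range(len(vectors)) if i not in kept_set]
    room = max(0, budget - len(kept))
    kept.extend(rng.sample(rest, min(room, len(rest))))

    return [vectors[i] for i in sorted(kept)]
```

Fixing the seed makes the reduced dataset reproducible, which matters here because small changes in the sample can change the clusters found by g-means.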
The reason for having distinct extended datasets is that the two DASH implementations for CPLEX and SCIP use different sets of heuristics. Therefore, the behaviour of an instance during the solving process is probably different. In particular, the distribution of subinstances could change in such a way as to result in a different clusterisation. As shown in Figure 5.2, the SCIP and the CPLEX versions have completely different clusterisations. In contrast, the two groupings obtained for the SCIP versions are very similar. However, small differences in the clustering could cause significant variations in the overall performance. In fact, just by turning off 3 heuristics, 2 small clusters are no longer identified and the overall structure changes slightly, which could lead to a totally different path in the solving tree.
Next, the three obtained clusterisations are studied in order to obtain information about the feature space. In particular, the features are ranked using the information gain ratio [54, 55], obtaining a possible feature importance index. From this analysis, the most important variables seem to be the statistics on the number of constraints and on the number of variables in each constraint (in Section 4.1 these are indicated as features 10..13, 26..29, 30..33). In contrast, features specific to continuous, integer, or binary variables seem to be less important. An interesting result concerns the feature depth, which turns out to be one of the least important. Assuming that the information gain ratio is a good choice for ranking the features in this case, this result means that the depth of a subproblem is not important: its similarity with the other instances can be measured using only the other features, which are related to the subproblem structure. In fact, it is reasonable to think that two distinct solving processes on the same original problem could sometimes arrive at the same subproblem pr, but at different depths. The conclusion is that pr should be solved in the same way in both cases, regardless of the difference in depth.
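As a reminder of how such a ranking works, the sketch below computes the information gain ratio of a discretised feature with respect to the cluster labels: the information gain H(C) - H(C | F) divided by the split information H(F). It is a minimal illustration, not the tool used for the analysis above; the function names, the toy data, and the assumption that each feature has already been discretised into bins are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(feature_bins, clusters):
    """Information gain ratio of a discretised feature w.r.t. cluster labels."""
    n = len(clusters)
    by_bin = {}
    for b, c in zip(feature_bins, clusters):
        by_bin.setdefault(b, []).append(c)
    # Conditional entropy H(C | F): weighted entropy of the clusters per bin.
    conditional = sum(len(g) / n * entropy(g) for g in by_bin.values())
    info_gain = entropy(clusters) - conditional
    split_info = entropy(feature_bins)  # H(F)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: four subinstances assigned to two clusters.
clusters = [0, 0, 1, 1]
features = {
    "structure_stat": ["lo", "lo", "hi", "hi"],  # separates the clusters
    "depth": [0, 1, 0, 1],                       # unrelated to the clusters
}
ranking = sorted(features, key=lambda f: gain_ratio(features[f], clusters), reverse=True)
```

In this toy case the feature that separates the two clusters is ranked first, while the depth-like feature, whose values are spread evenly over both clusters, obtains a gain ratio of zero.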
5.5 Chapter Summary
This chapter presented the implementation-specific details of DASH and introduced the technology and measurements employed for the experiments. The analysis has been carried out on a heterogeneous dataset. Because the results on this dataset differ in magnitude and because the solvers are subject to a time-out, each of the measurements chosen for the comparisons focuses on a distinct aspect of the analysis. Finally, this chapter also described how the heuristics employed in the experiments are implemented, and it defined the distinct portfolios used.