Experiments conducted in the real world can never be perfect. As in any other empirical study, our results face threats to validity. In the following, we discuss these threats: first with respect to variability models in closed platforms, and second with respect to our analysis and comparison of open platforms and their ecosystems.

7.3.1. Software Product Lines

Threats to external validity. The main threat to the external validity of our findings is that they are based on only two languages and a limited set of models. On the other hand, most of the models come from large, independently developed real-world projects with different objectives, ranging from Linux as a general-purpose kernel, through configurable system software tools, to eCos as an entire specialized real-time operating system for embedded devices. We believe that other related domains, especially embedded real-time domains such as automotive and avionic control software, will share many characteristics with the studied systems. Further, a comparison with other feature modeling languages shows that both languages are representative of the space of feature modeling.

Furthermore, we only look at the available artifacts: the languages, manuals, models, and mailing lists. We have not interviewed developers and users, although we are currently conducting such interviews (see Section 7.4). In this dissertation, however, our confidence is based on formalizing the language concepts and on exhaustively testing the configurators and build systems with hand-crafted examples.

For Linux and eCos, we only examined one architecture each; however, both architectures represent large and mature portions of the systems: Linux's x86 architecture covers 61% of the total of 10415 features and 67% of the total of 8M SLOC; eCos's i386PC covers 44% of the total of 2859 features and 33% of the total of 0.9M SLOC.

Threats to internal validity. A threat to the internal validity is that our statistics are incorrect. To reduce this risk, we instrumented the native tools to export models in our own format rather than building our own parsers. We thoroughly tested our analysis infrastructure using synthetic test cases and cross-checked overlapping statistics. We tested our formal semantics specification against the native configurators and cross-reviewed the specifications. We used the Boolean abstraction of the semantics to translate both models into Boolean formulas and ran a SAT solver on them to find dead (always inactive) features. We found 114 dead features in Linux and 28 in eCos. We manually confirmed that all of them are indeed dead, either because they depend on features from another architecture or because they were intentionally deactivated. Most of the other models have no dead features (axTLS, BusyBox, Fiasco, uClinux-dist) or just a few (four features in Freetz, Toybox, and uClinux-base). Only Buildroot (54 features), CoreBoot (58 features), and EmbToolkit (53 features) have proportionally many dead features.
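The dead-feature check described above can be illustrated with a minimal sketch. It is not the analysis infrastructure used in this work: it replaces the SAT solver with brute-force enumeration, which is only feasible for toy models, and the feature names and constraints are hypothetical. A feature counts as dead if it is inactive in every configuration that satisfies all constraints.

```python
from itertools import product

def dead_features(features, constraints):
    """Return the features that are inactive in every valid configuration.

    Brute-force stand-in for a SAT query: a feature f is dead exactly when
    (constraints AND f) has no satisfying assignment.
    """
    alive = set()
    for values in product([False, True], repeat=len(features)):
        config = dict(zip(features, values))
        if all(constraint(config) for constraint in constraints):
            # Every feature enabled in a valid configuration is not dead.
            alive.update(name for name, on in config.items() if on)
    return [name for name in features if name not in alive]

# Hypothetical toy model: one architecture is fixed, as in a per-architecture analysis.
features = ["X86", "ARM", "ARM_ERRATA", "USB"]
constraints = [
    lambda c: c["X86"],                            # architecture fixed to x86
    lambda c: not (c["X86"] and c["ARM"]),         # architectures are mutually exclusive
    lambda c: (not c["ARM_ERRATA"]) or c["ARM"],   # ARM_ERRATA depends on ARM
]

print(dead_features(features, constraints))  # ['ARM', 'ARM_ERRATA']
```

In this toy model, `ARM_ERRATA` is dead because it depends on a feature from another architecture, which mirrors one of the two causes confirmed manually above.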

Finally, since we have not performed interviews with the language designers, we might have misunderstood the original intention of certain language concepts and of actual features in the models. For example, the feature themes were determined by manual model analysis, and the corresponding author could be biased when classifying features according to a theme. On the other hand, these themes are based on a discussion and consensus among our co-authors from [BSL+12].

7.3.2. Software Ecosystems

Threats to external validity. We have purposely selected a wide range of open platforms for comparison with the closed platforms, to increase the generality of our conclusions. One may question their comparability, as they exhibit diverse technologies, abstraction levels, and granularities of units. It is also not given that the studied subjects are representative of open platforms in general. We mitigate this threat by using an exploratory research method: instead of testing hypotheses, we record observed phenomena and generate hypotheses. Further, we limit data sources to reliable documents, freely available source code, and tools. Confronting our results with other data, such as developer interviews, would be a valuable future project.

Specifically, the dependencies seem difficult to compare between the ecosystems with variability models and those without, since the relevance of declared dependencies might differ among our subjects. For example, Android apps are rather self-contained and bundled with libraries, whereas Debian and Eclipse invest a significant effort into reducing code duplication by providing common library packages as units and making dependencies explicit. Still, all these numbers indicate scalability requirements for tools, such as configurators and installers, and in that sense (algorithmic hardness) are useful on their own and, to a large extent, comparable.

Threats to internal validity. In the quantitative ecosystems analysis, some numbers are estimated using interpolations and safe assumptions (lower bounds) and may be inaccurate. We address this threat by giving detailed information on our data sources, providing additional diagrams (Appendix B.5) and implementation details on the Android analysis (Appendix B.4).

The analysis of dependencies in Debian and Eclipse disregards dependencies on particular unit versions, which may impact accuracy. We believe this simplification is acceptable, as such dependencies are mainly used to assist system upgrades, which are outside the scope of our work. All ecosystems except Android declare dependencies. It is not clear whether the dependencies we extracted for Android via static analysis are comparable to declared dependencies; in fact, whether actual and declared dependencies are generally comparable is a subject of ongoing research. Therefore, we avoid comparing dependency numbers for Android to other systems.
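To make the version simplification concrete, the following sketch shows one way to reduce a Debian-style `Depends` field to unversioned package names. It is an illustration under stated assumptions, not the extraction tooling used in this work: the function name `unversioned_dependencies` and the sample field value are hypothetical, and only the standard field syntax (comma-separated clauses, `|` alternatives, parenthesized version restrictions, bracketed architecture restrictions, and `:` multiarch qualifiers) is assumed.

```python
import re

def unversioned_dependencies(depends_field):
    """Parse a Debian-style Depends value into a list of alternative groups,
    keeping only package names and ignoring version restrictions."""
    groups = []
    for clause in depends_field.split(","):
        alternatives = []
        for alternative in clause.split("|"):
            # Drop "(>= 1.2)" version restrictions and "[amd64]" architecture lists.
            name = re.sub(r"\(.*?\)|\[.*?\]", "", alternative).strip()
            # Drop a multiarch qualifier such as ":any".
            name = name.split(":")[0]
            if name:
                alternatives.append(name)
        if alternatives:
            groups.append(alternatives)
    return groups

# Hypothetical field value for illustration.
field = "libc6 (>= 2.14), debconf (>= 0.5) | debconf-2.0, python3:any"
print(unversioned_dependencies(field))
# [['libc6'], ['debconf', 'debconf-2.0'], ['python3']]
```

Under this simplification, two dependencies on the same package that differ only in their version restriction collapse into one, which is the loss of accuracy acknowledged above.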

Finally, since the platforms show significant differences both in scope and number of developers, one might question their comparability to each other. For instance, Debian, with over 1000 developers, is in a better position to implement cross-cutting changes to its repository than eCos, which is driven by a handful of volunteers. Investigation of how the employed processes affect the collected data is left for further research.
