The artefacts handled by the framework are originally created in a variety of tools. Since certain artefact types are not readily available from open source repositories, sample artefacts are needed to provide a wide variety of representations to demonstrate framework implementation, and subsequently its evaluation. For this reason, a sample natural language requirements specification and multiple UML class diagrams were created.

Requirements specifications can be written using a number of word processing applications and requirements management tools [170]. Since the main criterion for selecting a tool is the ease of access it provides to its artefact data, and no other constraints are present, open source word processing solutions are considered for the purposes of the framework. The selected tool for this work is OpenOffice Writer⁶, since a number of APIs can be used to access and manipulate documents created in .odt format. In particular, the Apache ODF Toolkit⁷ returns the text contained within the document as a single string, from which the required elements can be selected and manipulated. Other sample artefacts used in the framework are extracted from open source repositories, where their file format is already given. A wide variety of UML diagramming tools are available [171]. For the purposes of the framework, a suitable UML tool proved to be Dia⁸, which supports exporting to various formats, including XML-based ones. Java source code can be written using a wide range of tools, from simple text editors to IDEs, including NetBeans⁹, IntelliJ IDEA¹⁰ and Eclipse¹¹. IDEs can also be used to create JUnit test cases.

6.4.1.2 Extraction

The data extraction process involves exporting data from the selected tools either manually, as is the case with Dia and UML class diagrams, or programmatically. Artefact data related to requirements specifications can be obtained from Java using the aforementioned Apache ODF Toolkit API. More options exist for source code extraction, and various solutions were considered, including JavaML¹², an XML vocabulary for representing Java source code, and BeautyJ¹³, which converts Java source code to XJava XML. Eventually, the lightweight command line tool srcML was selected, which creates an XML representation of Java, C/C++ and C# source code by combining source code (text) and AST information (markup tags) [172]. Using the tool, it is possible to perform a one-to-one mapping of .java files to .java.xml files. Each artefact is mapped differently to physical files in a repository or file system. While a UML class diagram may be represented by a single .dia file, Java source code and JUnit artefacts are composites of multiple .java files and are therefore extracted to multiple .java.xml files.
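The one-to-one file mapping described above can be sketched as follows. This is a minimal illustration, not the framework's actual code: the class and method names are hypothetical, and the command line assumes a srcML installation that exposes the srcml executable with an -o output option. The command is only built, not executed.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

/** Sketch of the .java -> .java.xml mapping; all names here are hypothetical. */
final class SrcmlMapping {

    /** Maps e.g. src/Account.java to outputDir/Account.java.xml. */
    static Path toXmlPath(Path javaFile, Path outputDir) {
        String name = javaFile.getFileName().toString();
        if (!name.endsWith(".java")) {
            throw new IllegalArgumentException("Not a Java source file: " + javaFile);
        }
        return outputDir.resolve(name + ".xml");
    }

    /** Builds (but does not run) a srcML invocation; assumes srcml is on the PATH. */
    static List<String> srcmlCommand(Path javaFile, Path xmlFile) {
        return List.of("srcml", javaFile.toString(), "-o", xmlFile.toString());
    }

    public static void main(String[] args) {
        Path source = Paths.get("src", "Account.java");
        Path target = toXmlPath(source, Paths.get("ACMF", "SourceCode"));
        // Print the command that would be handed to a process launcher.
        System.out.println(String.join(" ", srcmlCommand(source, target)));
    }
}
```

Because every .java file yields exactly one .java.xml file, a repository with many source files simply produces the same number of extracted files.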

The extraction process and the output for each artefact type are illustrated in Figure 6.11. The extracted files are stored in the framework folder (ACMF), which is specified when the framework is first set up. The ACMF folder contains the following sub-folders: ArchitectureConceptual, ArchitectureModuleView, Requirement, SourceCode, UMLClass, UMLSequence, UMLUseCase, and UnitTests. Each artefact is extracted to its corresponding folder based on its type.
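A possible way to set up this folder layout is sketched below. The sub-folder names are those listed above; the class name, the method names and the use of the system temporary directory as the parent of the ACMF folder are assumptions made purely for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

/** Sketch of the ACMF folder layout; class and method names are hypothetical. */
final class AcmfLayout {

    /** Sub-folder names as listed in the text, one per artefact type. */
    static final List<String> SUB_FOLDERS = List.of(
            "ArchitectureConceptual", "ArchitectureModuleView", "Requirement", "SourceCode",
            "UMLClass", "UMLSequence", "UMLUseCase", "UnitTests");

    /** Creates the root folder and all sub-folders if they do not exist yet. */
    static Path create(Path root) {
        try {
            for (String name : SUB_FOLDERS) {
                Files.createDirectories(root.resolve(name));
            }
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the folder an artefact of the given type is extracted to. */
    static Path folderFor(Path root, String artefactType) {
        if (!SUB_FOLDERS.contains(artefactType)) {
            throw new IllegalArgumentException("Unknown artefact type: " + artefactType);
        }
        return root.resolve(artefactType);
    }

    public static void main(String[] args) {
        // Assumption: the ACMF folder is placed in the system temporary directory.
        Path root = create(Paths.get(System.getProperty("java.io.tmpdir"), "ACMF"));
        System.out.println(folderFor(root, "UMLClass"));
    }
}
```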

⁶ https://www.openoffice.org/
⁷ http://incubator.apache.org/odftoolkit/simple/
⁸ https://wiki.gnome.org/Apps/Dia
⁹ https://netbeans.org/
¹⁰ https://www.jetbrains.com/idea/
¹¹ https://eclipse.org/
¹² http://paginas.fe.up.pt/ aaguiar/javaml/
¹³ https://sourceforge.net/projects/beautyj.berlios/

Figure 6.11: Artefact data extraction.

6.4.2 Transformation

Following the extraction, transformation aims to map heterogeneous XML documents to a uniform representation. Specifying a uniform representation allows artefact and link data to ultimately be saved in the data store, where they are represented as a property graph. The transformation functionality raised a number of implementation-level considerations. Firstly, the format and the schema of the uniform representation had to be selected. Secondly, the strategy for establishing trace links using this representation had to be considered.

In terms of formats, the first alternative considered was a custom XML schema to represent artefacts of various types. According to this schema, both artefact elements and relations are uniquely identified and properties of both entities can be expressed through custom elements. The main advantages of this approach are the flexibility offered by XML and the freedom to specify the custom schema. However, adopting a custom XML-based solution involves handling issues that are already addressed by formats readily available to represent property graph concepts. These include the identification of elements, granularity of information, the ability to store generic data effectively, directionality of links, and most importantly, issues related to linking elements. One consideration is whether link information should be stored in the graph XML file or as a separate file.

6.4.2.1 Transformation: GraphML

A comprehensive survey of graph exchange formats reveals that a number of file formats are available to model, store and exchange graph data [173]. A few examples include the Graph Modelling Language (GML)¹⁴, the Graph eXchange Language (GXL)¹⁵, and the LEMON Graph Format (LGF)¹⁶. To unify heterogeneous artefact and link data, GraphML [174] was chosen, which is used to describe graph structures and to represent application-specific data.

¹⁴ https://www.fim.uni-passau.de/fileadmin/files/lehrstuhl/brandenburg/projekte/gml/gml-technical-report.pdf
¹⁵ http://www.gupro.de/GXL/Introduction/intro.html

<graphml>
  <key attr.name="name" attr.type="string" for="node" id="d0"/>
  <key attr.name="visibility" attr.type="string" for="node" id="d1"/>
  <key attr.name="variableType" attr.type="string" for="node" id="d2"/>
  <key attr.name="parameters" attr.type="string" for="node" id="d4"/>
  <key attr.name="returnType" attr.type="string" for="node" id="d5"/>
  <key attr.name="type" attr.type="string" for="node" id="d6"/>
  <key attr.name="relType" attr.type="string" for="edge" id="d7"/>
  <key attr.name="uniqueId" attr.type="string" for="node" id="d8"/>
  <graph edgedefault="undirected" id="DI">
    <node id="1">
      <data key="d0">Account</data>
      <data key="d1">Public</data>
      <data key="d2"/>
      <data key="d6">class</data>
      <data key="d8">Unique id value</data>
    </node>
    <node id="2">
      <data key="d0">getAccountNo</data>
      <data key="d1">Public</data>
      <data key="d5">String</data>
      <data key="d4"/>
      <data key="d6">UMLOperation</data>
      <data key="d8">di1/Users/ildikopete/Dropbox/PhD/SharedBackup/Evaluation/MazeSolver/Evaluation Files/UML/Revision19/XML/OldVersion/revision19Old.vdx</data>
    </node>
    <node id="3">
      <data key="d0">balance</data>
      <data key="d1">Private</data>
      <data key="d2">int</data>
      <data key="d6">UMLAttribute</data>
      <data key="d8">di22/Users/ildikopete/Dropbox/PhD/SharedBackup/Evaluation/MazeSolver/Evaluation Files/UML/Revision19/XML/OldVersion/revision19Old.vdx</data>
    </node>
    <edge id="diE1" source="1" target="2">
      <data key="d7">Parent_Child</data>
    </edge>
    <edge id="diE2" source="1" target="3">
      <data key="d7">Parent_Child</data>
    </edge>
  </graph>
</graphml>

Listing 6.2: Example GraphML file modelling a UML class diagram and its property graph representation.

GraphML is best introduced through a concrete example. Listing 6.2 shows a GraphML file representing the UML class diagram depicted in Figure 6.12. The properties of graph nodes and edges are derived based on the process introduced in Section 6.3.1. Artefact element properties are defined by key elements in the GraphML file. Keys have identifiers, names, types and a domain attribute specifying the element the given property is assigned to, as properties can be associated with edges, nodes or both. Graph nodes are denoted by node elements, while edge elements stand for graph edges. The values of artefact element properties are defined by data elements nested in node elements, whereas edge properties are specified in data elements inside edge elements. Other artefact types with their corresponding elements, properties and connections are described in a similar manner in GraphML using the appropriate property keys and their values.
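To make the relationship between key and data elements concrete, the sketch below reads a small GraphML document with the standard Java DOM API and resolves each data element's key identifier to the property name declared by the matching key element. The class name, the method name and the embedded sample document are assumptions for illustration only; the sample is a shortened variant of the kind of file shown in Listing 6.2.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Sketch: reads GraphML nodes into property maps; names are hypothetical. */
final class GraphMlReader {

    /** Shortened sample document used for illustration. */
    static final String SAMPLE =
            "<graphml>"
          + "<key attr.name='name' attr.type='string' for='node' id='d0'/>"
          + "<key attr.name='type' attr.type='string' for='node' id='d6'/>"
          + "<graph edgedefault='undirected' id='DI'>"
          + "<node id='1'><data key='d0'>Account</data><data key='d6'>class</data></node>"
          + "<node id='2'><data key='d0'>getAccountNo</data><data key='d6'>UMLOperation</data></node>"
          + "</graph></graphml>";

    /** Returns node id -> (property name -> value). */
    static Map<String, Map<String, String>> readNodes(String graphMl) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(graphMl)));

            // Key id (e.g. d0) -> declared property name (e.g. name).
            Map<String, String> keyNames = new LinkedHashMap<>();
            NodeList keys = doc.getElementsByTagName("key");
            for (int i = 0; i < keys.getLength(); i++) {
                Element key = (Element) keys.item(i);
                keyNames.put(key.getAttribute("id"), key.getAttribute("attr.name"));
            }

            Map<String, Map<String, String>> nodes = new LinkedHashMap<>();
            NodeList nodeElements = doc.getElementsByTagName("node");
            for (int i = 0; i < nodeElements.getLength(); i++) {
                Element node = (Element) nodeElements.item(i);
                Map<String, String> properties = new LinkedHashMap<>();
                NodeList data = node.getElementsByTagName("data");
                for (int j = 0; j < data.getLength(); j++) {
                    Element d = (Element) data.item(j);
                    String keyId = d.getAttribute("key");
                    // Fall back to the raw key id if no key element declares it.
                    properties.put(keyNames.getOrDefault(keyId, keyId), d.getTextContent().trim());
                }
                nodes.put(node.getAttribute("id"), properties);
            }
            return nodes;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse GraphML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readNodes(SAMPLE));
    }
}
```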

Figure 6.12: The Account UML class and its members.

Table 6.2 summarises the properties used in the framework to describe the current set of selected artefacts. Should the framework be extended with new artefact types, further properties can be added. When adding new properties, the convention is that existing GraphML key identifiers are reserved for the properties they already denote and should not be overridden by new ones. For example, regardless of the artefact type, d8 should always stand for the unique id property.
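The convention can be illustrated with a small registry that refuses to reassign a reserved key identifier. This is only a sketch: the class is hypothetical, the names for d0 to d8 (excluding d3) follow Listing 6.2, while the spellings used for d3, d11, d12 and d13 are assumed from Table 6.2, and the d14 "author" property in the usage example is invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of a registry enforcing reserved GraphML key ids; hypothetical class. */
final class PropertyKeys {

    private final Map<String, String> namesById = new LinkedHashMap<>();

    PropertyKeys() {
        // Names as declared by the key elements in Listing 6.2.
        namesById.put("d0", "name");
        namesById.put("d1", "visibility");
        namesById.put("d2", "variableType");
        namesById.put("d4", "parameters");
        namesById.put("d5", "returnType");
        namesById.put("d6", "type");
        namesById.put("d7", "relType");
        namesById.put("d8", "uniqueId");
        // Assumed spellings for the remaining rows of Table 6.2.
        namesById.put("d3", "content");
        namesById.put("d11", "title");
        namesById.put("d12", "priority");
        namesById.put("d13", "reqType");
    }

    /** Adds a new key, rejecting ids already reserved for a different property. */
    void register(String id, String name) {
        String existing = namesById.get(id);
        if (existing != null && !existing.equals(name)) {
            throw new IllegalArgumentException(
                    "Key " + id + " is reserved for property '" + existing + "'");
        }
        namesById.put(id, name);
    }

    /** Returns the property name for a key id, or null if the id is unknown. */
    String nameOf(String id) {
        return namesById.get(id);
    }

    public static void main(String[] args) {
        PropertyKeys keys = new PropertyKeys();
        keys.register("d14", "author"); // invented property for a new artefact type
        System.out.println(keys.nameOf("d14"));
        try {
            keys.register("d8", "fileName"); // d8 must keep denoting the unique id
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```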

6.4.2.2 Transformation: XSLT

The transformation functionality is implemented using XSLT transformations. Alternatives considered include the DOM¹⁷, SAX¹⁸ and JAXP Java parsers¹⁹. Each artefact type has a corresponding XSLT file, which transforms the XML-based extracted artefact data to the custom GraphML schema specified in Listing 6.2. The XSLT approach has proved to be a flexible one, as it allows the extension of the framework without major refactoring should new artefacts be added. In case a new artefact is introduced, its corresponding XSLT has to be supplied.
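The following sketch shows how such a per-artefact stylesheet could be applied from Java using the standard javax.xml.transform API. It is not one of the framework's stylesheets: the input vocabulary (a diagram element containing class elements), the stylesheet and the class name are all assumptions chosen to keep the example small, and only the d0 and d6 keys from Listing 6.2 are emitted.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch: applies an XSLT stylesheet that emits GraphML; names are hypothetical. */
final class XsltSketch {

    /** Hypothetical extracted artefact data, not a real tool export format. */
    static final String INPUT =
            "<diagram><class name='Account'/><class name='Customer'/></diagram>";

    /** Illustrative stylesheet mapping each class element to a GraphML node. */
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
          + "<xsl:template match='/diagram'>"
          + "<graphml>"
          + "<key attr.name='name' attr.type='string' for='node' id='d0'/>"
          + "<key attr.name='type' attr.type='string' for='node' id='d6'/>"
          + "<graph edgedefault='undirected' id='G'>"
          + "<xsl:for-each select='class'>"
          + "<node id='{position()}'>"
          + "<data key='d0'><xsl:value-of select='@name'/></data>"
          + "<data key='d6'>class</data>"
          + "</node>"
          + "</xsl:for-each>"
          + "</graph>"
          + "</graphml>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    /** Runs the stylesheet over the input and returns the result as a string. */
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(INPUT, STYLESHEET));
    }
}
```

Under this arrangement, supporting a new artefact type amounts to supplying another stylesheet string or file, while the Java code that applies it stays unchanged.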

¹⁷ https://docs.oracle.com/javase/tutorial/jaxp/dom/readingXML.html
¹⁸ https://docs.oracle.com/javase/tutorial/jaxp/sax/parsing.html

Property     | Description                                                        | Key | Example values
-------------+--------------------------------------------------------------------+-----+----------------------------------
Name         | The name of the artefact element as specified in the original tool | d0  | getAccountNo, Account, balance
Visibility   | Visibility modifier                                                | d1  | Public, Private, Protected
VariableType | The data type of a variable (field / attribute)                    | d2  | Int, String, Object
Parameters   | Parameter list (methods / operations)                              | d4  | (int a, String b)
ReturnType   | Return value (methods / operations)                                | d5  | Void, String, Int
Type         | The type of the entity specified in the framework                  | d6  | Method, Enum, Component
RelType      | The type of the relationship specified in the framework            | d7  | Parent_Child, Uses
UniqueId     | An identifier generated to uniquely identify artefact elements     | d8  | SC0D:/file.graphml
Content      | The body of a member element (methods / requirements)              | d3  | return null; "The system shall..."
Title        | The title of a requirement                                         | d11 | "User input"
Priority     | The priority value associated with a requirement                   | d12 | High
ReqType      | The type of a requirement                                          | d13 | Functional

Table 6.2: Property key/value pairs used in the framework.

6.4.2.3 Transformation Output

The output of the transformation process is summarised in Table 6.3, which highlights the way original artefacts are mapped to the uniform GraphML format. Since a source code repository may contain hundreds of .java files, the mapping process of Java source code and JUnit test artefacts differs from that of other types. Each .java file is transformed to its own GraphML representation, as combining all Java source code or JUnit test artefact data into a single file may not be viable due to the overhead of handling large files. Despite the separate storage, the GraphML files logically belong to the same artefact.
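One way to treat separately stored GraphML files as a single logical artefact is to iterate over all of them and collect their nodes into one in-memory view, as sketched below. The class name, the method name and the two sample documents are hypothetical, and the documents are passed as strings rather than read from the SourceCode folder to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Sketch: views several GraphML documents as one artefact; hypothetical class. */
final class LogicalArtefact {

    /** Collects the name property (key d0) of every node across all documents. */
    static List<String> nodeNames(List<String> graphMlDocuments) {
        List<String> names = new ArrayList<>();
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            for (String xml : graphMlDocuments) {
                Document doc = builder.parse(new InputSource(new StringReader(xml)));
                NodeList data = doc.getElementsByTagName("data");
                for (int i = 0; i < data.getLength(); i++) {
                    Element d = (Element) data.item(i);
                    boolean isNodeProperty = "node".equals(d.getParentNode().getNodeName());
                    if (isNodeProperty && "d0".equals(d.getAttribute("key"))) {
                        names.add(d.getTextContent().trim());
                    }
                }
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse GraphML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        // Two hypothetical per-file GraphML documents of the same source code artefact.
        String first = "<graphml><graph id='a' edgedefault='undirected'>"
                + "<node id='1'><data key='d0'>Account</data></node></graph></graphml>";
        String second = "<graphml><graph id='b' edgedefault='undirected'>"
                + "<node id='1'><data key='d0'>Customer</data></node></graph></graphml>";
        System.out.println(nodeNames(List.of(first, second)));
    }
}
```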