CURIOCITY Framework: General Architecture


6.1 CURIOCITY Framework: General Architecture

The CURIOCITY Framework is conceived as a three-layer architecture, as shown in Figure 6.1:

1. Semantic Repository layer, which manages the semantic data related to cultural heritage, to tourists (i.e., basic information, preferences, interests), to application domains (e.g., robotics, government services), etc.

1 https://github.com/giulianodelagala/CURIOCITY

58 Universidad Católica San Pablo

Figure 6.3: Mapping museum data to CURIOCITY - CIDOC CRM

62 Department of Computer Science

... infer triples (e.g., those that follow from inverse properties, inherited properties, or inference rules). The newly inferred concepts and properties can then be inserted into the graph (Figure 6.2, step 5). Finally, the graph is serialized and saved to a file.
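As a minimal sketch of the inference step described above, the following Python fragment materializes the triples implied by inverse properties, inserts them into the graph, and writes the result to a file. It uses plain tuples instead of an RDF library, and the property names, the example IRIs, and the line-based output format are assumptions made for illustration; the reasoner and serialization actually used by the framework are not shown in this section.

```python
from pathlib import Path
import tempfile

# Illustrative inverse-property pairs (assumed, written in Erlangen CRM style).
INVERSE_OF = {
    "ecrm:P52_has_current_owner": "ecrm:P52i_is_current_owner_of",
    "ecrm:P108i_was_produced_by": "ecrm:P108_has_produced",
}


def infer_inverse_triples(triples):
    """Return the triples implied by INVERSE_OF that are not yet asserted."""
    asserted = set(triples)
    # Make the mapping symmetric so either direction triggers the inference.
    inverse = dict(INVERSE_OF)
    inverse.update({v: k for k, v in INVERSE_OF.items()})
    inferred = set()
    for s, p, o in asserted:
        if p in inverse:
            candidate = (o, inverse[p], s)
            if candidate not in asserted:
                inferred.add(candidate)
    return inferred


def serialize(triples, path):
    """Write one 'subject predicate object .' line per triple, sorted."""
    lines = [f"{s} {p} {o} ." for s, p, o in sorted(triples)]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


# Hypothetical asserted triples for one artifact.
graph = {
    (":m_01_s01/Object", "ecrm:P52_has_current_owner", ":Museo"),
    (":m_01_s01/Object", "ecrm:P108i_was_produced_by", ":m_01_s01/Production"),
}
new_triples = infer_inverse_triples(graph)
graph |= new_triples  # insert the inferred triples into the graph

out_file = Path(tempfile.gettempdir()) / "curiocity_graph_sketch.txt"
serialize(graph, out_file)
```

Running the inference a second time on the enlarged graph yields no further triples, which is a simple way to check that the materialization has reached a fixed point for this rule.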

During the design of the population process, several problems were detected and addressed:

• Incomplete data: some fields of the museum data are marked as Unknown or Missing Data. For these objects, a "Desconocido" (Unknown) instance is assigned to the corresponding concepts, so that they can easily be identified later and the missing information completed.

• Ambiguous fields: some fields can refer to different classes (e.g., the author of a work can be a crm:Person or a crm:Group). They require user intervention to specify the corresponding class through a dialog box.

• Homonymy problems: an a posteriori review by specialists is necessary to correct these errors in the semantic repository.

• Uncertain and unclear data: the collected data can be inaccurate, contain typos, or even be corrupted, which mostly leads to duplicated concepts and properties. A "Verified artifact" flag was added to keep track of which data have been checked for accuracy by a specialist.
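The handling of incomplete data and of the "Verified artifact" flag can be sketched as follows. This is only an illustration under stated assumptions: the field names, the set of raw markers treated as missing, and the dictionary representation of a record are hypothetical and do not come from the framework's scripts.

```python
UNKNOWN_LABEL = "Desconocido"  # placeholder instance named in the text
MISSING_MARKERS = {"", "unknown", "missing data"}  # assumed raw markers


def normalize_field(value):
    """Map empty or unknown raw values to the shared placeholder label."""
    if value is None or value.strip().lower() in MISSING_MARKERS:
        return UNKNOWN_LABEL
    return value.strip()


def prepare_record(raw):
    """Normalize every field and start the record as unverified."""
    record = {field: normalize_field(value) for field, value in raw.items()}
    record["verified"] = False  # set to True only after specialist review
    return record


def fields_to_complete(record):
    """List the fields that still carry the placeholder."""
    return [field for field, value in record.items() if value == UNKNOWN_LABEL]


# Hypothetical museum row with one unknown and one missing field.
raw_row = {"title": " Cesto ", "author": "Unknown", "period": None}
record = prepare_record(raw_row)
pending = fields_to_complete(record)
```

Using a single placeholder value makes the incomplete records easy to find later, which is the purpose stated for the "Desconocido" instance.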

IRI’s Naming Convention

An appropriate naming convention for IRIs gives better control over, and understanding of, the generated graph; i.e., it improves the readability of the RDF serialization, especially in Turtle format. The convention also simplifies the formulation and processing of queries, since known IRIs can be inserted directly through clauses such as BIND, or by replacing the subject or object in the triple patterns, as shown in the next subsection.

Table 6.1 shows the naming convention, where id is a unique identifier for each cultural heritage object to be instantiated, consisting of letters, numbers, and symbols that are valid in a URI (e.g., m_01_s01), and filename is the name of the file resulting from the digitization process.

Table 6.1: IRI’s naming convention

IRI                     rdf:type                        Description
:id                     ecrm:E42_Identifier             Object unique Id (assigned by instantiation)
:id/IDalternative       ecrm:E42_Identifier             Object alternative Id (assigned by museum)
:id/Object              ecrm:E22_Man-Made_Object        Human-made artifact
:id/Production          ecrm:E12_Production             Artifact’s production event
:id/Utility             ecrm:E55_Type                   Artifact’s utility
:id/Current_Condition   ecrm:E3_Condition_State         Artifact’s current condition or state
:id/Measurements        ecrm:E54_Dimension              Artifact’s physical measurements
:id/Acquisition         ecrm:E10_Transfer_of_Custody    Artifact’s acquisition event
:Processfilename        :D2_Digitization_Process        Digital media creation process event
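To illustrate how the convention in Table 6.1 can be applied mechanically, the sketch below derives the set of IRIs for one object from its id. The base namespace, the function name, the underscore in Current_Condition, and the exact set of characters accepted in an id are assumptions; only the suffixes themselves are taken from the table.

```python
import re

BASE = "http://example.org/curiocity/"  # hypothetical namespace

# Suffixes from Table 6.1 (the underscore in Current_Condition is assumed).
SUFFIXES = (
    "IDalternative",
    "Object",
    "Production",
    "Utility",
    "Current_Condition",
    "Measurements",
    "Acquisition",
)

# Assumed rule: letters, digits, and a few URI-safe symbols.
_VALID_ID = re.compile(r"[A-Za-z0-9_.\-]+")


def build_iris(object_id):
    """Return every IRI that the convention derives from one object id."""
    if not _VALID_ID.fullmatch(object_id):
        raise ValueError(f"invalid identifier: {object_id!r}")
    root = BASE + object_id
    iris = {"Identifier": root}  # the bare :id names the identifier itself
    for suffix in SUFFIXES:
        iris[suffix] = f"{root}/{suffix}"
    return iris


iris = build_iris("m_01_s01")
```

Because every IRI is a pure function of the id, a query can reach, for example, the production event of an artifact without first searching for it.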


SPARQL queries

The Data Processing layer receives queries from the Application layer, transforms them into SPARQL queries, and has them processed at the Semantic Repository layer. Query results are returned to the Application layer in JSON format. Because a reasoner is integrated at the Semantic Repository layer, queries can also retrieve inferred information.
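The round trip described above can be sketched in two small steps: building a SPARQL search query that inserts a known IRI through BIND, and flattening a result document in the standard SPARQL 1.1 JSON results format. The namespace, the function names, and the sample response are hypothetical; no endpoint is contacted, and the framework's actual scripts may be organized differently.

```python
import json

BASE = "http://example.org/curiocity/"  # hypothetical namespace


def build_search_query(object_id):
    """Build a SELECT query that binds the artifact IRI derived from the id."""
    artifact_iri = f"{BASE}{object_id}/Object"
    return (
        "SELECT ?property ?value WHERE {\n"
        f"  BIND(<{artifact_iri}> AS ?artifact)\n"
        "  ?artifact ?property ?value .\n"
        "}"
    )


def rows_from_sparql_json(text):
    """Flatten a SPARQL 1.1 JSON results document into a list of dicts."""
    doc = json.loads(text)
    variables = doc["head"]["vars"]
    return [
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in doc["results"]["bindings"]
    ]


query = build_search_query("m_01_s01")

# Made-up response, shaped like the standard JSON results format.
sample_response = json.dumps({
    "head": {"vars": ["property", "value"]},
    "results": {"bindings": [
        {
            "property": {"type": "uri",
                         "value": "http://www.w3.org/2000/01/rdf-schema#label"},
            "value": {"type": "literal", "value": "Example artifact"},
        },
    ]},
})
rows = rows_from_sparql_json(sample_response)
```

In a real deployment the id should be validated before it is interpolated into the query string, since the query is assembled by plain string formatting here.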

The Data Processing layer basically receives four types of queries:

• Search queries, which extract data from the graph; they are expected to be the most frequent of all query types.

• Insertion queries, which are expected to have medium frequency for application data (e.g., users, rankings) and low frequency for data related to cultural heritage elements, such as artifacts and authors; insertion queries must follow the IRI naming convention.

• Update queries, which modify existing data in the graph; they are expected to have low frequency.

• Delete queries, which remove existing data from the graph; they are expected to have very low frequency and would only be used occasionally by administrators.

Update and delete queries can introduce inconsistencies into the graph; therefore, their implementation must be analyzed with special care.
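One way to treat update and delete queries with the care mentioned above is to classify each incoming SPARQL string and check it against the caller's role before it reaches the repository. The sketch below is deliberately naive: it only inspects the leading keyword after the prologue, the role names are hypothetical, and restricting updates to administrators is an assumption (the text states this only for delete queries).

```python
import re

# Leading PREFIX/BASE declarations, which precede the query form keyword.
_PROLOGUE = re.compile(
    r"^\s*(?:(?:PREFIX\s+\S*\s*<[^>]*>|BASE\s*<[^>]*>)\s*)*", re.I
)

# Assumed permissions: visitors may search and insert (e.g., rankings).
ALLOWED = {
    "visitor": {"search", "insert"},
    "admin": {"search", "insert", "update", "delete"},
}


def classify(query):
    """Return 'search', 'insert', 'update', or 'delete' for a SPARQL string."""
    head = _PROLOGUE.sub("", query, count=1).lstrip().upper()
    if head.startswith(("SELECT", "ASK", "CONSTRUCT", "DESCRIBE")):
        return "search"
    if head.startswith("INSERT"):
        return "insert"
    if head.startswith("DELETE"):
        # DELETE ... INSERT ... WHERE rewrites existing data: an update.
        return "update" if re.search(r"\bINSERT\b", head) else "delete"
    raise ValueError("unsupported query form")


def is_allowed(role, query):
    """Check whether the given role may run this kind of query."""
    return classify(query) in ALLOWED.get(role, set())


q_search = "PREFIX : <http://example.org/curiocity/>\nSELECT ?s WHERE { ?s ?p ?o }"
q_update = "DELETE { ?s ?p ?o } INSERT { ?s ?p 1 } WHERE { ?s ?p ?o }"
q_delete = "DELETE WHERE { ?s ?p ?o }"
```

A keyword check of this kind does not detect inconsistencies by itself; it only ensures that the riskier query types reach the graph through a narrower, controlled path.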

6.1.3 Application Layer

The CURIOCITY Framework offers different kinds of services and applications, roughly classified as visitor and admin applications. Visitor applications allow users to browse the semantic repository, perform queries, and obtain details of the instances it contains (e.g., catalogs, virtual museums). Admin applications are oriented to the management of the Semantic Repository; they use the scripts provided by the Data Processing layer to carry out tasks such as configuring the mapping and updating the semantic repository.
