4. Análisis de Resultados
4.2 Resultados de la Fase 1: Diseño de la base de conocimientos
5.1
Introduction
The relational operators described in Chapter 4 constitute a relationally complete set; but for practical purposes we need more. This chapter describes additional operators, defined for Tutorial D, which have been suggested as desirable by various people over the years. The additional relational operators are all defined in terms of those defined in Chapter 4.
The relvars IS_CALLED and IS_ENROLLED_ON have been sufficient for illustrative purposes so far but now we need to extend our example database slightly, as shown in Figure 5.1.
COURSE CourseId Title C1 Database C2 HCI C3 Op systems C4 Programming EXAM_MARK
StudentId CourseId Mark
S1 C1 85 S1 C2 49 S2 C1 49 S3 C3 66 S4 C1 93
Figure 5.1: Current values of relvars COURSE and EXAM_MARK
The predicate for COURSE is “Course CourseId is entitled Title.” The predicate for EXAM_MARK is “Student
StudentId sat the exam for course CourseId and scored Mark marks for that exam.” The Tutorial D definitions for these relvars are
VAR COURSE BASE RELATION { CourseId CID, Title CHAR } KEY { CourseId };
VAR EXAM_MARK BASE RELATION { StudentId SID, CourseId CID, Mark INTEGER }
Download free eBooks at bookboon.com
122
We start with a couple of simple operators, based on ones already described in Chapter 4, that provide convenient shorthands (so called because what they express can be expressed more laboriously, and perhaps less clearly, without the use of the additional relational operators).
5.2
Semijoin and Composition
Consider the predicate, “At least one student sat the exam for Course CourseId, entitled Title”—more precisely: “There exist a student StudentId and a mark Mark such that StudentId sat the exam and scored
Mark marks for course CourseId and CourseId is entitled Title.”
The relation currently representing this predicate can be derived from the join of COURSE and EXAM_MARK by “projecting away” the attributes corresponding to those quantified parameters,
StudentId and Mark:
( COURSE JOIN EXAM_MARK ) { ALL BUT StudentId, Mark } or, equivalently,
COURSE JOIN ( EXAM_MARK { ALL BUT StudentId, Mark } )
In either case the JOIN is indicated for us by the “and” in the expanded version of the predicate and the projection is indicated by the quantification. However, looking at the short form of the predicate we may more intuitively think of its relation as consisting of just those tuples of COURSE that have at least one matching tuple in EXAM_MARK. Now we may recall from Chapter 4 that we can find all the tuples of COURSE that do not have a matching tuple in EXAM_MARK by using semidifference:
COURSE NOT MATCHING EXAM_MARK
and I indicated briefly that Tutorial D also allows you to omit the word NOT, with the obvious effect:
COURSE MATCHING EXAM_MARK
MATCHING, without the NOT, is Tutorial D’s operator name for semijoin, so called because a semijoin can be perceived, very loosely, as being “half a join”. We join two relations but in the result retain only the attributes of the first operand (by excluding the non-common attributes of the second). That result is shown in Figure 5.2. It contains every tuple of COURSE apart from the tuple for course C4, whose exam no student sat.
CourseId Title
C1 Database C2 HCI C3 Op systems Figure 5.2: COURSE MATCHING EXAM_MARK
Download free eBooks at bookboon.com
123
Now, perhaps, you can see why somebody once chose “semidifference” for the operator of that name, even though it can hardly be characterized as “half a difference”: the name “semijoin” had already entered the jargon. (No, there isn’t a semiunion!)
Definition of MATCHING
r1 MATCHING r2, where r1 and r2 are relations such that r1 JOIN r2 is defined, is equivalent to ( r1 JOIN r2 ) { r1-attrs }
where r1-attrs is a commalist containing all and only the attribute names of r1.
Points to note (compare with those given for NOT MATCHING in Chapter 4):
• Recall that JOIN is not defined for all pairs of relations. If r1 and r2 have attributes of the same name but different types, then r1 JOIN r2 is not defined. A similar proviso applies to MATCHING and several other dyadic operators.
• The body of the result is, as with restriction, a subset of that of the first operand. It follows that if r1 is empty, then so is the result.
• If r1 and r2 have no common attributes, then the result is empty in the case where r2 is empty and is otherwise equal to r1 (recall that tuples having no common attributes are considered to be matching tuples).
• As the definition shows, semijoin is not needed as a primitive operator. As explained in Chapter 4, we could have chosen difference in place of semidifference as our primitive operator to support logical negation, but we preferred semidifference for its more general availability. Had we chosen difference instead, then our descriptions of NOT MATCHING and MATCHING could have appeared, neatly, side by side in the present chapter.
There is another operator, advocated by some people as being useful enough to warrant its inclusion, that is based, like semijoin, on JOIN and projection. It is called composition.
Consider the predicate “Student StudentId scored Mark marks in the exam for a course entitled Title.” The corresponding relation must have attributes StudentId, Mark, and Title. The first two would clearly be derived from EXAM_MARK, the third from COURSE.
Notice the indefinite article in that predicate: “a course”, not “the course”. Here we can replace the word “a” by “some” without changing our meaning at all, indicating that existential quantification is lurking under the covers, so to speak, in our informal predicate. We can bring that quantification out into the open, as I did with the example I used for semijoin: “There exists a course CourseId such that CourseId
Download free eBooks at bookboon.com
Click on the ad to read more
124
The relation representing this predicate can be derived from the join of COURSE and EXAM_MARK by “projecting away” CourseId (which happens to be the only common attribute):
( COURSE JOIN EXAM_MARK ) { ALL BUT CourseId }
Composition gives us a shorthand that saves us from having to write that projection when its purpose is to exclude all and only the common attributes of the two operand relations. In Tutorial D, therefore, we can achieve the same effect more conveniently by
COURSE COMPOSE EXAM_MARK
The result is shown in Figure 5.3.
Title StudentId Mark
Database S1 85 HCI S1 49 Op systems S1 85 Database S2 49 Op Systems S3 66 Database S4 93 Figure 5.3: COURSE COMPOSE EXAM_MARK
Download free eBooks at bookboon.com
125