2. Marco Teórico
3.6 Técnicas e instrumentos de recolección de datos
In Tutorial D syntax, wherever a commalist of items is required it is permissible for that list to be empty unless it is explicitly stated to the contrary. At the time of writing there are no exceptions; therefore the following two expressions must both be legal:
r { ALL BUT }
r { }
where r denotes a relation.
The first should come as no surprise—it obviously results in r itself—but the second does surprise most students at first, for it appears to denote a relation with no attributes at all—a relation of degree zero. And indeed it does, and indeed there are such relations!—but only two. The two relations have been given the pet names TABLE_DEE and TABLE_DUM and these names, which were proposed by the present author in a journal article published in 1988, are available in Tutorial D, which allows them to be abbreviated to just DEE and DUM.
TABLE_DEE is a name for the relation RELATION { TUPLE { } }—the relation of degree zero and cardinality one. There is only one tuple of degree zero, so that has to be the only tuple of TABLE_DEE. Clearly there cannot be a relation of degree zero and cardinality greater than one, for then we would have the same tuple appearing more than once in the body of a relation and that, as we have seen, cannot be. So TABLE_DUM must be the empty relation of degree zero: RELATION { } { } (the first { } specifies the empty heading, the second the empty body).
A predicate represented by a relation of degree zero is niladic (has no parameters). In other words, it must be a proposition, p. If TABLE_DEE represents p, then p is true; otherwise TABLE_DUM represents
p and p is false. People often ask, “What purpose can relations of degree zero possibly serve? They seem to be of little or no value.” The answer is that they represent answers to queries of the form “Is it true that…?” or “Are there any…?” where the answer is just “yes” or “no”. For example, “Is it true that student S1 is enrolled on course C3?”, and “Are there any students enrolled on course C1?”
Now that you have met TABLE_DEE I can show you one more interesting property of JOIN. If r is, as usual, a relation, what is the result of r JOIN TABLE_DEE? Even if an answer springs to mind immediately I suggest you work this out for yourself from the definition of JOIN—and perhaps verify your conclusion using Rel.
If a value i exists such that whenever i is one of the operands of a dyadic operator the result of invoking that operator is the other operand, then i is said to be an identity value under that operator. Think of the number 0 under addition, for example, and the number 1 under multiplication. TABLE_DEE is the identity value under JOIN.
Download free eBooks at bookboon.com
105
Now, consider an operator that is commutative and associative, as are numerical addition and relational JOIN. As we have seen, a language can support n-adic versionsof such operators: we can take the sum of any number of numbers and we can take the join of any number of relations—even just one or none at all. The sum of just one number is that number and the join of just one relation is that relation. The sum of no numbers is zero, because zero is the identity value under addition. The join of no relations is TABLE_DEE.
So far I have shown you relational counterparts of AND and existential quantification. Eventually you will see counterparts of OR and NOT too but we haven’t finished with AND yet, for JOIN turns out not to be suitable for certain common special cases of AND. The next operator addresses one of those cases and is called restriction.
4.7
Restriction and AND
Here is a predicate that we can derive from the predicate for IS_CALLED by substitution of one of its parameters:
Student StudentId is called Boris.
The relation representing that can be obtained quite easily using JOIN and projection, noting that the given predicate is equivalent to the more elaborate
There exists a name Name such that student StudentId is called Name and Name is Boris. Again the only parameter is StudentId, because Name is quantified. The relation is denoted by the expression shown in Example 4.6.
Example 4.6: JOIN and projection
( IS_CALLED JOIN RELATION { TUPLE { Name NAME ( 'Boris' ) } } ) { StudentId }
Restriction, invoked using the key word WHERE, gives us an alternative and perhaps more intuitive formulation for the first line of Example 4.6, shown in Example 4.7.
Example 4.7: Restriction
( IS_CALLED WHERE Name = NAME ( 'Boris' ) )
Here the word WHERE is preceded by a relational expression and followed by a condition, Name = NAME ( 'Boris' ). Each tuple of the specified relation is tested to see if it satisfies the given condition. Those that do satisfy it, and only those tuples, appear in the result.
Download free eBooks at bookboon.com
106
Notice that the expression Name = NAME ( 'Boris' ) is not one that can be evaluated outside of the context in which it appears. Its evaluation depends on the existence of a tuple to provide the value of the attribute Name. We shall refer to such expressions as open expressions. Unsurprisingly, we shall refer to expressions that can be evaluated independently of their context as closed expressions. We shall soon see that restriction isn’t the only context in which open expressions can appear.
Of course the WHERE condition doesn’t have to be an open expression; but if it is closed, then its value is independent of the tuples of the operand relation and is therefore the same for each tuple. As a consequence, the result of a restriction whose condition is a closed expression is either the input relation (the WHERE condition evaluates to TRUE) or empty (it evaluates to FALSE).
Now, the reason why the example at hand can be formulated using JOIN instead of WHERE lies in the fact that the restriction condition is, very specifically, an “equals” comparison of an attribute with a literal. The literal in question can be “wrapped up”, so to speak, as a relation literal that can be used as an operand of JOIN. If the condition is less restrictive, the required relation literal might be far too large to be written down. Suppose, for example, that we wish to find all the students whose names begin with the letter “B”. Then we have to replace the relation literal in Example 4.6 by one that includes a tuple literal for every value of type NAME that begins with the letter “B”. That is out of the question. But, using the STARTS_WITH operator from Rel’s OperatorsChar.d (recall that we used this operator in the definition of type SID in Chapter 2), the task becomes very easy using WHERE, as shown in Example 4.8.
Example 4.8: A more useful restriction
IS_CALLED WHERE STARTS_WITH(THE_C(Name), 'B') Definition of restriction
Let s = rWHEREc, where c is a possibly open truth-valued expression denoting a condition on attributes of r. Then:
• The heading of s is the heading of r.
• The body of s consists of those tuples of r for which the condition c evaluates to TRUE.
So the body of s is a subset of the body of r. Note that the result of r WHERE TRUE is r and that of
r WHERE FALSE is the empty relation of the same type as r. In fact, whenever the specified condition
c is a closed expression the result of r WHERE c is empty when c evaluates to FALSE and is otherwise equal to r. So closed expressions are rarely useful and in practice c is nearly always an open expression.
Download free eBooks at bookboon.com
Click on the ad to read more
107
Now let us return to Example 4.3, finding pairs of students who have the same name. It was annoying to find those tuples that pair students with themselves. Now we know how those can be eliminated:
( ( IS_CALLED RENAME { StudentId AS Sid1 } ) JOIN
( IS_CALLED RENAME { StudentId AS Sid2 } ) WHERE NOT (Sid1 = Sid2) ) { Sid1, Sid2 } which yields the relation shown in Figure 4.8.
Sid1 Sid2
S2 S5 S5 S2
Download free eBooks at bookboon.com
108
If we are still annoyed by seeing two different students paired together twice, we might be able to address that problem too, if a comparison operator such as “less than” is available on values of type SID:
( ( IS_CALLED RENAME { StudentId AS Sid1 } ) JOIN
( IS_CALLED RENAME { StudentId AS Sid2 } ) WHERE Sid1 < Sid2 ) { Sid1, Sid2 }
Assuming that the value SID('S2') precedes SID('S5') in the ordering defined for values of type SID, this would yield the singleton relation shown in Figure 4.9.
Sid1 Sid2
S2 S5
Figure 4.9: A further improvement on Figure 4.7
Now please look again at Example 4.8. It involves computation of the first letter of every student’s name, for testing in the WHERE condition. Sometimes we wish to use such computations to obtain values that are to appear as attribute values in some relation. Suppose, for a rather unreal example, that we wish to obtain a relation showing not only the student identifier and name of each student, but also the first letters of their names. Then we will need our next operator, also related to AND, called extension (a slightly unfortunate name, perhaps—not to be confused with extensions of predicates as defined in Chapter 3!).
4.8
Extension and AND
Consider, then, the predicate
Student StudentId is called Name and Name begins with the letter Initial.
We have AND connecting the predicate for IS_CALLED with “Name begins with the letter Initial”. As with Example 4.8, it is not feasible to write down a relational literal representing “Name begins with the letter Initial”, so we cannot feasibly use JOIN.
Here is a Tutorial D formulation using extension:
EXTEND IS_CALLED : { Initial := FirstLetter ( Name ) }
Here FirstLetter ( Name ) is an open expression—the expression needs to be evaluated against a tuple that provides an attribute value for Name. The expression is evaluated for each tuple t of IS_CALLED, yielding the tuple formed by “extending” t by the attribute Initial having the value of that open expression.
Download free eBooks at bookboon.com
109