Groups I
Group theory was invented by E. Galois (1811–1832) in order to solve one of the premier mathematical problems of his day: when can the roots of a polynomial be found by some generalization of the quadratic formula? Since Galois (who was killed in a duel when he was only 20 years old), group theory has found many other applications. For example, we shall give a new proof of Fermat’s theorem (if p is prime, then a^p ≡ a mod p), and this proof will then be adapted to prove a theorem of Euler: if m ≥ 2 and a is relatively prime to m, then a^(φ(m)) ≡ 1 mod m, where φ(m) is the Euler φ-function. We will also use groups to solve counting problems such as: How many different bracelets having 10 beads can be assembled from a pile containing 10 red beads, 10 white beads, and 10 blue beads? In Chapter 6, we will illustrate the fact that groups are a precise way to describe symmetry by classifying all possible friezes.
2.1 Some Set Theory
We are going to study algebraic systems called groups, which involve objects that can be “multiplied,” and rings, which involve objects that can be multiplied and added. There are interesting examples of these systems whose elements are functions, but, more importantly, certain functions (called homomorphisms) arise in comparing such systems. This section contains definitions and basic properties of functions, but the reader may skim this section now and return to it later when necessary.
A set X is a collection of elements (numbers, points, herring, etc.); one writes
x ∈ X
to denote x belonging to X. The terms set, element, and belongs to are undefined terms (there have to be such in any language), and they are used so that a set is determined by the elements in it.¹ Thus, we define two sets X and Y to be equal, denoted by
X = Y,
if they consist of exactly the same elements: for every element x, we have x ∈ X if and only if x ∈ Y.
A subset of a set X is a set S each of whose elements also belongs to X : if s∈ S, then s ∈ X. One denotes S being a subset of X by
S⊆ X;
synonyms for this are S is contained in X and S is included in X. Note that X ⊆ X is always true; we say that a subset S of X is a proper subset of X, denoted by S ⊊ X, if S ⊆ X and S ≠ X. It follows from the definitions that two sets X and Y are equal if and only if each is a subset of the other:
X = Y if and only if X ⊆ Y and Y ⊆ X.
Because of this remark, many proofs showing that two sets are equal break into two parts, each half showing that one of the sets is a subset of the other. For example, let
X = {a ∈ ℝ : a ≥ 0} and Y = {r^2 : r ∈ ℝ}.
If a ∈ X, then a ≥ 0 and a = r^2, where r = √a; hence, a ∈ Y and X ⊆ Y. For the reverse inclusion, choose r^2 ∈ Y. If r ≥ 0, then r^2 ≥ 0; if r < 0, then r = −s, where s > 0, and r^2 = (−1)^2 s^2 = s^2 ≥ 0. In either case, r^2 ≥ 0 and r^2 ∈ X. Therefore, Y ⊆ X, and so X = Y.
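The two-inclusion strategy can be tried out on finite sets. The Python sketch below is only an illustration: the helper names is_subset and sets_equal are hypothetical, and the integers from −5 to 5 stand in for the real numbers, which is an assumption made so that the sets are finite.

```python
def is_subset(S, X):
    """Return True when every element of S also belongs to X."""
    return all(s in X for s in S)


def sets_equal(X, Y):
    """X = Y exactly when X is a subset of Y and Y is a subset of X."""
    return is_subset(X, Y) and is_subset(Y, X)


# Finite stand-in for the reals (an assumption): the integers -5, ..., 5.
universe = range(-5, 6)
X = {a * a for a in universe if a >= 0}  # squares of the nonnegative integers
Y = {r * r for r in universe}            # squares of all the integers

print(sets_equal(X, Y))
```

Each half of sets_equal corresponds to one half of a two-part proof of equality.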
Definition. The empty set, denoted by ∅, is the set having no elements.
We claim, for every set X, that ∅ ⊆ X. The negation of “If s ∈ ∅, then s ∈ X” is “There exists s ∈ ∅ with s ∉ X;” as there is no s ∈ ∅, however, this cannot be true. It follows that there is a unique empty set, for if ∅_1 were a second such, then ∅ ⊆ ∅_1 and, similarly, ∅_1 ⊆ ∅. Therefore, ∅ = ∅_1.
Here are some ways to create new sets from old.
Definition. If X and Y are subsets of a set Z , then their intersection is the set X∩ Y = {z ∈ Z : z ∈ X and z ∈ Y }.
¹ There are some rules governing the usage of ∈; for example, x ∈ a ∈ x is always a false statement.
Figure 2.1 X ∩ Y and X ∪ Y
More generally, if {A_i : i ∈ I} is any, possibly infinite, family of subsets of a set Z, then their intersection is
⋂_{i∈I} A_i = {z ∈ Z : z ∈ A_i for all i ∈ I}.
It is clear that X ∩ Y ⊆ X and X ∩ Y ⊆ Y. In fact, the intersection is the largest such subset: if S ⊆ X and S ⊆ Y, then S ⊆ X ∩ Y. Similarly, ⋂_{i∈I} A_i ⊆ A_j for all j ∈ I.
Definition. If X and Y are subsets of a set Z , then their union is the set X∪ Y = {z ∈ Z : z ∈ X or z ∈ Y }.
More generally, if {A_i : i ∈ I} is any, possibly infinite, family of subsets of a set Z, then their union is
⋃_{i∈I} A_i = {z ∈ Z : z ∈ A_i for some i ∈ I}.
It is clear that X ⊆ X ∪ Y and Y ⊆ X ∪ Y. In fact, the union is the smallest such subset: if X ⊆ S and Y ⊆ S, then X ∪ Y ⊆ S. Similarly, A_j ⊆ ⋃_{i∈I} A_i for all j ∈ I.
Definition. If X and Y are sets, then their difference is the set X − Y = {x ∈ X : x ∉ Y}.
The difference Y − X has a similar definition and, of course, Y − X and X − Y need not be equal.
In particular, if X is a subset of a set Z, then its complement in Z is the set X′ = Z − X = {z ∈ Z : z ∉ X}.
Figure 2.2 X − Y
Figure 2.3 Y − X
It is clear that X′ is disjoint from X; that is, there is no element z ∈ Z lying in both X and X′, so that X ∩ X′ = ∅. (Thus, the empty set is needed to guarantee that the intersection of two subsets A and B always be a subset; that is, A ∩ B should always be defined.) In fact, X′ is the largest subset of Z disjoint from X: if S ⊆ Z and S ∩ X = ∅, then S ⊆ X′.
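For finite sets, the constructions just defined can be computed directly with Python's built-in set type. The particular sets Z, X, Y and the indexed family below are illustrative assumptions, not examples taken from the text.

```python
# Ambient set Z and two of its subsets (illustrative choices).
Z = set(range(10))
X = {0, 1, 2, 3, 4}
Y = {3, 4, 5, 6}

intersection = X & Y    # {z in Z : z in X and z in Y}
union = X | Y           # {z in Z : z in X or z in Y}
difference_xy = X - Y   # {x in X : x not in Y}
difference_yx = Y - X   # in general different from X - Y
complement_x = Z - X    # the complement of X in Z

# An indexed family {A_i : i in I} with index set I = {1, 2, 3}.
family = {1: {1, 2, 3}, 2: {2, 3, 4}, 3: {3, 4, 5}}
big_intersection = set.intersection(*family.values())
big_union = set.union(*family.values())

print(intersection, union, difference_xy, difference_yx, complement_x)
print(big_intersection, big_union)
```

Here X & complement_x is the empty set, matching the statement that a set is disjoint from its complement.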
Functions
The idea of a function occurs in calculus (and earlier); examples are x^2, sin x, √x, 1/x, x + 1, e^x, etc. Calculus books define a function f(x) as a “rule” that assigns, to each number a, exactly one number, namely, f(a). Thus, the squaring function assigns the number 81 to the number 9; the square root function assigns the number 3 to the number 9. Notice that there are two candidates for √9, namely, 3 and −3. In order that there be exactly one number assigned to 9, one must select one of the two possible values ±3; everyone has agreed that √x ≥ 0 whenever x ≥ 0, and so this agreement implies that √x is a function.
The calculus definition of function is certainly in the right spirit, but it has a defect: what is a rule? To ask this question another way, when are two rules the same? For example, consider the functions
f(x) = (x + 1)^2 and g(x) = x^2 + 2x + 1.
Is f(x) = g(x)? The evaluation procedures are certainly different: for example, f(6) = (6 + 1)^2 = 7^2, while g(6) = 6^2 + 2 · 6 + 1 = 36 + 12 + 1. Since the term rule has not been defined, it is ambiguous, and our question cannot be answered.
Surely the calculus description is inadequate if one cannot decide whether these two functions are equal.
To find a reasonable definition, let us return to examples of what we seek to define. Each of the functions x^2, sin x, etc., has a graph, namely, the subset of the plane consisting of all those points of the form (a, f(a)). For example, the graph of f(x) = x^2 is the parabola consisting of all the points of the form (a, a^2).
A graph is a concrete thing, and the upcoming formal definition of a function amounts to saying that a function is its graph. The informal calculus definition of a function as a rule remains, but we will have avoided the problem of saying what a rule is. In order to give the definition, we first need an analog of the plane (for we will want to use functions f (x ) whose argument x does not vary over numbers).
Definition. If X and Y are (not necessarily distinct) sets, then their cartesian² product X × Y is the set of all ordered pairs (x, y), where x ∈ X and y ∈ Y.
The plane is ℝ × ℝ.
The only thing one needs to know about ordered pairs is that (x, y) = (x′, y′) if and only if x = x′ and y = y′ (see Exercise 2.4 on page 101).
Observe that if X and Y are finite sets, say, |X| = m and |Y| = n (we denote the number of elements in a finite set X by |X|), then |X × Y| = mn.
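The counting remark can be checked on a small case. In the sketch below, Python tuples play the role of ordered pairs (two tuples are equal exactly when their coordinates agree), and the sets X and Y are illustrative assumptions.

```python
from itertools import product

X = {1, 2, 3}
Y = {"a", "b"}

# All ordered pairs (x, y) with x in X and y in Y.
pairs = set(product(X, Y))

print(sorted(pairs))
print(len(pairs) == len(X) * len(Y))
```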
Definition. Let X and Y be (not necessarily distinct) sets. A function f from X to Y , denoted by
f: X → Y,
is a subset f ⊆ X × Y such that, for each a ∈ X, there is a unique b ∈ Y with (a, b) ∈ f.
For each a ∈ X, the unique element b ∈ Y for which (a, b) ∈ f is called the value of f at a, and b is denoted by f(a). Thus, f consists of all those points in X × Y of the form (a, f(a)). When f: ℝ → ℝ, then f is the graph of f(x).
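For finite sets, the definition can be tested mechanically. The following sketch represents a candidate function as a Python set of pairs; the name is_function and the sample sets are hypothetical and serve only to illustrate the two requirements (f ⊆ X × Y, and exactly one pair for each a ∈ X).

```python
def is_function(f, X, Y):
    """Check that the set of pairs f is a function from X to Y."""
    # f must be a subset of the cartesian product X x Y.
    if not all(a in X and b in Y for (a, b) in f):
        return False
    # Each a in X must occur as a first coordinate exactly once.
    return all(sum(1 for (a, _) in f if a == x) == 1 for x in X)


X = {-1, 0, 1}
Y = {0, 1}
squaring = {(-1, 1), (0, 0), (1, 1)}          # a function X -> Y
partial = {(0, 0), (1, 1)}                    # no value at -1
two_valued = {(-1, 1), (-1, 0), (0, 0), (1, 1)}  # two values at -1

print(is_function(squaring, X, Y))
print(is_function(partial, X, Y))
print(is_function(two_valued, X, Y))
```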
Example 2.1.
(i) The identity function on a set X, denoted by 1_X: X → X, is defined by 1_X(x) = x for every x ∈ X [when X = ℝ, the graph of the identity function is the 45° line through the origin consisting of all those points in the plane of the form (a, a)].
(ii) Constant functions: If y_0 ∈ Y, then f(x) = y_0 for all x ∈ X (when X = ℝ = Y, then the graph of a constant function is a horizontal line).
² This term honors R. Descartes, one of the founders of analytic geometry.
From now on, we depart from the calculus notation; we denote a function by f and not by f(x); the latter notation is reserved for the value of f at an element x (there are a few exceptions; we will continue to write the familiar functions, e.g., polynomials, sin x, e^x, √x, log x, as usual). Here are some more words.
If f: X → Y, call X the domain of f, call Y the target (or codomain) of f, and define the image (or range) of f, denoted by im f, to be the subset of Y consisting of all the values of f. When we say that X is the domain of a function f: X → Y, we mean that f(x) is defined for every x ∈ X. For example, the domain of sin x is ℝ, its target is usually ℝ, and its image is [−1, 1]. The domain of 1/x is the set of all nonzero reals and its image is also the nonzero reals; the domain of the square root function is the set ℝ^≥ = {x ∈ ℝ : x ≥ 0} of all nonnegative reals and its image is also ℝ^≥.
Definition. Functions f: X → Y and g: X′ → Y′ are equal if X = X′, Y = Y′, and the subsets f ⊆ X × Y and g ⊆ X′ × Y′ are equal.
A function f: X → Y has three ingredients: its domain X, its target Y , and its graph, and we are saying that two functions are equal if and only if they have the same domains, the same targets, and the same graphs. It is plain that the domain and the graph are essential parts of a function, and some reasons for caring about the target are given in a remark at the end of this section.
Definition. If f: X → Y is a function, and if S is a subset of X, then the restriction of f to S is the function f|S : S → Y defined by ( f |S)(s) = f (s) for all s∈ S.
If S is a subset of a set X , define the inclusion i: S → X to be the function defined by i (s)= s for all s ∈ S.
If S is a proper subset of X, then the inclusion i is not the identity function 1_S because its target is X, not S; it is not the identity function 1_X because its domain is S, not X. If S is a proper subset of X, then f|S ≠ f because they have different domains.
Proposition 2.2. Let f: X → Y and g: X′ → Y′ be functions. Then f = g if and only if X = X′, Y = Y′, and f(a) = g(a) for every a ∈ X.
Remark. This proposition resolves the problem raised by the ambiguous term rule. If f, g: ℝ → ℝ are given by f(x) = (x + 1)^2 and g(x) = x^2 + 2x + 1, then f = g because f(a) = g(a) for every number a.
Proof. Assume that f = g. Functions are subsets of X × Y , and so f = g means that each of f and g is a subset of the other (informally, we are saying
that f and g have the same graph). If a ∈ X and (a, f (a)) ∈ f = g, then (a, f (a)) ∈ g. But there is only one ordered pair in g with first coordinate a, namely, (a, g(a)) [because the definition of function says that g gives a unique value to a]. Therefore, (a, f (a))= (a, g(a)), and equality of ordered pairs gives
f (a)= g(a), as desired.
Conversely, assume that f (a)= g(a) for every a ∈ X. To see that f = g, it suffices to show that f ⊆ g and g ⊆ f . Each element of f has the form (a, f (a)). Since f (a) = g(a), we have (a, f (a)) = (a, g(a)), and hence (a, f (a)) ∈ g. Therefore, f ⊆ g. The reverse inclusion g ⊆ f is proved similarly. •
Let us make the contrapositive explicit: if f, g: X → Y are functions that disagree at even one point, i.e., if there is some a ∈ X with f(a) ≠ g(a), then f ≠ g.
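Proposition 2.2 and its contrapositive can be illustrated by comparing graphs over a finite domain. In the sketch below, the helper graph and the choice of the integers from −10 to 10 as domain are assumptions; agreement on a finite sample is, of course, only evidence for, and not a proof of, equality on all of ℝ.

```python
def graph(rule, X):
    """Return the graph {(a, rule(a)) : a in X} as a set of ordered pairs."""
    return {(a, rule(a)) for a in X}


X = range(-10, 11)  # finite sample domain (an assumption)

f = graph(lambda x: (x + 1) ** 2, X)
g = graph(lambda x: x * x + 2 * x + 1, X)
h = graph(lambda x: x * x + 2 * x + 2, X)

print(f == g)  # two different evaluation procedures, one graph
print(f == h)  # h disagrees with f, so the graphs differ
```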
We continue to regard a function f as a rule sending x ∈ X to f (x) ∈ Y , but the precise definition is now available whenever we need it, as in Proposition 2.2.
However, to reinforce our wanting to regard functions f: X → Y as dynamic things sending points in X to points in Y, we often write
f: x ↦ y
instead of f(x) = y. For example, we may write f: x ↦ x^2 instead of f(x) = x^2, and we may describe the identity function by x ↦ x for all x.
Example 2.3.
Our definitions allow us to treat a degenerate case. If X is a set, what are the functions X → ∅? Note first that an element of X × ∅ is an ordered pair (x, y) with x ∈ X and y ∈ ∅; since there is no y ∈ ∅, there are no such ordered pairs, and so X × ∅ = ∅. Now a function X → ∅ is a subset of X × ∅ of a certain type; but X × ∅ = ∅, so there is only one subset, namely ∅, and hence at most one function, namely, f = ∅. The definition of function X → ∅ says that, for each x ∈ X, there exists a unique y ∈ ∅ with (x, y) ∈ f. If X ≠ ∅, then there exists x ∈ X for which no such y exists (there are no elements y at all in ∅), and so f is not a function. Thus, if X ≠ ∅, there are no functions from X to ∅. On the other hand, if X = ∅, we claim that f = ∅ is a function.
Otherwise, the negation of the statement “f is a function” would be true: “there exists x ∈ ∅, etc.” We need not go on; since ∅ has no elements in it, there is no way to complete the sentence so that it is a true statement. We conclude that f = ∅ is a function ∅ → ∅, and we declare it to be the identity function 1_∅.
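The degenerate case can also be observed computationally. The sketch below enumerates every function between two finite sets as a set of ordered pairs; the name all_functions is hypothetical, and the enumeration relies on itertools.product yielding exactly one empty tuple when asked for zero factors.

```python
from itertools import product


def all_functions(X, Y):
    """List every function X -> Y, each given as a set of pairs (x, f(x))."""
    X, Y = list(X), list(Y)
    # Choosing a function amounts to choosing one value in Y for each x in X.
    return [set(zip(X, values)) for values in product(Y, repeat=len(X))]


print(len(all_functions([1, 2], [])))     # nonempty X, empty target
print(all_functions([], []))              # empty X, empty target
print(len(all_functions([1, 2], "abc")))  # an ordinary case
```

The second call returns a list containing only the empty set of pairs, which corresponds to the function 1_∅.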
There is a name for functions whose image is equal to the whole target.
Definition. A function f: X → Y is surjective (or onto) if im f = Y.
Thus, f is surjective if, for each y ∈ Y , there is some x ∈ X (probably depending on y) with y= f (x).
Example 2.4.
(i) Of course, identity functions are surjections.
(ii) The sine function ℝ → ℝ is not surjective, for its image is [−1, 1], which is a proper subset of its target ℝ.
(iii) The functions x^2: ℝ → ℝ and e^x: ℝ → ℝ have target ℝ. Now im x^2 consists of the nonnegative reals and im e^x consists of the positive reals, so that neither x^2 nor e^x is surjective.
(iv) Let f: ℝ → ℝ be defined by
f(a) = 6a + 4.
To see whether f is a surjection, we ask whether every b ∈ ℝ has the form b = f(a) for some a; that is, given b, can one find a so that
6a + 4 = b?
One can always solve this equation for a, obtaining a = (1/6)(b − 4). Therefore, f is a surjection.
(v) Let f: ℝ − {3/2} → ℝ be defined by
f(a) = (6a + 4)/(2a − 3).
To see whether f is a surjection, we seek a solution a for a given b: can we always solve
(6a + 4)/(2a − 3) = b?
This leads to the equation a(6 − 2b) = −3b − 4, which can be solved for a if 6 − 2b ≠ 0 [note that (−3b − 4)/(6 − 2b) ≠ 3/2]. On the other hand, it suggests that there is no solution when b = 3 and, indeed, there is not: if (6a + 4)/(2a − 3) = 3, cross multiplying gives the false equation 6a + 4 = 6a − 9. Thus, 3 ∉ im f, and f is not a surjection.
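The computation in part (v) can be replayed with exact rational arithmetic. In this sketch the name preimage is hypothetical, and rational sample values of b (an assumption) stand in for arbitrary real numbers.

```python
from fractions import Fraction


def f(a):
    """The function of part (v); a must differ from 3/2."""
    return (6 * a + 4) / (2 * a - 3)


def preimage(b):
    """Solve a(6 - 2b) = -3b - 4 for a; return None when b = 3."""
    b = Fraction(b)
    if 6 - 2 * b == 0:
        return None
    return (-3 * b - 4) / (6 - 2 * b)


for b in [0, 1, -2, Fraction(1, 2), 10]:
    a = preimage(b)
    print(b, a, f(a) == b)

print(preimage(3))  # None: the value 3 is not in the image of f
```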
Instead of saying that the values of a function f are unique, one sometimes says that f is single-valued. For example, if ℝ^≥ denotes the set of nonnegative reals, then √: ℝ^≥ → ℝ^≥ is a function because we have agreed that √a ≥ 0 for every nonnegative number a. On the other hand, f(a) = ±√a is not single-valued, and hence it is not a function.
The simplest way to verify whether an alleged function f is single-valued is to phrase uniqueness of values as an implication:
if a = a′, then f(a) = f(a′).
[...] whenever a/b is in lowest terms, then g would be a function.
The formula f ab
The following definition gives another important property a function may have.
Definition. A function f: X → Y is injective (or one-to-one) if, whenever a and a′ are distinct elements of X, then f(a) ≠ f(a′). Equivalently (the contrapositive states that), f is injective if, for every pair a, a′ ∈ X, we have
f(a) = f(a′) implies a = a′.
The reader should note that being injective is the converse of being single-valued: f is single-valued if a = a′ implies f(a) = f(a′); f is injective if
f(a) = f(a′) implies a = a′.
Most functions are neither injective nor surjective. For example, the squaring function f: ℝ → ℝ, defined by f(x) = x^2, is neither.
Example 2.5.
(i) Identity functions 1X are injective.
(ii) Let f: ℝ − {3/2} → ℝ be defined by
f(a) = (6a + 4)/(2a − 3).
To check whether f is injective, suppose that f(a) = f(b):
(6a + 4)/(2a − 3) = (6b + 4)/(2b − 3).
Cross multiplying yields
12ab + 8b − 18a − 12 = 12ab + 8a − 18b − 12,
which simplifies to 26a = 26b and hence a = b. We conclude that f is injective. (We saw, in Example 2.4(v), that f is not surjective.)
(iii) Consider f: ℝ → ℝ given by f(x) = x^2 − 2x − 3. If we try to check whether f is an injection by looking at the consequences of f(a) = f(b), as in part (ii), we arrive at the equation a^2 − 2a = b^2 − 2b; it is not instantly clear whether this forces a = b. Instead, we seek the roots of f(x), which are 3 and −1. It follows that f is not injective, for f(3) = 0 = f(−1); thus, there are two distinct numbers having the same value.
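A search for two distinct points with the same value gives a quick experimental test of injectivity on a finite sample. In the sketch below, find_collision is a hypothetical helper and the sample points are assumptions; finding no collision on a sample is consistent with injectivity but does not prove it, whereas a single collision does disprove it.

```python
from fractions import Fraction


def find_collision(f, points):
    """Return a pair (a, b) of distinct sample points with f(a) = f(b), or None."""
    seen = {}  # maps each value already produced to the point that produced it
    for a in points:
        value = f(a)
        if value in seen:
            return (seen[value], a)
        seen[value] = a
    return None


def q(x):
    """The quadratic of part (iii)."""
    return x * x - 2 * x - 3


def r(a):
    """The rational function of part (ii); a must differ from 3/2."""
    return (6 * a + 4) / (2 * a - 3)


# Half-integers from -5 to 5, omitting 3/2, where r is undefined.
samples = [Fraction(n, 2) for n in range(-10, 11) if n != 3]

collision_q = find_collision(q, range(-5, 6))
collision_r = find_collision(r, samples)
print(collision_q, collision_r)
print(q(3), q(-1))
```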
Sometimes there is a way of combining two functions to form another function, their composite.
Definition. If f: X → Y and g: Y → Z are functions (the target of f is the domain of g), then their composite, denoted by g ◦ f, is the function X → Z given by
g ◦ f: x ↦ g(f(x));
that is, first evaluate f on x and then evaluate g on f(x).
Composition is thus a two-step process: x ↦ f(x) ↦ g(f(x)). For example, the function h: ℝ → ℝ, defined by h(x) = e^(cos x), is the composite g ◦ f, where f(x) = cos x and g(x) = e^x. This factorization is plain as soon as one tries to evaluate, say, h(π); one must first evaluate f(π) = cos π = −1 and then evaluate g(f(π)) = g(−1) = e^(−1) in order to evaluate h(π). The chain rule in calculus is a formula for computing the derivative (g ◦ f)′ in terms of g′ and f′:
(g ◦ f)′(x) = g′(f(x)) · f′(x).
If f: ℕ → ℕ and g: ℕ → ℝ are functions, then g ◦ f: ℕ → ℝ is defined, but f ◦ g is not defined [for target(g) = ℝ ≠ ℕ = domain(f)]. Even when f: X → Y and g: Y → X, so that both composites g ◦ f and f ◦ g are defined, these composites need not be equal. For example, define f, g: ℕ → ℕ by f: n ↦ n^2 and g: n ↦ 3n; then g ◦ f: 2 ↦ g(4) = 12 and f ◦ g: 2 ↦ f(6) = 36. Hence, g ◦ f ≠ f ◦ g.
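The last computation is easy to reproduce. In this sketch, compose is a hypothetical helper that builds g ◦ f from two Python callables.

```python
def compose(g, f):
    """Return the composite g o f, which sends x to g(f(x))."""
    return lambda x: g(f(x))


def f(n):
    return n * n  # f: n -> n^2


def g(n):
    return 3 * n  # g: n -> 3n


g_after_f = compose(g, f)
f_after_g = compose(f, g)

print(g_after_f(2))  # g(f(2)) = g(4)
print(f_after_g(2))  # f(g(2)) = f(6)
```

Since the two composites disagree at 2, they are different functions.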
Given a set X , let
ℱ(X) = {all functions X → X}.
The composite of two functions in ℱ(X) is always defined, and it is, again, a function in ℱ(X). As we have just seen, composition is not commutative; that is, f ◦ g and g ◦ f need not be equal. Let us now show that composition is always associative.
Lemma 2.6. Composition of functions is associative: if f: X → Y, g: Y → Z, and h: Z → W are functions, then
h ◦ (g ◦ f) = (h ◦ g) ◦ f.
Proof. We show that the value of either composite on an element x ∈ X is just w = h(g(f(x))). If x ∈ X, then
h ◦ (g ◦ f): x ↦ (g ◦ f)(x) = g(f(x)) ↦ h(g(f(x))) = w, and
(h ◦ g) ◦ f: x ↦ f(x) ↦ (h ◦ g)(f(x)) = h(g(f(x))) = w.
It follows from Proposition 2.2 that the composites are equal. •
In light of this lemma, we need not write parentheses: the notation h ◦ g ◦ f is unambiguous.
The next result implies that the identity function 1_X behaves for composition in ℱ(X) just as the number one does for multiplication of numbers.
Lemma 2.7. If f: X → Y, then 1_Y ◦ f = f = f ◦ 1_X.
Proof. If x ∈ X, then
1_Y ◦ f: x ↦ f(x) ↦ f(x) and
f ◦ 1_X: x ↦ x ↦ f(x). •
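Lemmas 2.6 and 2.7 can be checked exhaustively when X is small. The sketch below takes X = {0, 1, 2} (an assumption) and encodes a function X → X as a tuple t whose entry t[x] is the value at x; the encoding and the name compose are illustrative choices.

```python
from itertools import product

X = (0, 1, 2)

# Every function X -> X, encoded as a tuple of values: 3^3 = 27 in all.
functions = list(product(X, repeat=len(X)))


def compose(g, f):
    """Tuple encoding of g o f, which sends x to g(f(x))."""
    return tuple(g[f[x]] for x in X)


identity = X  # the tuple (0, 1, 2) sends each x to itself

associative = all(
    compose(h, compose(g, f)) == compose(compose(h, g), f)
    for f in functions for g in functions for h in functions
)
identity_law = all(
    compose(identity, f) == f == compose(f, identity) for f in functions
)
commutative = all(
    compose(g, f) == compose(f, g) for f in functions for g in functions
)

print(associative, identity_law, commutative)
```

The exhaustive check agrees with the two lemmas and with the earlier remark that composition is not commutative.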
Are there “reciprocals” in ℱ(X); that is, are there any functions f for which there is g ∈ ℱ(X) with f ◦ g = 1_X and g ◦ f = 1_X?
Definition. A function f: X → Y is bijective (or is a one-one correspondence) if it is both injective and surjective.
Example 2.8.
(i) Identity functions are always bijections.
(ii) Let X = {1, 2, 3} and define f : X → X by
f (1)= 2, f (2)= 3, f (3)= 1.
It is easy to see that f is a bijection.
We can draw a picture of a function in the special case when X and Y are finite sets. Let X = {1, 2, 3, 4, 5}, let Y = {a, b, c, d, e}, and define f : X → Y by
f (1)= b; f (2)= e; f (3)= a; f (4)= b; f (5)= c.
Figure 2.4 A Function (arrows from the elements 1, 2, 3, 4, 5 of X to the elements a, b, c, d, e of Y)
We see that f is not injective because f (1) = b = f (4), and f is not surjective because there is no x ∈ X with f (x) = d. Can one reverse the arrows to get a function g: Y → X? There are two reasons why one cannot. First, there is no arrow going to d, and so g(d) is not defined. Second, what is g(b)? Is it 1 or 4? The first problem is that the domain of g is not all of Y , and it arises because f is not surjective; the second problem is that g is not single-valued, and it arises because f is not injective (this reflects the fact that being single-valued is the converse of being injective). Therefore, neither problem arises when f is a bijection.
Definition. A function f: X → Y has an inverse if there exists a function g: Y → X with both composites g ◦ f and f ◦ g being identity functions.
We do not say that every function f has an inverse; on the contrary, we have just analyzed the reasons why most functions do not have an inverse. Notice that if an inverse function g does exist, then it “reverses the arrows” in Figure 2.4.
If f(a) = y, then there is an arrow from a to y. Now g ◦ f being the identity says that a = (g ◦ f)(a) = g(f(a)) = g(y); therefore g: y ↦ a, and so the picture of g is obtained from the picture of f by reversing arrows. If f twists something, then its inverse g untwists it.
Lemma 2.9. If f: X → Y and g: Y → X are functions such that g ◦ f = 1_X, then f is injective and g is surjective.
Proof. Suppose that f(a) = f(a′); apply g to obtain g(f(a)) = g(f(a′)); that is, a = a′ [because g(f(a)) = a], and so f is injective. If x ∈ X, then x = g(f(x)), so that x ∈ im g; hence g is surjective. •
Lemma 2.10. A function f : X → Y has an inverse g : Y → X if and only if it is a bijection.
Proof. If f has an inverse g, then g ◦ f = 1_X and f ◦ g = 1_Y. By Lemma 2.9, the first equation shows that f is injective, and the second (with the roles of f and g interchanged) shows that f is surjective.
Assume that f is a bijection. Let y ∈ Y . Since f is surjective, there is some a ∈ X with f (a) = y; since f is injective, this element a is unique.
Defining g(y) = a thus gives a (single-valued) function whose domain is Y [g merely “reverses arrows:” since f(a) = y, there is an arrow from a to y, and the reversed arrow goes from y to a]. It is plain that g is the inverse of f; that is, f(g(y)) = f(a) = y for all y ∈ Y and g(f(a)) = g(y) = a for all a ∈ X. •
Notation. The inverse of a bijection f is denoted by f^(−1) (Exercise 2.8 on page 101 says that a function cannot have two inverses). This is the same notation used for inverse trigonometric functions in calculus; e.g., sin^(−1) x = arcsin x satisfies sin(arcsin(x)) = x for −1 ≤ x ≤ 1 and arcsin(sin(x)) = x for −π/2 ≤ x ≤ π/2. Of course, sin^(−1) does not denote the reciprocal 1/sin x, which is csc x.
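The arrow-reversing construction in the proof of Lemma 2.10 can be sketched for finite sets. Below, a function X → Y is stored as a Python dict, and the hypothetical helper inverse returns the reversed dict, or None when one of the two obstructions discussed after Figure 2.4 occurs.

```python
def inverse(f, X, Y):
    """Reverse the arrows of f: X -> Y; return None if f is not a bijection."""
    g = {}
    for a in X:
        b = f[a]
        if b in g:        # two arrows arrive at b, so f is not injective
            return None
        g[b] = a
    if set(g) != set(Y):  # some y receives no arrow, so f is not surjective
        return None
    return g


# The bijection of Example 2.8(ii).
cycle = {1: 2, 2: 3, 3: 1}
cycle_inverse = inverse(cycle, {1, 2, 3}, {1, 2, 3})

# The function pictured in Figure 2.4, which is neither injective nor surjective.
fig = {1: "b", 2: "e", 3: "a", 4: "b", 5: "c"}
fig_inverse = inverse(fig, {1, 2, 3, 4, 5}, {"a", "b", "c", "d", "e"})

print(cycle_inverse)
print(fig_inverse)
```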
Example 2.11.
Here is an example of two functions f and g whose composite g◦ f is the identity but whose composite f ◦ g is not the identity; thus, f and g are not inverse functions.
Define f, g: ℕ → ℕ as follows:
f(n) = n + 1;
g(n) = 0 if n = 0, and g(n) = n − 1 if n ≥ 1.
The composite g ◦ f = 1_ℕ, for g(f(n)) = g(n + 1) = n (because n + 1 ≥ 1).
On the other hand, f ◦ g ≠ 1_ℕ because f(g(0)) = f(0) = 1 ≠ 0.
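This example can be checked directly on an initial segment of the natural numbers. The bound 100 in the sketch below is an arbitrary assumption made so that the check terminates.

```python
def f(n):
    """f(n) = n + 1."""
    return n + 1


def g(n):
    """g(0) = 0, and g(n) = n - 1 for n >= 1."""
    return 0 if n == 0 else n - 1


# g o f agrees with the identity on every sampled n.
g_after_f_is_identity = all(g(f(n)) == n for n in range(100))

# f o g fails to be the identity already at n = 0.
f_after_g_at_zero = f(g(0))

print(g_after_f_is_identity, f_after_g_at_zero)
```

In the language of Lemma 2.9, f is injective and g is surjective, but neither is a bijection.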
Example 2.12.
If a is a real number, then multiplication by a is the function µ_a: ℝ → ℝ defined by r ↦ ar for all r ∈ ℝ. If a ≠ 0, then µ_a is a bijection; its inverse function is division by a, namely, δ_a: ℝ → ℝ, defined by r ↦ (1/a)r; of course, δ_a = µ_(1/a). If a = 0, however, then µ_a = µ_0 is the constant function µ_0: r ↦ 0 for all r ∈ ℝ, which has no inverse function because it is not a bijection.
Two strategies are now available to determine whether a given function is a bijection: use the definitions of injective and surjective, or find an inverse. For example, if ℝ^+ denotes the positive real numbers, let us show that the