Let's dive right in. First, drag a tMysqlInput component from the palette to your canvas. Set its database connection to the sakila connection from the repository. For the query, we'll use this code (from lesson 5):
OBSERVE:
SELECT
    c.customer_id, c.first_name, c.last_name, c.email,
    a.address, a.address2, a.district,
    ci.city, co.country, postal_code,
    a.phone, c.active, c.create_date
FROM customer c
JOIN address a ON (c.address_id = a.address_id)
JOIN city ci ON (a.city_id = ci.city_id)
JOIN country co ON (ci.country_id = co.country_id)
WHERE customer_id > 10;
Run your query to make sure you typed it in correctly. Then click on the button next to Edit Schema, and enter these columns:
Column        Type     Nullable   Length
customer_id   int
first_name    String              45
last_name     String              45
email         String   Yes        50
address       String              50
address2      String   Yes        50
city          String              50
country       String              50
postal_code   String              10
phone         String              20
active        int
create_date   Date
Note
If your dimCustomer table is slightly different, you'll need to change your schema. For example, if your table allows NULLs for email, change the schema to allow nulls in that column.
You might wonder why we aren't using the Guess Schema button to have TOS figure out the schema for us. It seems like that would be fast and relatively easy. But in practice, the Guess Schema option works well only with simple schemas and data types. That's because TOS examines the data that the database returns from the query to decide on data types and lengths, and ignores the underlying data types set in the database tables.
That might be okay for some queries, but it doesn't work for our dimCustomer query. In this query, the address2 column currently has only NULL values, and TOS can't determine a data type when there's no data. For other columns, like postal_code, TOS sees only values like 90210, so it guesses that the data type is Integer. We know that other parts of the world have different postal code formats ("SW1A 0AA" is a valid postal code in the United Kingdom), so our dimCustomer table uses varchar as its data type, not integer. It's fine to use Guess Schema as a starting point, but you still need to verify manually that the columns TOS picks have the correct data type, nullability, and length.
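To make the problem concrete, here is a minimal Python sketch of guessing a column type from returned values alone. It is not how TOS implements Guess Schema; the guess_type function and the sample values are hypothetical and exist only to show why an all-NULL column and US-only postal codes lead to the wrong schema.

```python
from typing import Optional, Sequence


def guess_type(values: Sequence[Optional[str]]) -> Optional[str]:
    """Guess a column type from sample values alone (hypothetical heuristic)."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return None  # no data at all, so there is nothing to base a guess on
    if all(v.isdigit() for v in non_null):
        return "Integer"  # every sampled value looks numeric
    return "String"


# Made-up samples, shaped like what the query might return.
address2_sample = [None, None, None]
postal_sample = ["90210", "10001", "60614"]

print(guess_type(address2_sample))               # None
print(guess_type(postal_sample))                 # Integer
print(guess_type(postal_sample + ["SW1A 0AA"]))  # String
```

The guess flips to String only once a non-numeric postal code happens to appear in the data, which is why the column types should come from the table definition rather than from the sample.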
With the input out of the way, we are free to move on to the mapping. Drag a tMap component to the canvas, and link the main row of the previous tMysqlInput component to tMap. Just like before, add a new output, and add a column called run_id (type: integer, expression: context.run_id) to the output. Link every input column to the output.
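Conceptually, this mapping passes every input column through unchanged and stamps each row with the current run's identifier. A rough Python sketch of that idea follows; add_run_id and the sample row are hypothetical and are not code that TOS generates.

```python
def add_run_id(rows, run_id):
    """Copy every input column to the output and append a run_id column."""
    return [{**row, "run_id": run_id} for row in rows]


# Made-up input row and run identifier, for illustration only.
source_rows = [{"customer_id": 11, "first_name": "Ana", "last_name": "Lopez"}]
print(add_run_id(source_rows, 42))
```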
The last step for dimCustomer is to add a tMysqlSCD component to the canvas. Link the output of tMap to the input of tMysqlSCD, and allow TOS to take the schema from the input component. If TOS does not ask whether you want to use the input schema, click on Sync Columns.
Set the connection on tMysqlSCD to the data warehouse connection in the repository, and specify dimCustomer for the table. Once that's done, click on the button next to SCD Editor.
In this new window you'll tell tMysqlSCD how to handle every column in the data flow. To set up the component, drag a column from the Unused section to a different section. We'll start with Source Keys. Our source key is a single column: customer_id. Drag that column to the Source Keys section. When you're done, your screen will look like this:
Next, we'll set up our surrogate keys. Our surrogate key does not exist in the data flow. Instead, it will be created by auto increment in MySQL. Name the surrogate key customer_key, and set its creation to Auto increment. When you're finished, that section will look like this:
Now we'll specify our Type 0 columns. Since run_id is included only for auditing and logging, it should never be considered part of the dimension. This value changes at each run, so we never want to track changes on it. Drag run_id to the Type 0 fields section. That section will now look like this:
We are not particularly interested in tracking changes to our customers' names. And if a customer's create_date changes, it probably means the source system had an error and the current record is being fixed, so we don't want to track changes on that either. Drag first_name, last_name and create_date to the Type 1 fields section. That section will now look like this:
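As a separate illustration of how these two sections differ, the following Python sketch shows the assumed behavior when a source record changes: Type 1 columns are overwritten in place with no history kept, while Type 0 columns are never compared, so a new run_id on its own changes nothing. The apply_type0_type1 function and the sample rows are hypothetical; this is not the logic tMysqlSCD actually executes.

```python
TYPE0 = {"run_id"}  # ignored when detecting changes
TYPE1 = {"first_name", "last_name", "create_date"}  # overwritten in place


def apply_type0_type1(stored: dict, incoming: dict) -> dict:
    """Return the stored row after a load, leaving Type 2 handling aside."""
    updated = dict(stored)
    for column in TYPE1:
        if incoming[column] != stored[column]:
            updated[column] = incoming[column]  # overwrite, no history kept
    # Type 0 columns are skipped entirely, so the stored run_id is untouched.
    return updated


# Made-up rows: the first name is corrected and the job runs a second time.
stored = {"customer_id": 11, "first_name": "Ana", "last_name": "Lopez",
          "create_date": "2006-02-14", "run_id": 1}
incoming = {**stored, "first_name": "Anna", "run_id": 2}
print(apply_type0_type1(stored, incoming))
```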
Let's check out Type 2 changes now. Type 2 changes require additional configuration because there are different ways to track history within the same table (if you'd like to review Type 2 changes, refer back to lesson 3).
Drag the remaining columns from Unused to the Type 2 fields section. Then we need to tell TOS how we are keeping history. In this dimension, we are using a start and end date, but we are not using a version number column or an active flag. Rename the start column start_date, and set its creation to Job start time. Rename the end column end_date, change its creation to Fixed year value, and set its complement to 2099. After you have made these changes, the Type 2 fields section will look like this:
history columns.
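The start and end date approach configured above can be sketched in Python as follows. This is an illustration under stated assumptions, not the component's implementation: apply_type2 is a hypothetical helper, only three of the Type 2 columns are compared for brevity, and the open end date is assumed to be January 1, 2099.

```python
from datetime import datetime

OPEN_END = datetime(2099, 1, 1)       # assumed form of the fixed year value
TYPE2 = ("email", "address", "city")  # subset of the Type 2 columns


def apply_type2(history, incoming, job_start):
    """Close the current version and append a new one if a Type 2 column changed."""
    current = next(
        (row for row in history
         if row["customer_id"] == incoming["customer_id"]
         and row["end_date"] == OPEN_END),
        None,
    )
    if current is not None and all(current[c] == incoming[c] for c in TYPE2):
        return history  # nothing changed, so the current version stays open
    if current is not None:
        current["end_date"] = job_start  # close the old version
    history.append({**incoming, "start_date": job_start, "end_date": OPEN_END})
    return history


# Made-up customer who moves to another city between two job runs.
history = []
row = {"customer_id": 11, "email": "ana@example.com",
       "address": "1 Main St", "city": "Leeds"}
apply_type2(history, row, datetime(2024, 1, 1, 2, 0))
apply_type2(history, {**row, "city": "York"}, datetime(2024, 1, 2, 2, 0))
for version in history:
    print(version["city"], version["start_date"], version["end_date"])
```

With this layout, the current version of a customer is always the row whose end_date still carries the 2099 value, which is why no separate active flag or version number is needed.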
This dimension doesn't have any Type 3 fields, so it's left blank:
Click "OK" to close the SCD component editor, then save your changes. When you're done, your dimCustomer subjob should look like this: