Conflictos de intereses - adicciones vol. 32, nº 2 · 2020 índice

3. In Figure 3.6’s dialog box, enter a table name and type/paste the table defini-tion SQL.

4. Click OK.

a. If the SQL validates without error, the table is created.

b. If there is an error, it displays in the uI (user interface).

3.4 moDeling the Data sourCe

After creating the initial connections, the next step prior to actually creating hierar-chies is to determine whether or not modeling need be applied to the tables or files.

modeling, quite simply, is the creation of joins and the creation of mini-schema for the data source. Studio does not require that you create a mini-schema to build dimension-ality (Figure 3.7).

The only times you must have a mini-schema in place are when you are doing the following:

1. Creating hierarchies across multiple tables (only available for relational sources) 2. Loading relational data from within Studio

3. Creating a hierarchy from a single table/file using parent and child columns Just a brief word about item 2. It may sound strange to think of loading of data with-out using Studio. It is often assumed that because you are using Studio to model a data source and build a prototype you would also leverage Studio to load data. We will touch

Figure 3.6 User-defined table definition.

on this topic later in the chapter; however, in respect to data I often recommend the modeling of dimensionality with Studio, but loading data directly with the Data Prep Editor. to take that sentiment one step farther, I often recommend that you model your dimensionality with Studio, but perform the dimension build processes manually lever-aging the Load rules Studio generates.

At its core, Essbase Studio tool is a graphical Load rule generator whose output need not be a cube. In fact, there is an overt option to simply generate Load rules, which can then be used in standard batch process or leveraged manually to build the dimensional-ity and load the data as you see fit.

If you do end up creating a mini-schema, there are a number of options to consider when selecting and joining tables within the mini-schema. In general, you should only add those tables to the mini-schema that are necessary for the following purposes:

• to derive the required hierarchies (either cross-table or within table)

• to load data from the primary or alternate fact table (assuming data is being loaded via Studio)

Adding nonjoined tables simply clutters the user interface. Furthermore, adding additional (optional) tables to the join process unnecessarily complicates the SQL issued by Studio.

3.4.1 Mini-Schema versus View/Tables

From a prototype perspective, leveraging the mini-schema to handle to cross-table dimension builds and complex data loads is fine. however, for longer-term scalability, it is preferable to build tables or views directly in the relational source and minimize the required joins and cross-table builds within the mini-schema. Placing the tables or views directly in the source allows Studio to go after a single database object per dimension.

This has advantages from two perspectives. First, the SQL issued for dimension builds is far less complex. Second, the relational DBAs (database administrators) can write more efficient SQL to define the table or view. While you have the option of writing

Figure 3.7 Sample mini-schema.

custom SQL for data loads, Studio does not allow custom SQL for dimensional/hierar-chical objects. As such, the SQL optimization opportunities are limited.

3.4.1.1 Adding Joins and Adding Joins by Inspection When working with mini-schemas, you have the option of either creating joins manually from table to table or add views by inspection. In general, regardless of whether you are prototyping with Studio or lever-aging the tool for a production deployment, adding joins by inspection is not a recom-mended practice. Please note that Studio adds joins automatically when adding tables to a mini-schema. tables with key fields are joined where the column names match.

When adding joins by inspection, Studio tends to add both too many joins and joins in a less optimal direction. For example, Figure 3.8 shows optimal joins between dimen-sion tables and a fact table.

notice that there is a single join between the dimension tables and the fact table in this image. Further, the joins go from the fact table out to the dimension tables. Creating your joins from the fact table out allows Studio to generate more optimal SQL. If you must create a dimension from multiple tables (in more of a snowflake fashion), the joins are more optimal moving from the fact table out. The join should move from the fact table out to the most immediately connected dimension table and then out to the sec-ondary dimension table (as shown in Figure 3.9).

Figure 3.10 illustrates the same relational source after adding joins by inspection.

notice in this instance that the Studio engine pattern-matched field names and cre-ated all possible joins to ensure you can create dimensionality across tables and link everything back to the fact table. In this simple example, there are only a few additional joins added. however, the more complex the underlying structure, the more complex the join population becomes. While this may result in the same dimensionality and overall end product, the SQL generated will be significantly more complex and slower.

If you run into issues or inconsistent results during the dimension build or data load stages, having the extended set of joins makes this much more difficult to troubleshoot.

Figure 3.8 Fact table star schema joins.

Proper prototyping (not unlike proper production work) is about getting the job done as simply as possible. Simplicity is elegance.

to add Joins by Inspection:

1. open a mini-schema diagram.

2. right-click on the open space in the mini-schema area.

Figure 3.9 Fact table snowflake joins.

Figure 3.10 Snowflake joins through inspection.

3. Select Add joins by inspection …

In document adicciones vol. 32, nº 2 · 2020 índice · index (página 53-56)