provides clean, operational data to user communities. The creation of information and analytical insight is entirely dependent on the users.
Judging whether the warehouse is a success is a subjective business. If we judge success on the ability to efficiently collect, integrate, and cleanse corporate data on a predictable basis, then yes, this warehouse is a success. On the other hand, if we look at the cultivation, nurturing, and exploitation of the information the organization as a whole enjoys, then the warehouse is a failure. A data warehouse that acts only as a passive repository provides little or no information value. Consequently, user communities are forced to fend for themselves, causing the creation of information silos.
This chapter presents a complete vision for rolling out an enterprisewide BI architecture. We start with an overview of BI and then move to discussions on planning and designing for information content, as opposed to simply providing data to user communities. We then turn to calculating the value of your BI efforts. We end by describing how IBM addresses the architectural requirements of BI for your organization.
Overview of the BI Organization Architecture
Powerful transaction-oriented information systems are now commonplace in every major industry, effectively leveling the playing field for corporations around the world. Remaining competitive, however, now requires analytically oriented systems that can revolutionize a company’s ability to rediscover and utilize information it already owns. These analytical systems derive insight from the wealth of data available, delivering information that’s conclusive, fact-based, and actionable.
Business intelligence can improve corporate performance in any information-intensive industry. Companies can enhance customer and supplier relationships, improve the profitability of products and services, create worthwhile new offerings, better manage risk, and pare expenses dramatically, among many other gains. Through business intelligence your company can finally begin using customer information as a competitive asset with applications such as target marketing, customer profiling, and product or service usage analysis. Having the right intelligence means having definitive answers to such key questions as:
■■ Which of our customers are most profitable, and how can we expand relationships with them?
■■ Which of our customers provide us profit, or cost us money?
■■ Where do our best customers live in relation to the stores/branches
■■ Which products and services can be cross-sold most effectively, and to whom?
■■ Which marketing campaigns have been most successful and why?
■■ Which sales channels are most effective for which products?
■■ How can we improve our customers’ overall experience?
Most companies have the raw data to answer these questions. Operational systems generate vast quantities of product, customer, and market data from point-of-sale, reservations, customer service, and technical support systems. The challenge is to extract and exploit this information. Many companies take advantage of only a small fraction of their data for strategic analysis. The remaining untapped data, often combined with data from external sources like government reports, trade associations, analysts, the Internet, and purchased information, is a gold mine waiting to be explored, refined, and shaped into informational content for your organization. This knowledge can be applied in a number of ways, ranging from charting overall corporate strategy to communicating personally with vendors, suppliers, and customers through call centers, kiosks, billing statements, the Internet, and other touch points that facilitate genuine, one-to-one marketing on an unprecedented scale.
Today’s business environment dictates that the data warehouse (DW) and related BI solutions evolve beyond the implementation of traditional data structures such as normalized atomic-level data and star/cube farms. What is now needed to remain competitive is a fusion of traditional and advanced technologies in an effort to support a broad analytical landscape, naturally serving up a rich blend of real-time and historical analytics. Finally, the overall environment must improve the knowledge of the enterprise as a whole, ensuring that actions taken as a result of analysis conducted are fed back into the environment for all to benefit.
For example, let’s say you classify your customers into categories of high to low risk. Whether this information is generated by a mining model or other means, it must be put into the warehouse and be made accessible to anyone, using any access tool, such as static reports, spreadsheet pivot tables, or online analytical processing (OLAP). Currently, however, much of this type of information remains with the individuals or departments who generate the analysis and act upon it, essentially creating information silos. The organization, as a whole, has little or no visibility into the insight. Only by blending this type of informational content into your enterprise warehouse can you eliminate information silos and elevate your warehouse environment and BI effort to a level called the business intelligence organization.
There are two major barriers to building a BI organization. First, we have the problem of the organization itself, its corporate culture, its discipline (or lack thereof) to rein in rogue executives, and its dedication to IT as a facilitator of the information asset. Although we cannot help with the political challenges of an organization, we can help you understand the components of a BI organization, its architecture, and how IBM technology facilitates its development. The second barrier to overcome is the lack of integrated technology and a conscious approach that addresses the entire BI space as opposed to just a small component. IBM is meeting the challenge of integrating technology. It is your responsibility to provide the conscious planning.
This architecture must be built with technology chosen for seamless integration, or at the very least, with technology that adheres to open standards. Moreover, your company management must ensure that enterprise business intelligence is implemented according to plan and must not allow the development of information silos that result from self-serving agendas or objectives. That is not to say that the BI environment is unresponsive to the individual needs and requirements of user communities; instead, it means that the implementation of those individual needs and requirements is done to the benefit of the entire BI organization.
An overview of the BI organization’s architecture can be found in Figure 1.1 on page 9. The architecture demonstrates a rich blend of technologies and techniques. From the traditional view, the architecture includes the following warehouse components:
Atomic layer. This is the foundation, the cornerstone of the entire data warehouse and therefore of strategic reporting. Data stored here preserves historical integrity and data relationships, includes derived metrics, and is cleansed, integrated, static, geocoded, and scored using mining models. All subsequent usage of this data and related information is derived from this structure. It is an excellent source for data mining and advanced structured query language (SQL) reporting, and it is the wellspring for data to be used in OLAP applications.
Operational data store (ODS) or reporting database. These are data structures specifically designed for tactical reporting. The data stored and reported on from these structures may ultimately be propagated into the warehouse via the staging area, where it could be used for strategic reporting.
Staging area. The first stop for most data destined for the warehouse environment is the staging area. Here data is integrated, cleansed, and transformed into useful content that will be populated in target data warehouse structures, specifically the atomic layer of the warehouse.
Data marts. This part of the architecture represents data structures used specifically for OLAP. Whether the data is stored in star schemas that superimpose multidimensional data in a relational environment or in proprietary data files used by specific OLAP technology, such as DB2 OLAP Server, is not relevant. The only constraint is that the architecture facilitates the use of multidimensional data.
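The flow among these components can be sketched in a few lines. The following is a minimal illustration, not IBM's implementation: it uses Python's built-in sqlite3 module in place of DB2, and the table names (stg_sales, atomic_sales, dim_product, fact_sales), the columns, and the cleansing rules are all assumptions made for the example.

```python
import sqlite3

def build_warehouse(raw_rows):
    """Move raw sales rows through staging -> atomic layer -> data mart.

    raw_rows: iterable of (customer, product, amount) tuples of text,
    as they might arrive from an operational system (hypothetical shape).
    """
    con = sqlite3.connect(":memory:")
    cur = con.cursor()

    # Staging area: land the incoming data exactly as received.
    cur.execute("CREATE TABLE stg_sales (customer TEXT, product TEXT, amount TEXT)")
    cur.executemany("INSERT INTO stg_sales VALUES (?, ?, ?)", raw_rows)

    # Atomic layer: cleansed, integrated detail rows.
    # Assumed cleansing rules: trim and upper-case names, drop rows with no amount.
    cur.execute("CREATE TABLE atomic_sales (customer TEXT, product TEXT, amount REAL)")
    cur.execute("""
        INSERT INTO atomic_sales
        SELECT UPPER(TRIM(customer)), UPPER(TRIM(product)), CAST(TRIM(amount) AS REAL)
        FROM stg_sales
        WHERE TRIM(amount) <> ''
    """)

    # Data mart: a tiny star schema (one dimension, one fact table)
    # derived entirely from the atomic layer.
    cur.execute("CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product TEXT UNIQUE)")
    cur.execute("INSERT INTO dim_product (product) SELECT DISTINCT product FROM atomic_sales")
    cur.execute("CREATE TABLE fact_sales (product_key INTEGER, total_amount REAL)")
    cur.execute("""
        INSERT INTO fact_sales
        SELECT d.product_key, SUM(a.amount)
        FROM atomic_sales a JOIN dim_product d ON d.product = a.product
        GROUP BY d.product_key
    """)
    con.commit()
    return con
```

The point of the sketch is the direction of dependency: the data mart is populated only from the atomic layer, so any correction made there flows through to every downstream structure.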
The architecture also incorporates critical technologies and techniques that are distinctively BI-centric, such as:
Spatial analysis. Space is an information windfall for the analyst and is critical to thorough decision making. Space can represent information about the people who live at a location, as well as information about where that location physically is in relation to the rest of the world. To perform this analysis, you must start by binding your address information to longitude and latitude coordinates. This is referred to as geocoding and must be part of the extraction, transformation, and loading (ETL) process at the atomic layer of your warehouse.
Data mining. Data mining permits companies to profile customers, predict sales trends, and enable customer relationship management (CRM), among other BI initiatives. Mining must therefore be integrated with the warehouse data structures and supported by warehouse processes to ensure both effective and efficient use of the technology and related techniques. As shown in the BI architecture, the atomic layer of the warehouse as well as data marts are excellent data sources for mining. Those same structures must also be recipients of mining results to ensure availability to the broadest audience.
Agents. There are various “agents” for examining customer touch points, the company’s operational systems, and the data warehouse itself. These agents may be advanced neural nets trained to spot trends, such as future product demand based on sales promotions, rules-based engines to react to a given set of circumstances, or even simple agents that report exceptions to top executives. These agent processes generally occur in real time and, therefore, they must be tightly coupled with the movement of the data itself.
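To make the geocoding step described under spatial analysis concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the Spatial Extender API: the lookup table, its addresses and coordinates, and the function names are hypothetical, and a real ETL process would call a geocoding service or reference data set instead.

```python
import math

# Hypothetical reference data: normalized address -> (latitude, longitude).
GEOCODE_TABLE = {
    "1 EXAMPLE ST, SAMPLETOWN": (40.0, -75.0),
    "2 DEMO AVE, SAMPLETOWN": (41.0, -75.0),
}

def normalize_address(address):
    """Upper-case the address and collapse repeated whitespace."""
    return " ".join(address.upper().split())

def geocode(record, table=GEOCODE_TABLE):
    """Return a copy of the record with latitude and longitude bound to it.

    Unmatched addresses get None coordinates rather than a guess.
    """
    lat, lon = table.get(normalize_address(record["address"]), (None, None))
    return {**record, "latitude": lat, "longitude": lon}

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers (mean Earth radius 6371 km)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))
```

Once every customer and store record carries coordinates, a question such as how far customers live from a given store reduces to calling haversine_km on pairs of geocoded records.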
All these data structures, technologies, and techniques guarantee that you will not create a BI organization overnight. This endeavor will be built incrementally, in small steps. Each step is an independent project effort and is referred to as an iteration in your overall warehouse or BI initiative. Iterations can include implementing new technologies, initiating new techniques, adding new data structures, loading additional data, or expanding the analysis to your environment. This topic is discussed in greater depth in Chapter 3.
In addition to the traditional warehouse structures and BI-centric tools, there are other aspects of your BI organization for which you must plan, such as:
Customer touch points. As in any modern organization, there exist a number of customer touch points through which to influence a positive experience for your customers. There are the traditional channels such as dealers, telephone operators, direct mail, multimedia, and print advertisement, as well as more contemporary channels such as email and the Web. Data produced at any touch point must be acquired, transported, cleansed, transformed, and then populated to target BI data structures.
Operational databases and user communities. At the opposite end from the customer touch points lie a firm’s application databases and user communities. Here resides traditional data that must be gathered and blended with data flowing in from the customer touch points in order to create the necessary informational content.
Analysts. The principal beneficiary of the BI environment is the analyst. It is this person who benefits from the timely extraction of operational data, integrated with disparate data sources, enhanced with features such as spatial analysis (geocoding), and presented in BI technology that affords mining, OLAP, advanced SQL reporting, and spatial analysis. The primary interface for the analyst to the reporting environment is the BI portal. However, the analyst is not the only one to benefit from the BI architecture. Executives, broad user communities, and even partners, suppliers, and customers can and should share in the benefits of enterprise BI.
Back-feed loop. By design, the BI architecture is a learning environment. A principal characteristic of the design is to allow the persistent data structures to be updated by the BI technology used and the user actions taken. An example is customer scoring. If the marketing department implements a mining model that scores customers as likely to use a new service, then the marketing department should not be the only group that benefits from that knowledge. Instead, the mining model should be implemented as a natural part of the data flow within the enterprise, and the customer scores should become an integrated part of the warehouse informational content, visible to all users.
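The customer scoring example can be sketched as follows. This is a minimal illustration only: Python's built-in sqlite3 module stands in for the warehouse, the customer table and its columns are assumed for the example, and score_customer is a hypothetical hand-written rule standing in for a real mining model such as one built with Intelligent Miner.

```python
import sqlite3

def make_customer_table(con, customers):
    """Create an assumed shared warehouse table; scores start out empty."""
    con.execute(
        "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, "
        "monthly_spend REAL, support_calls INTEGER, service_score REAL)"
    )
    con.executemany(
        "INSERT INTO customer (customer_id, monthly_spend, support_calls) VALUES (?, ?, ?)",
        customers,
    )

def score_customer(monthly_spend, support_calls):
    """Hypothetical stand-in for a mining model: likelihood of using a new service."""
    score = 0.5
    if monthly_spend >= 100:
        score += 0.3
    if support_calls >= 3:
        score -= 0.2
    return round(score, 2)

def back_feed_scores(con):
    """Write the model's scores back into the shared table for all users."""
    rows = con.execute(
        "SELECT customer_id, monthly_spend, support_calls FROM customer"
    ).fetchall()
    con.executemany(
        "UPDATE customer SET service_score = ? WHERE customer_id = ?",
        [(score_customer(spend, calls), cid) for cid, spend, calls in rows],
    )
    con.commit()
```

Because the scores land in the same table that reports, OLAP tools, and other departments already query, no one has to ask marketing for a private extract.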
IBM’s suite of BI-centric products, including DB2 UDB, DB2 OLAP Server, Intelligent Miner, and the Spatial Extender, encompasses the vast majority of the important technology components defined in Figure 1.1. We use the architecture shown in this figure throughout the book to give us a level of continuity and to demonstrate where each IBM product fits in the overall BI scheme.
Figure 1.1 The BI organization.
[Figure labels: customer touch points (Web, e-mail, multimedia, print, direct mail, in-store purchase); operations databases and user communities; third-party data; staging area; operational data store; meta data; geocoding; atomic level (normalized data); data marts (dimensional data); data mining; agent network (customer, operations, and DW agents); OLAP; advanced query and reporting; spatial analysis; BI dashboard and reporting portal; back-feed loop.]