The objective of data warehousing is to analyze data from diverse sources to support decision making. To achieve this goal, we face two challenges:
- Poor system performance. A data warehouse usually contains a large volume of data. It is not an easy job to retrieve data quickly from the data warehouse for analysis purposes. For this reason, the data warehouse design uses a special technique called a star schema.
- Difficulties in extracting, transferring, transforming, and loading (ETTL) data from diverse sources into a data warehouse. Data must be cleansed before being used. ETTL has been frequently cited as being responsible for the failures of many data warehousing projects. You would feel the pain if you had ever tried to analyze SAP R/3 data without using SAP BW.
SAP R/3 is an ERP (Enterprise Resources Planning) system that most large companies in the world use to manage their business transactions. Before the introduction of SAP BW in 1997, ETTL of SAP R/3 data into a data warehouse seemed an unthinkable task. This macro-environment explained the urgency with which SAP R/3 customers sought a data warehousing solution. The result is SAP BW from SAP, the developer of SAP R/3.
In this chapter we will introduce the basic concept of data warehousing. We will also discuss what SAP BW (Business Information Warehouse) is, explain why we need it, examine its architecture, and define Business Content.
First, we use sales analysis as an example to introduce the basic concept of data warehousing.
1.1. Sales Analysis—A Business Scenario
Suppose that you are a sales manager, who is responsible for planning and implementing sales strategy. Your tasks include the following:
- Monitoring and forecasting sales demands and pricing trends
- Managing sales objectives and coordinating the sales force and distributors
- Reviewing the sales activities of each representative, office, and region
Suppose also that you have the data in Tables 1.1 through 1.3 available about your firm's materials, customers, and sales organization.
Material Number | Material Name | Material Description |
---|---|---|
MAT001 | TEA | Ice tea |
MAT002 | COFFEE | Hot coffee |
MAT003 | COOKIE | Fortune cookie |
MAT004 | DESK | Computer desk |
MAT005 | TABLE | Dining table |
MAT006 | CHAIR | Leather chair |
MAT007 | BENCH | Wood bench |
MAT008 | PEN | Black pen |
MAT009 | PAPER | White paper |
MAT010 | CORN | America corn |
MAT011 | RICE | Asia rice |
MAT012 | APPLE | New York apple |
MAT013 | GRAPEFRUIT | Florida grapefruit |
MAT014 | PEACH | Washington peach |
MAT015 | ORANGE | California orange |
Customer ID | Customer Name | Customer Address |
---|---|---|
CUST001 | Reliable Transportation Company | 1 Transport Drive, Atlanta, GA 23002 |
CUST002 | Finance One Corp | 2 Finance Avenue, New York, NY, 10001 |
CUST003 | Cool Book Publishers | 3 Book Street, Boston, MA 02110 |
CUST004 | However Forever Energy, Inc. | 4 Energy Park, Houston, TX 35004 |
CUST005 | Easy Computing Company | 5 Computer Way, Dallas, TX 36543 |
CUST006 | United Suppliers, Inc. | 6 Suppliers Street, Chicago, IL 61114 |
CUST007 | Mobile Communications, Inc. | 7 Electronics District, Chicago, IL 62643 |
CUST008 | Sports Motor Company | 8 Motor Drive, Detroit, MI 55953 |
CUST009 | Swan Stores | 9 Riverside Road, Denver, CO 45692 |
CUST010 | Hollywood Studio | 10 Media Drive, Los Angeles, CA 78543 |
CUST011 | One Source Technologies, Inc. | 11 Technology Way, San Francisco, CA 73285 |
CUST012 | Airspace Industries, Inc. | 12 Air Lane, Seattle, WA 83476 |
Sales Region | Sales Office | Sales Representative | Sales Representative ID |
---|---|---|---|
EAST | ATLANTA | John | SREP01 |
NEW YORK | Steve | SREP02 | |
Mary | SREP03 | ||
MIDWEST | DALLAS | Michael | SREP04 |
Lisa | SREP05 | ||
CHICAGO | Kevin | SREP06 | |
Chris | SREP07 | ||
WEST | DENVER[*] | Sam | SREP08 |
LOS ANGELES | Eugene | SREP09 | |
SEATTLE | Mark | SREP10 |
[*] Prior to January 1, 2000, the Denver office was in the Midwest region.
You also have three years of sales data, as shown in Table 1.4.
Customer ID | Sales Representative ID | Material Number | Per Unit Sales Price | Unit of Measure | Quantity Sold | Transaction Date |
---|---|---|---|---|---|---|
CUST001 | SREP01 | MAT001 | 2 | Case | 1 | 19980304 |
CUST002 | SREP02 | MAT002 | 2 | Case | 2 | 19990526 |
CUST002 | SREP02 | MAT003 | 5 | Case | 3 | 19990730 |
CUST003 | SREP03 | MAT003 | 5 | Case | 4 | 20000101 |
CUST004 | SREP04 | MAT004 | 50 | Each | 5 | 19991023 |
CUST004 | SREP04 | MAT005 | 100 | Each | 6 | 19980904 |
CUST004 | SREP04 | MAT005 | 100 | Each | 7 | 19980529 |
CUST005 | SREP05 | MAT006 | 200 | Each | 8 | 19991108 |
CUST006 | SREP06 | MAT007 | 20 | Each | 9 | 20000408 |
CUST007 | SREP07 | MAT008 | 3 | Dozen | 10 | 20000901 |
CUST007 | SREP07 | MAT008 | 3 | Dozen | 1 | 19990424 |
CUST008 | SREP08 | MAT008 | 3 | Dozen | 2 | 19980328 |
CUST008 | SREP08 | MAT009 | 2 | Case | 3 | 19980203 |
CUST008 | SREP08 | MAT010 | 1 | U.S. pound | 4 | 19991104 |
CUST009 | SREP09 | MAT011 | 1.5 | U.S. pound | 5 | 20000407 |
CUST010 | SREP10 | MAT011 | 1.5 | U.S. pound | 6 | 20000701 |
CUST010 | SREP10 | MAT011 | 1.5 | U.S. pound | 7 | 19990924 |
CUST010 | SREP10 | MAT012 | 2 | U.S. pound | 8 | 19991224 |
CUST010 | SREP10 | MAT013 | 3 | Case | 9 | 20000308 |
CUST011 | SREP10 | MAT014 | 1 | U.S. pound | 10 | 19980627 |
CUST012 | SREP11 | MAT014 | 2 | U.S. pound | 1 | 19991209 |
CUST012 | SREP11 | MAT015 | 3 | Case | 2 | 19980221 |
CUST012 | SREP11 | MAT015 | 2 | Case | 3 | 20000705 |
CUST012 | SREP11 | MAT015 | 3.5 | Case | 4 | 20001225 |
The data in these tables represent a simplified business scenario. In the real world, you might have years of data and millions of records.
To succeed in the face of fierce market competition, you need to have a complete and up-to-date picture of your business and your business environment. The challenge lies in making the best use of data in decision support. In decision support, you need to perform many kinds of analysis.
This type of online analytical processing (OLAP) consumes a lot of computer resources because of the size of data. It cannot be carried out on anonline transaction processing (OLTP) system, such as a sales management system. Instead, we need a dedicated system, which is the data warehouse.