from DSSResources.com

*********************************************************** 

                        DSS News
                  D. J. Power, Editor
          December 17, 2006 -- Vol. 7, No. 26

    A Free Bi-Weekly Publication of DSSResources.COM 
            approximately 1,900 Subscribers 

************************************************************

  Check the interview with Henry Morris "Decision support
     and analytic applications" at DSSResources.COM

************************************************************ 

Featured:

* Ask Dan: How can organizations keep data-driven DSS and 
data warehouses current with changes in source systems?
* DSS News Releases 

************************************************************ 

            Rule Developer Needed! Check
http://decisionautomation.com/ads/requestinformationform.html

************************************************************

Ask Dan!

How can organizations keep data-driven DSS and data warehouses
current with changes in source systems?

by Dan Power
Editor, DSSResources.com

In my Ask Dan column of October 22, 2006 (Vol. 7, No. 22), I answered
some of the challenging questions that Aileen MacKay had sent me. In
this Ask Dan!, I'll focus only on her system change questions. She
asks "What is the process companies are using to keep their Data
Warehouse up-to-date? Once the source system is changed, what is the
process for moving these changes into the Data Warehouse and DSS?
Once changes are made to the source system, how long does it usually
take to make the changes in the data warehouse and DSS? How long
should this process take if it is run well? Do you have examples of
state- of-the-art-processes that work really well where managers are
willing to share their process from the source system to DW to DSS?
Specific to the health care insurance industry or medicaid? Thank
you."

In most data-driven decision support environments, multiple internal
and external computerized systems provide source data that must be
extracted, transformed and then loaded (ETL) into a data
mart/warehouse (DSS data store). Routinely data is moved from source
systems to a decision support data store. This process can be and
usually is automated and routinized. But as businesses and
organizations change processes, products and services the data models
for operational systems change and because of changes in these source
systems, the design of data warehouses and data marts used by
data-driven DSS must also change. A design challenge occurs whenever
it becomes necessary to make any changes to the data model in the
source system. The data model defines what is stored in the source
system, how it is stored, and how items in the database are related
to each other. This source system data model impacts the data model
in the decision support data store.

Let's quickly look at some of the changes that can occur in
operational system data models. Products or services may be added,
product codes may change, fields may be added or removed, data types
may change, and data values may be altered. The process of data
transformation can deal with some of these changes, like simple
changes in product identifiers. As more significant changes occur in
source systems more changes must be made in the DSS data store. New
fields or tables may be needed. In some cases the DSS data store may
need to be transformed, manipulated or even radically redesigned.

Waymire and Sawtell (2006) note a data transformation can be one of a
number of "operations applied to your data before it is stored in the
destination. This can include the calculation of new values, the
concatenation of values, breaking of values like a name field into a
first name and last name field, as well as any type of data cleaning
you want to enforce." In general, data transformation involves
processes for selecting, targeting, converting, and mapping data so
that it may be used in a DSS data store. Data may be transformed
using predefined rules or specific formulas or techniques.

When change is contemplated in an operational source system, analysts
also need to examine the ramifications for systems using the data from
the operational system. Part of the analysis and design should involve
mapping the data elements between the data models. IDEALLY
implementing changes in a source system should coincide with
implementing changes in the DSS data store and the data-driven DSS. A
specific data-driven DSS may need to be changed to display new data
or query the revised data model.

Synchronizing changes between operational and external source systems
and DSS data stores insures the quality of the data for decision
support. Examining the mapping between the original and the new data
model helps insure the DSS data is consistent, timely, well
understood, and satisfies the needs of managers. The danger is that
the two models will not be consistent and that users will be
dissatisfied with the validity of the DSS data and the information
derived from that data.

In general, we must deal with both simple and complex
transformations. In some instances we can consolidate data. For
example, if we have two similar products and we eliminate one and
create a new shopkeeping unit (SKU) for the other, we may want to
consolidate the historical sales data for the two products. A more
difficult transformation is required if we further differentiate a
product or service. For example, we may have one academic major in a
department and we replace it with two more differentiated majors with
new codes. In this case major #23 may become majors #230 and #231. In
this situation, we may allocate historical enrollment data between
the two new majors. We may use an arbitrary allocation (fixed %) or a
more complex rule-based allocation. The easiest transformation
involves an addition or a replacement. For example, adding a new SKU
as a data value or substituting a new SKU for an old SKU. Time lags
in adjusting the operational data model and the DSS data store can
create many unnecessary problems for managers and IT staff in
reconciling results from diverse systems. 

Ralph Kimball identifies some "graceful modifications to existing
fact and dimension tables". These include: new dimensional
attributes, new types of measured facts, new dimensions, a dimension
becomes specific or granular, an explicit hierarchy is added relating
two dimensions, new dimensions are added from a new data source. Sadly
many changes that result from changes in source systems are complex,
messy and even awkward to make. In some situations, it seems easier
to "start over" and not try to bring historical data forward to the
revised data model and resulting DSS data store. This "blank slate"
approach should be a last resort decision.

Part of the administration of a DW/DSS data store should involve
governance of the data model and those staff members who are
maintaining the integrity of the decision support system should be
monitoring potential changes in source systems. Also, a process for
change management should be established as part of the ongoing
decision support/BI administration. 

Data-driven DSS fail when source data is missing, inconsistent, or
erroneous. Current, accurate data models are essential. Without them,
data-driven decisions cannot be made with any confidence. At TDWI
(tdwi.org), a research article on data transformation (1999) suggests
both the magnitude of the problem and one technology solution: "as we
embrace a greater number of sources and targets, we centralize all
data transformation for the entire enterprise so that, effectively,
the tool becomes the cornerstone of the corporate decision-making
apparatus. This means a standard data transformation function, like
data tracing, can be applied across the entire enterprise. This means
aggregations and data analysis apply to all data, not just what is
stored in a recognizable database. In all cases, data transformation
has one goal: to guarantee accurate, clean information throughout the
enterprise. To prevent garbage in, garbage out."

The data warehouse/data-driven DSS team must continually monitor what
data should go into the data store and where those particular pieces
of information can be found. Some of the data will be internal to an
organization and hence change is under management control. Source
data extraction, selection, and transformation processes must be
continually evaluated. According to a data warehousing introduction
in the Minnesota State Archives, "Source data analysis and the
efficient and accurate movement of source data into the warehouse
environment are critical to the success of a data warehouse project.
Good metadata is essential to the effective operation of a data
warehouse and it is used in data acquisition/collection, data
transformation, and data access. Acquisition metadata maps the
translation of information from the operational system to the
analytical system. This includes an extract history describing data
origins, updates, algorithms used to summarize data, and frequency of
extractions from operational systems. Transformation metadata includes
a history of data transformations, changes in names, and other
physical characteristics." Part of any change process is updating,
revising and expanding the metadata for the DSS data store. The
metadata helps users understand how the change in a source system
impacted the historical record that is stored in the DSS data
store.

Let's finish Aileen's questions. How long should this process take if
it is run well? In general, updating the DSS data store should be less
time consuming than making changes in source systems. The length of
the change process is a function of the magnitude and complexity of
the change in the source system. At some point the change may become
so great that the historical record is no longer relevant to the
"new" data model. Do you have examples of state-of-the-art-processes
that work really well? NO, perhaps some readers or vendors will
provide a case study for DSSResources.com. For additional
information, I suggest reading some of the other Ask Dan questions on
data warehouses and data-driven DSS and then check and search the
following web sites:

1. Business Intelligence Network 
http://www.b-eye-network.com/home/

B-eye-network delivers industry-based content hosted by domain
experts and industry leaders. The Business Intelligence Network
includes horizontal technology coverage from the most respected
thought leaders in Business Intelligence, Business Performance
Management, Data Warehousing and Data Quality. 

2. Data Warehousing Information Center 
http://www.dwinfocenter.org/ 

This site aims to help visitors learn about data warehousing and
decision support systems by providing links to vendors of tools for
end-user data retrieval and analysis, tools for building and
maintaining system infrastructure, and tools directed at specific
industries and functions.

3. The Data Warehousing Institute
http://www.tdwi.org 

The Data Warehousing Institute (TDWI) is a member-based organization
whose goal is to educate decision-makers and information
professionals on data warehousing strategies and technologies. This
site serves as a clearinghouse for case studies, white papers, and data
warehousing events and conferences worldwide. Membership includes a
subscription to the quarterly Business Intelligence Journal. 

4. DM Review 
http://www.dmreview.com/ 

The DM Review web site offers content from DM Review magazine as well
as online-only articles, resource lists, discussion forums, issue
archives, etc. DM Review focuses on data warehousing and related
topics such as metadata management. 

5. Inmon Associates, Inc. 
http://www.inmongif.com 

This is the official web site produced by Bill Inmon, one of the
leaders in data warehousing. This web site focuses on Inmon's
framework for government information systems, the Government
Information Factory (GIF).

6. Ralph Kimball Associates 
http://www.rkimball.com/ 

Ralph Kimball is a leading proponent of the dimensional approach to
designing large data warehouses. He teaches data warehousing design
skills and helps selected clients with specific data warehouse
designs. 

As always your comments, suggestions, examples, etc. are appreciated.

References

Kimball, R., "Design Tip #29 Graceful Modifications To Existing Fact
and Dimension Tables," 2001, 
URL http://www.rkimball.com/html/designtipsPDF/DesignTips2001/
KimballDT29Graceful.pdf .

Minnesota State Archives, "Data Warehouse Description," URL
http://www.mnhs.org/preserve/records/dwintro.html.

Power, D., Does data modeling differ for a Data-Driven DSS? DSS News,
Vol. 2, No. 17, August 12, 2001.

Power, D., Is a Data Warehouse a DSS? What is a star schema? How does
a snowflake schema differ from a star schema? DSS News, Vol. 3, No. 4,
February 17, 2002.

Power, D., What is ETL software and how is it related to DSS? DSS
News, Vol. 3, No. 16, August 4, 2002.

Power, D., What do I need to know about Data Warehousing/OLAP? DSS
News, Vol. 4, No. 5, March 02, 2003.

Power, D.J., "What is the process for designing decision support
software?" DSS News, Vol. 7, No. 17, August 13, 2006.

Power, D.J., "What staffing is needed for DW/BI/DSS?" DSS News, Vol.
7, No. 22, October 22, 2006.

Schwartz, J., "Complex Data Transformation," Line 56, Tuesday, March
14, 2006, URL
http://www.line56.com/articles/default.asp?ArticleID=7412&pg=3&topicID=4 .

TDWI, "Beyond the Data Mart: A New Look at Data Transformation," What
Works: Volume 8, November 1999,
URL http://www.tdwi.org/research/display.aspx?ID=5385 .

Waymire, Richard and Rick Sawtell, Sams Teach Yourself Microsoft SQL
Server 7 in 21 Days, 2006, URL URL http://www.samspublishing.com .

************************************************************

DSS News Releases - December 3 to December 17, 2006
Read them at DSSResources.COM and search the DSS News Archive

12/15/2006 Sun Microsystems helps retailers manage busy holiday
season with secure SOA technology solutions. 

12/14/2006 MapInfo® Location Intelligence keeps James River
Insurance one step ahead of storms.

12/13/2006 Teradata Warehouse Miner 5.0 to optimize advanced
analytics.

12/13/2006 Oracle delivers industry's broadest selection of
integrated, embedded databases.

12/13/2006 Oracle ranked #1 in retail by leading industry magazine.

12/13/2006 Microsoft Robotics Studio now available to provide common
development platform.

12/12/2006 BearingPoint 'goes to bat' for Sport Chalet to improve
business processes in over 40 retail stores.

12/12/2006 Applix secures new TM1 customers with 64-bit technology.

12/12/2006 Intalio to release version of its business process
management software under open source license.

12/11/2006 EDS' eight tech trends: What CXOs need to know for a happy
IT in 2007.

12/11/2006 Appian Enterprise BPM Suite selected by Archstone-Smith to
automate mission critical business processes.

12/11/2006 Polycom announces IP Multimedia subsystem strategy.

12/11/2006 The Principal Financial Group introduces new tools, making
informed healthcare choices easier.

12/11/2006 SAS BI® to bolster economic development.

12/11/2006 University of Pittsburgh Health Sciences group manages
research and laboratory space allocations using JReport.

12/07/2006 Naval Research Laboratory selects SGI Systems for large
data retrieval capability technology demonstration.

12/06/2006 Website's lofty goal: data dashboard for a democratic
society.

12/06/2006 Fair Isaac's new ScoreNet Borrower Portal meets critical
need for construction lending industry.

12/06/2006 SAS® Learning edition brings power of data analytics
to wider business, academic audiences.

12/06/2006 Oracle announces general availability of Oracle(R)
Enterprise Single Sign-On Suite.

12/06/2006 Cisco selects India as site for the Cisco Globalization
Center.

12/05/2006 Cognos positioned in 'Leaders' Quadrant in 2006 analyst
report on Corporate Performance Management Suites.

12/05/2006 First segment of new Internet2 network goes live, NYSERNet
becomes first connector.

12/05/2006 ProModel Corporation’s simulation software used by
the 2006 RIST prize winning Center for Army Analysis Army Forces
Generation Modeling Simulation study team.

12/05/2006 Plan4Demand introduces Demand Curve Analyzer(TM) supply
chain tool.

12/05/2006 ILOG JViews creates enhanced displays for supervision and
monitoring applications.

12/05/2006 Integrated performance management with PerformancePoint
Server 2007.

12/04/2006 Stottler Henke launches SimVentive Toolkit for building
"serious games".

12/04/2006 Launch of Thomson Clinical Xpert delivers essential
medical information to physicians via PDAs.

12/04/2006 'Laptop lag' constrains mobile professionals.

12/04/2006 Hyperion enhances global partner program to meet demand
for industry-specific BPM solutions.

************************************************************ 
 
    Please tell your DSS friends about DSSResources.COM

************************************************************

DSS News is copyrighted (c) 2006 by D. J. Power. Please send your 
questions to daniel.power@dssresources.com

DSS Home |  About Us |  Contact Us |  Site Index |  Subscribe | What's New
Please Tell 
Your Friends about DSSResources.COM Copyright © 1995-2021 by D. J. Power (see his home page). DSSResources.COMsm was maintained by Daniel J. Power. See disclaimer and privacy statement.