Data Mining and Creating Knowledge
In the 1970s, companies employed business analysts who used statistical packages like SAS and SPSS to perform trend analyses and cluster analyses on data. As it became possible and affordable to store large amounts of data, managers wanted to access and analyze transaction data such as that generated at a retail store cash register. Bar coding and the World Wide Web have also made it possible for companies to collect large amounts of new data.
Database marketing has also benefited from mining data. The information used in the database marketing process is the historical database of previous mailings and the features associated with the (potential) customers, such as age, zip code, and past responses. Data mining software uses this information to build a model of customer behavior that can be used to predict which customers are most likely to respond to a new catalog. Using this model, a marketing manager can target the customers likely to respond (cf. Thearling, http://www3.shore.net/~kht/index.htm).
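As a rough illustration of the idea, the following Python sketch estimates response rates from a small mailing history and ranks new prospects by those rates. The records, the field names (age_band, zip_code), and the functions fit_response_rates and score are all hypothetical, and a segment response rate is only one very simple stand-in for the models that commercial data mining software builds.

```python
from collections import defaultdict

# Hypothetical mailing history: (age_band, zip_code, responded)
history = [
    ("18-34", "02139", True),
    ("18-34", "02139", False),
    ("18-34", "02139", True),
    ("35-54", "02139", False),
    ("35-54", "02139", False),
    ("35-54", "10001", True),
    ("55+", "10001", False),
    ("55+", "10001", False),
]

def fit_response_rates(rows):
    """Estimate the past response rate for each (age_band, zip_code) segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responses, mailings]
    for age_band, zip_code, responded in rows:
        counts[(age_band, zip_code)][0] += int(responded)
        counts[(age_band, zip_code)][1] += 1
    return {segment: r / n for segment, (r, n) in counts.items()}

def score(model, prospect, default=0.0):
    """Look up the estimated response rate for a prospect's segment."""
    return model.get((prospect["age_band"], prospect["zip_code"]), default)

model = fit_response_rates(history)

# Hypothetical prospects for the new catalog
prospects = [
    {"name": "A", "age_band": "18-34", "zip_code": "02139"},
    {"name": "B", "age_band": "55+", "zip_code": "10001"},
    {"name": "C", "age_band": "35-54", "zip_code": "10001"},
]

# Mail first to the prospects whose segment responded most often in the past
ranked = sorted(prospects, key=lambda p: score(model, p), reverse=True)
for p in ranked:
    print(p["name"], round(score(model, p), 2))
```

With so few records per segment the estimated rates are unreliable; a real application would need far more history and a more robust modeling method.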
For many years companies had statisticians study company data. When a statistician looks at the data, he or she forms a hypothesis about a relationship, then performs a query on a database and uses statistical techniques to test whether the data support the hypothesis. This has been called the "verification mode" (IBM, 1998). Data mining software works in a "discovery mode": it looks for patterns in the data without any hypothesis being established beforehand.
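The contrast between the two modes can be sketched in a few lines of Python. In this sketch, verify checks one relationship that the analyst names in advance, while discover scans every attribute and value and ranks them by how far their response rate departs from the overall rate. The records, attribute names, and the use of lift (group rate divided by overall rate) as the measure of a pattern are assumptions made for illustration, and comparing two rates is not a substitute for a proper statistical test.

```python
# Hypothetical customer records
records = [
    {"region": "north", "channel": "mail", "responded": True},
    {"region": "north", "channel": "web", "responded": True},
    {"region": "north", "channel": "mail", "responded": False},
    {"region": "south", "channel": "web", "responded": True},
    {"region": "south", "channel": "mail", "responded": False},
    {"region": "south", "channel": "mail", "responded": False},
]

def rate(rows, target="responded"):
    """Fraction of rows where the target is true (0.0 for an empty group)."""
    return sum(r[target] for r in rows) / len(rows) if rows else 0.0

def verify(rows, attr, value, target="responded"):
    """Verification mode: compare the rate inside and outside one group
    that the analyst has chosen beforehand."""
    inside = [r for r in rows if r[attr] == value]
    outside = [r for r in rows if r[attr] != value]
    return rate(inside, target), rate(outside, target)

def discover(rows, target="responded"):
    """Discovery mode: score every attribute/value pair by lift,
    with no hypothesis supplied by the analyst."""
    base = rate(rows, target)
    if base == 0:
        return []
    pairs = {(a, r[a]) for r in rows for a in r if a != target}
    lifts = []
    for attr, value in pairs:
        group = [r for r in rows if r[attr] == value]
        lifts.append((attr, value, rate(group, target) / base))
    return sorted(lifts, key=lambda t: t[2], reverse=True)

# The analyst suspects that northern customers respond more often
print(verify(records, "region", "north"))

# The software reports whichever groups stand out most
for attr, value, lift in discover(records):
    print(attr, value, round(lift, 2))
```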
There are two main kinds of models in data mining: predictive and descriptive. Predictive models can be used to forecast explicit values, based on patterns determined from known results. For example, from a database of customers who have already responded to a particular offer, a model can be built that predicts which prospects are likeliest to respond to the same offer. The predictive model is then used in a decision support system (DSS). Descriptive models describe patterns in existing data, and are generally used to create meaningful subgroups such as demographic clusters. Once a descriptive model is identified, it may be used for target marketing or other decision support tasks.
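To make the descriptive case concrete, the sketch below groups customers into subgroups with a small k-means routine. The choice of k-means, the customer data (age and purchases per year), and the naive initialization from the first k points are all assumptions for illustration; the text does not name a particular clustering algorithm.

```python
# Hypothetical customers: (age, purchases per year)
customers = [(22, 3), (25, 4), (24, 2), (61, 11), (65, 12), (58, 10)]

def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=10):
    """Very small k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its members."""
    centroids = [points[i] for i in range(k)]  # naive start: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

labels, centroids = kmeans(customers, k=2)
for customer, label in zip(customers, labels):
    print(customer, "-> cluster", label)
```

The resulting cluster labels describe the existing data rather than predict an outcome; a marketer could then treat each cluster as a segment for target marketing.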