How can data storage differ?

by Daniel J. Power
Editor, DSSResources.COM

Until recently, data was stored for two primary purposes: processing transactions and providing decision support. Database designers and modelers made various assumptions about the purpose of data storage, and those assumptions involved answers to the following three questions:

Q1. How often should the same data value or "piece of information" be stored?

In a transaction processing environment the assumption has traditionally been to store a transaction once and once only. Decision support or data warehouse data storage is non-volatile and hence the assumption has been that multiple copies of the same data and information should be stored when the duplication improves query performance. Storing multiple copies is a problem if the data must be corrected or changed.
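The trade-off in Q1 can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table names (sales, sales_summary), the helper functions, and the sample rows are hypothetical and chosen only for illustration; they do not come from any particular system. The sketch stores each transaction once, keeps a duplicated summary copy for fast queries, and shows how the copy goes stale when a transaction is corrected.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a transaction table and a duplicated summary."""
    conn = sqlite3.connect(":memory:")
    # Transaction store: each sale is recorded once and only once.
    conn.execute(
        "CREATE TABLE sales (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO sales (customer, amount) VALUES (?, ?)",
        [("acme", 100.0), ("acme", 50.0), ("zenith", 75.0)],  # hypothetical rows
    )
    # Decision support copy: totals are stored a second time to speed up queries.
    conn.execute(
        "CREATE TABLE sales_summary AS "
        "SELECT customer, SUM(amount) AS total FROM sales GROUP BY customer"
    )
    conn.commit()
    return conn


def live_total(conn, customer):
    """Compute the total from the single authoritative copy."""
    row = conn.execute(
        "SELECT SUM(amount) FROM sales WHERE customer = ?", (customer,)
    ).fetchone()
    return row[0]


def summary_total(conn, customer):
    """Read the precomputed (duplicated) total."""
    row = conn.execute(
        "SELECT total FROM sales_summary WHERE customer = ?", (customer,)
    ).fetchone()
    return row[0]


def refresh_summary(conn):
    """Rebuild the duplicated copy so it matches the transaction data again."""
    conn.execute("DELETE FROM sales_summary")
    conn.execute(
        "INSERT INTO sales_summary "
        "SELECT customer, SUM(amount) FROM sales GROUP BY customer"
    )
    conn.commit()


conn = build_demo()
# Correct one transaction; only the authoritative copy changes.
conn.execute("UPDATE sales SET amount = 60.0 WHERE customer = 'acme' AND amount = 50.0")
print(live_total(conn, "acme"), summary_total(conn, "acme"))
```

After the correction, the two totals disagree until refresh_summary is called. That extra refresh step is the cost of storing the same information more than once.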

Q2. Who will access the data?

If the data will be accessed and retrieved by sophisticated users, then using Structured Query Language (SQL) is acceptable. If the data will be accessed and retrieved by managers, then a simple retrieval mechanism is needed.

Q3. What level of performance is required to retrieve data?

If fast, real-time retrieval is required, then a well-tuned relational database is required. If you are asking unplanned, ad hoc questions, then performance matters less than in other data retrieval situations.

These three questions have become less important and the answers have become more ambiguous. Data storage has become more complex. First, data storage may serve an archival purpose because storage costs have declined and continue to decline. Second, more people need to retrieve data for many diverse reasons, including someone retrieving and reading Facebook posts, a salesperson checking contact information and purchase history from a smart phone, and a person paying for an item at a retail store using a smart phone payment app.

Perhaps we need to expand the data storage questions we ask, and perhaps we need to recognize that the answers are not binary or dichotomous, but rather multiple equally good alternatives or even a range of values. So the best answer to Question 1 may be two (2) in many situations: once in the transaction database and once in the data backup and recovery archive. Perhaps the best answer to Question 2 is all stakeholders. Finally, perhaps the answer to Question 3 is now always real-time retrieval with no delays.

Some new questions to think about are: Q1: How will privacy be ensured? Q2: How long will the data be stored? Q3: Ultimately, how large might the data store become? Q4: Is data backup necessary?

Our assumptions about data storage should be revisited regularly. Data storage is no longer an either/or choice between a Relational Database Management System (RDBMS) and a Data Warehouse (DW).

Last update: 2018-07-10 02:43
Author: Daniel Power

Copyright © 1995-2015 by D. J. Power.