************************************************************
DSS News
D. J. Power, Editor
September 29, 2002 -- Vol. 3, No. 20
A Bi-Weekly Publication of DSSResources.COM
************************************************************
Check the Applebee's case featuring Teradata
************************************************************

Featured:

* DSS Wisdom
* Ask Dan! - What is the potential size of the NSEERS database?
* What's New at DSSResources.COM
* DSS News Releases

************************************************************
DSS News is sent to more than 875 subscribers from 50 countries. Please forward DSS News to people interested in Decision Support Systems and suggest they subscribe.
************************************************************

DSS Wisdom

Newell and Simon (1972) argued that human problem solving can be understood "by describing the task environment in which it takes place; the space the problem solver uses to represent the environment, the task, and the knowledge about it that he gradually accumulates; and the program the problem solver assembles for approaching the task" (p. 868).

from Newell, A. and H. A. Simon, Human Problem Solving, Englewood Cliffs, NJ: Prentice-Hall, Inc., 1972.

************************************************************
Enhance model-driven DSS with Crystal Ball simulation software. Download a FREE evaluation at www.crystalball.com/dss
************************************************************

Ask Dan!
by Daniel J. Power

What is the potential size of the NSEERS database?

The U.S. Immigration and Naturalization Service (INS) is creating a number of very large databases to support a variety of operations and processes. In addition to the NSEERS transaction database discussed two weeks ago (DSS News, Vol. 3, No. 19), INS is developing the Student and Exchange Visitor Information System (SEVIS). SEVIS is an Internet-based system that will be accessed at U.S. Ports of Entry and by more than 1900 schools and colleges.
Biometric border crossing cards will also be required of Mexican border crossers as of October 1, 2002. These projects are massive in scope and it is not clear how data will be shared between systems.

This Ask Dan! continues the discussion. How do we "size" these databases? Is a data warehouse needed? What platform and software are needed? How will this data collection effort support decision making at INS? Some of these questions were addressed in my prior column, but Marc Demarest, former Chairman and CEO of DecisionPoint Applications and current President of Noumenal (http://www.noumenal.com/marc/), offered his analysis and insights and I accepted.

Marc began with my assumptions for the NSEERS transaction processing system -- 35 million visitors per year, 45 KB for a fingerprint, 10 KB for a photo and 5 KB for alphanumeric string data per visitor. Marc writes, "I think we need to double or triple your alphanumeric string count. For US citizens and green card holders, the 5K is probably right, but have you seen the forms foreign nationals have to fill out? They're huge, and, after all, this is John Ashcroft we're talking about, so, by the time we add in the foreign national's 'home information' and all of the data on where the foreign national will be traveling in the US and what they will be doing and who they will be seeing, I'd bet the foreign national alpha information tops 20K easily. So let's say 12K as an average for alpha data."

Second, "I didn't see a discussion of what would be captured on exit, but something will surely be captured, yes? That would be the easiest way to catch visa overstays. And if Ashcroft is worried about people masquerading as other people, he'll capture about as much on the way out as he did on the way in. Let's say a photo, a fingerprint, and 5K on the way out."

"Now, how would such an application work? The 'decision support' is going to be largely automated, I'd imagine.
I'd assume this application is going to (a) compare photographs, (b) compare fingerprints, (c) analyze the alpha data provided and then (d) weight the outcomes and (e) make a recommendation to the official at point-of-capture based on that data. That is the system model (at least I hope it is). Now, they may also do other things with the data, including uploading it into other (INS, NSA, DCA) systems for other kinds of analysis, but this is a closed loop capture-and-analyze system, I'd bet, not two systems: (1) a transaction processing app and (2) a DSS app that is loaded from the TP app according to a schedule."

"Since the data in the database is analyzed programmatically and not by a person, it doesn't have to be inherently legible at the schema level, so we're probably not talking a 'star' style schema -- we're talking some kind of normal form schema. Raw data loaded into any schema creates a loaded set size larger than the raw data -- because of DBMS storage mechanisms, indexing overhead, etc. They'll have to be using a conventional RDBMS because this is a INSERT-AND-QUERY system: Teradata- and nCube-style DBMSs and OLAP engines won't take the INSERTs fast enough or elegantly enough. For a star implemented in a conventional RDBMS, one usually sees a 2.5X growth in the raw load set size once it's loaded, and I think it'd be close to the same growth factor in this case: maybe 2.8X. We're also not talking about needing to maintain a lot of history in this system -- the system-of-record for pictures and fingerprints will be elsewhere, because that system will be used for multiple purposes, and data will be migrated out of this system into the real 'data warehouse' for the Department of Homeland Security as soon as an individual alien's entry-exit loop is closed, so I'd bet there will never be more than the equivalent of 1 year's worth of data in the system. In other words, this system won't be extracted INTO; it will be extracted FROM."
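
For readers who want to check the arithmetic, here is a rough sketch of one way the figures quoted above could be combined. It is not taken from Marc's email, which is not reproduced in full here. It assumes decimal units (1 KB = 1,000 bytes, 1 TB = 10^12 bytes), exactly one entry record and one exit record per visitor, and one year of data in the system; the function and constant names are made up for illustration.

```python
# Back-of-the-envelope NSEERS sizing from the figures quoted in this column.
# Assumptions (not stated in the column): decimal units, one entry and one
# exit record per visitor, and one year's worth of data held in the system.

VISITORS_PER_YEAR = 35_000_000   # visitors per year (from the prior column)
FINGERPRINT_KB = 45              # per fingerprint
PHOTO_KB = 10                    # per photo
ENTRY_ALPHA_KB = 12              # Marc's average for alphanumeric data on entry
EXIT_ALPHA_KB = 5                # Marc's guess for alphanumeric data on exit
GROWTH_FACTOR = 2.8              # Marc's raw-to-loaded growth estimate
KB_PER_TB = 1_000_000_000        # decimal terabyte, an assumption


def nseers_size_tb(visitors=VISITORS_PER_YEAR, growth=GROWTH_FACTOR):
    """Return (raw_tb, loaded_tb) for one year of entry and exit records."""
    entry_kb = FINGERPRINT_KB + PHOTO_KB + ENTRY_ALPHA_KB   # 67 KB on entry
    exit_kb = FINGERPRINT_KB + PHOTO_KB + EXIT_ALPHA_KB     # 60 KB on exit
    raw_tb = visitors * (entry_kb + exit_kb) / KB_PER_TB
    return raw_tb, raw_tb * growth


raw_tb, loaded_tb = nseers_size_tb()
print(f"raw: {raw_tb:.1f} TB, loaded: {loaded_tb:.1f} TB")
```

Under these assumptions the raw data comes to roughly 4.4 TB per year and the loaded database to roughly 12.4 TB, which is in line with the "about a 12 TB system" figure Marc arrives at. Changing the growth factor or the per-visitor sizes shows how sensitive the estimate is to each assumption.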

As an aside, Marc noted, "The most difficult technical bit for the system will be indexing strategy: the more indexes they add to cut query time, the longer insert time will take. The fewer the indexes, the longer the complex set of queries they are going to have to run will take."

Based on his assumptions and analysis, Marc calculated that NSEERS is "about a 12 TB system ... easily within the range of Oracle running on a nice cluster of Sun or IBM SMP/NUMA boxes." Marc concludes, "You're right, however, that in the final analysis it will be hard to implement."

Thanks, Marc, for letting me quote so extensively from your analysis. A 12 Terabyte database is huge and the more I reflect on the INS projects the more I can see the databases expanding in size. Perhaps these two Ask Dan! columns will stimulate more thinking and discussion about the important decision support issues associated with monitoring visitors to the United States. These new INS systems are mission critical and the projects provide us the opportunity to think innovatively about providing decision support from very large databases.

As always your comments and questions are welcomed. If you want a challenge, reflect on how you would support decision making at INS. If you teach DSS or database, try asking your students the questions raised in DSS News, Vol. 3, Nos. 19 and 20.

References

Demarest, Marc, Email message, Monday, September 16, 2002 at 09:38:31.

Power, D. J., "Is it feasible to track all visitors to the United States and then build a Data-driven DSS?" DSS News, Vol. 3, No. 19, September 15, 2002.

************************************************************
Visit DSS News Sponsors - Crystalball.com and Teradata.com
************************************************************

What's New at DSSResources.COM

09/22/2002 Added materials to Power, D. J. "A Brief History of Decision Support Systems", saved as version 2.1, September 2002, URL DSSResources.COM/history/dsshistory.html.
09/19/2002 Posted case by Teradata Staff, "Understanding customers' preferences at Applebee's International", Teradata, a division of NCR Corporation, 2002, URL DSSResources.COM/cases/.

************************************************************
Get information about Dan Power's book, Decision Support Systems: Concepts and Resources for Managers, at http://www.dssresources.com/dssbookstore/power02.html .
************************************************************

DSS News Releases - September 15 to September 27, 2002

Complete news releases can be found at DSSResources.COM.

09/27/2002 Call for Papers: ICEIS 2003 - 5th International Conference on Enterprise Information Systems, Angers, France 23-26 April, 2003. Paper deadline October 15, 2002.
09/26/2002 Microsoft unveils the Center for Information Work.
09/26/2002 Q&A: What is the Microsoft Center for Information Work?
09/26/2002 Stellent unveils vision for the future of content management.
09/26/2002 Harrah's selects TIBCO for business integration platform.
09/26/2002 Application outsourcing the most efficient and cost effective method of software implementation, IDC system dynamic models prove.
09/25/2002 New Network Computing study finds third party remote access providers reduce management burdens, save money.
09/24/2002 Schwan's selects Intermec handheld computers and mobile printers for nationwide route sales.
09/24/2002 The emergence of the Internet "Mainframe" -- the WebFrame -- will drive infrastructure growth says NetsEdge Research Group.
09/24/2002 Ford selects SGI Reality Center technology for visualization and design optimization.
09/24/2002 Sun Microsystems is honored with the Helen Keller Achievement Award for its leadership in accessibility advancements.
09/23/2002 Leading analyst research finds Business Objects number one business intelligence tools vendor in Western Europe.
09/23/2002 Jones & Stokes uses eRoom hosted enterprise service to manage environmental planning projects.
09/23/2002 Sybase introduces first comprehensive healthcare integration suite built on open standards.
09/23/2002 Teradata profitability analytics bolsters the bottom line for mobile communications companies.
09/23/2002 Netezza unleashes tera-scale data appliance for Business Intelligence.
09/23/2002 Teradata signs worldwide reseller agreement with Informatica; will empower customers with integrated decision-making by linking operational and analytic capabilities.
09/23/2002 GeoSpatial World 2003 enhances exhibitor opportunities to reach GIS, IT, and mapping decision makers.
09/23/2002 ProClarity Corp. fastest growing Business Intelligence vendor named to Software Magazine's 20th Annual Software 500.
09/23/2002 China's largest coal company deploys Datastream enterprise asset management solution.
09/23/2002 Nortel Networks introduces SSL-based secure extranets for enterprise customers; enables secure connectivity for remote users equipped with web browsers.
09/23/2002 Microsoft delivers new migration and coexistence tools for Lotus Notes applications.
09/23/2002 Wrigley selects SAP as global business systems platform.
09/23/2002 International Biometric Group releases Biometric Market Report 2003-2007.
09/23/2002 SAS(R) solution adapters for SAP reduce time to Business Intelligence.
09/20/2002 Call for Participation: 2003 Information Resources Management Association (IRMA) International Conference. Submission Deadline: October 4, 2002.
09/20/2002 Extreme Networks demos production 10 gigabit ethernet infrastructure for next generation network services at Sun Conference.
09/20/2002 Optiant customer Imation wins Start Magazine's Technology and Business Award.
09/19/2002 CIOs report on Information Technology's hottest jobs in semi-annual Robert Half survey.
09/19/2002 Forrester Research launches ninth TechRankings category: Business Process Management.
09/18/2002 Advanced weather model running on SGI systems used to predict dispersion of hazardous aerosols and gases.
09/17/2002 SAS Enterprise Miner to support PMML; SAS and IBM simplify the management and deployment of data mining.
09/17/2002 Applix announces first Applix Integra customer win, additional wins for Applix iTM1, Applix iEnterprise.
09/17/2002 Decisioneering attacks corporate market with new services leader, Dr. Johnathan Mun.
09/17/2002 To create `paperless' office, Doctors started from scratch: opened new office with software provided via Internet.
09/17/2002 New MindManager add-in enables teams to build visual project plans - then export to leading project management tools.
09/17/2002 Cognos prescribes business intelligence solution for Markham Stouffville Hospital.
09/17/2002 Oracle(R) Java and web services tools leadership confirmed by developer community and industry press.
09/16/2002 Information Builders announces a no-fee SEVIS compliance analysis for higher education institutions.
09/16/2002 AMR Research reports SAS leads business intelligence/analytics market.
09/16/2002 Nobilis Software announces Nobilis Ci ProcessWriter for the desktop.

************************************************************
Visit DSS News Sponsors - Crystalball.com and Teradata.com
************************************************************

DSS News is copyrighted (c) 2002 by D. J. Power. Please send your questions to daniel.power@dssresources.com.

You have previously subscribed to the DSS News Mailing List.