from DSSResources.com

************************************************************

                          DSS News
                     D. J. Power, Editor
              September 29, 2002 -- Vol. 3, No. 20
         A Bi-Weekly Publication of DSSResources.COM

************************************************************
       Check the Applebee's case featuring Teradata
************************************************************

Featured:

 * DSS Wisdom
 * Ask Dan! - What is the potential size of the NSEERS database?
 * What's New at DSSResources.COM
 * DSS News Releases

************************************************************

  DSS News is sent to more than 875 subscribers from 50 
  countries. Please forward DSS News to people interested
  in Decision Support Systems and suggest they subscribe.

************************************************************

DSS Wisdom

Newell and Simon (1972) argued human problem solving can be understood 
"by describing the task environment in which it takes place; the space 
the problem solver uses to represent the environment, the task, and the 
knowledge about it that he gradually accumulates; and the program the 
problem solver assembles for approaching the task" (p. 868).

from Newell, A. and H. A. Simon, Human Problem Solving, Englewood 
Cliffs, NJ: Prentice-Hall, Inc., 1972.

************************************************************

Enhance model-driven DSS with Crystal Ball simulation software.
Download a FREE evaluation at www.crystalball.com/dss

************************************************************

Ask Dan!
by Daniel J. Power

What is the potential size of the NSEERS database?

The U.S. Immigration and Naturalization Service (INS) is creating a 
number of very large databases to support a variety of operations and 
processes. In addition to the NSEERS transaction database discussed two 
weeks ago (DSS News, Vol. 3, No. 19), INS is developing the Student and 
Exchange Visitor Information System (SEVIS).  SEVIS is an Internet-based 
system that will be accessed at U.S. Ports of Entry and by more than 
1900 schools and colleges. Biometric border crossing cards will also be 
required of Mexican border crossers as of October 1, 2002.  These 
projects are massive in scope and it is not clear how data will be 
shared between systems. This Ask Dan! continues the discussion.

How do we "size" these databases? Is a data warehouse needed? What 
platform and software are needed? How will this data collection effort 
support decision making at INS? Some of these questions were addressed 
in my prior column, but Marc Demarest, former Chairman and CEO of 
DecisionPoint Applications and current President of Noumenal 
(http://www.noumenal.com/marc/), offered his analysis and insights, and I 
accepted. Marc began with my assumptions for the NSEERS transaction 
processing system -- 35 million visitors per year, 45 KB for a 
fingerprint, 10 KB for a photo and 5 KB for alphanumeric string data per 
visitor.

Marc writes, "I think we need to double or triple your alphanumeric 
string count. For US citizens and green card holders, the 5K is probably 
right, but have you seen the forms foreign nationals have to fill out? 
They're huge, and, after all, this is John Ashcroft we're talking about, 
so, by the time we add in the foreign national's 'home information' and 
all of the data on where the foreign national will be traveling in the 
US and what they will be doing and who they will be seeing, I'd bet the 
foreign national alpha information tops 20K easily. So let's say 12K as 
an average for alpha data."

Second, "I didn't see a discussion of what would be captured on exit, but 
something will surely be captured, yes? That would be the easiest way to 
catch visa overstays. And if Ashcroft is worried about people 
masquerading as other people, he'll capture about as much on the way out 
as he did on the way in. Let's say a photo, a fingerprint, and 5K on the 
way out."

"Now, how would such an application work? The 'decision support' is 
going to be largely automated, I'd imagine. I'd assume this application 
is going to (a) compare photographs, (b) compare fingerprints, (c) 
analyze the alpha data provided and then (d) weight the outcomes and (e) 
make a recommendation to the official at point-of-capture based on that 
data. That is the system model (at least I hope it is). Now, they may 
also do other things with the data, including uploading it into other 
(INS, NSA, DCA) systems for other kinds of analysis, but this is a 
closed loop capture-and-analyze system, I'd bet, not two systems: (1) a 
transaction processing app and (2) a DSS app that is loaded from the TP 
app according to a schedule."
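
The five steps Marc lists, (a) through (e), can be sketched as a small 
weighted scoring routine. This is only an illustration of the 
closed-loop idea: the class and function names, the weights, and the 
threshold are all hypothetical and do not describe any actual INS 
system.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcomes of steps (a)-(c); all fields are hypothetical scores in [0, 1]."""
    photo_match: float        # (a) similarity of captured photo to photo on file
    fingerprint_match: float  # (b) similarity of captured print to print on file
    alpha_risk: float         # (c) concern level from the alphanumeric data


# (d) Illustrative weights only; real values would have to be calibrated.
WEIGHTS = {"photo": 0.3, "fingerprint": 0.5, "alpha": 0.2}
REFER_THRESHOLD = 0.4  # hypothetical cut-off for a "refer" recommendation


def risk_score(result: CheckResult) -> float:
    """Combine the three checks into one weighted score (higher = more concern)."""
    return (
        WEIGHTS["photo"] * (1.0 - result.photo_match)
        + WEIGHTS["fingerprint"] * (1.0 - result.fingerprint_match)
        + WEIGHTS["alpha"] * result.alpha_risk
    )


def recommend(result: CheckResult) -> str:
    """(e) Turn the weighted score into a recommendation for the official."""
    return "refer" if risk_score(result) >= REFER_THRESHOLD else "proceed"


if __name__ == "__main__":
    print(recommend(CheckResult(0.95, 0.98, 0.10)))
    print(recommend(CheckResult(0.40, 0.20, 0.70)))
```

The point of the sketch is that the person at the point of capture sees 
only the recommendation; the comparison and weighting happen inside the 
same system that captured the data.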

"Since the data in the database is analyzed programmatically and not by 
a person, it doesn't have to be inherently legible at the schema level, 
so we're probably not talking a 'star' style schema -- we're talking 
some kind of normal form schema. Raw data loaded into any schema creates 
a loaded set size larger than the raw data -- because of DBMS storage 
mechanisms, indexing overhead, etc. They'll have to be using a 
conventional RDBMS because this is an INSERT-AND-QUERY system: Teradata- 
and nCube-style DBMSs and OLAP engines won't take the INSERTs fast 
enough or elegantly enough. For a star implemented in a conventional 
RDBMS, one usually sees a 2.5X growth in the raw load set size once it's 
loaded, and I think it'd be close to the same growth factor in this 
case: maybe 2.8X. We're also not talking about needing to maintain a lot 
of history in this system -- the system-of-record for pictures and 
fingerprints will be elsewhere, because that system will be used for 
multiple purposes, and data will be migrated out of this system into the 
real 'data warehouse' for the Department of Homeland Security as soon as 
an individual alien's entry-exit loop is closed, so I'd bet there will 
never be more than the equivalent of 1 year's worth of data in the 
system. In other words, this system won't be extracted INTO; it will be 
extracted FROM."

As an aside, Marc noted, "The most difficult technical bit for the system 
will be indexing strategy: the more indexes they add to cut query time, 
the longer insert time will take. The fewer the indexes, the longer the 
complex set of queries they are going to have to run will take."
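
A small experiment can make this trade-off concrete. The sketch below 
uses Python's built-in sqlite3 module purely as a stand-in for a 
production RDBMS; the table, the column names, and the row count are 
invented for illustration, and absolute timings will vary from machine 
to machine.

```python
import sqlite3
import time


def timed_inserts(n_indexes: int, n_rows: int = 20_000) -> tuple[float, int]:
    """Insert n_rows into a table carrying n_indexes secondary indexes.

    Returns (seconds spent inserting, number of rows stored).
    """
    if not 0 <= n_indexes <= 4:
        raise ValueError("n_indexes must be between 0 and 4")

    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE crossing ("
        "id INTEGER PRIMARY KEY, c0 INTEGER, c1 INTEGER, c2 INTEGER, c3 INTEGER)"
    )
    # Each secondary index must be updated on every INSERT.
    for i in range(n_indexes):
        conn.execute(f"CREATE INDEX idx_c{i} ON crossing (c{i})")

    # Deterministic pseudo-scattered values so index pages are touched out of order.
    rows = [
        (k, (k * 7919) % 10007, (k * 104729) % 10009, (k * 1299709) % 10037, k % 97)
        for k in range(n_rows)
    ]

    start = time.perf_counter()
    with conn:  # commit once at the end of the batch
        conn.executemany("INSERT INTO crossing VALUES (?, ?, ?, ?, ?)", rows)
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM crossing").fetchone()[0]
    conn.close()
    return elapsed, count


if __name__ == "__main__":
    for n in (0, 4):
        seconds, stored = timed_inserts(n)
        print(f"{n} secondary indexes: {stored} rows inserted in {seconds:.3f} s")
```

On most machines the run with four indexes should take noticeably 
longer than the run with none, which is the insert-side cost Marc 
describes. The query-side benefit is the other half of the trade-off 
and is not measured here.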

Based on his assumptions and analysis, Marc calculated NSEERS is "about 
a 12 TB system ... easily within the range of Oracle running on a nice 
cluster of Sun or IBM SMP/NUMA boxes." Marc concludes "You're right, 
however, that in the final analysis it will be hard to implement." 
Thanks, Marc, for letting me quote so extensively from your analysis. A 
12 terabyte database is huge, and the more I reflect on the INS projects 
the more I can see the databases expanding in size. Perhaps these two 
Ask Dan! columns will stimulate more thinking and discussion about the 
important decision support issues associated with monitoring visitors to 
the United States. These new INS systems are mission critical, and the 
projects give us the opportunity to think innovatively about 
providing decision support from very large databases. As always, your 
comments and questions are welcome. If you want a challenge, reflect on 
how you would support decision making at INS. If you teach DSS or 
database, try asking your students the questions raised in DSS News, 
Vol. 3, Nos. 19 and 20. 
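
For readers who want to check the arithmetic, the sketch below shows one 
way to arrive at a figure of about 12 TB from the assumptions quoted 
above. Marc's working is not reproduced in this column, so the 
combination used here (one entry record and one exit record per visitor 
per year, his 12 KB average for alpha data on entry, his 2.8X growth 
factor, and decimal units where 1 TB = 10**9 KB) is an assumption, not 
his stated method.

```python
VISITORS_PER_YEAR = 35_000_000   # from the prior column's assumptions
GROWTH_FACTOR = 2.8              # Marc's estimate of loaded size vs. raw size
KB_PER_TB = 1_000_000_000        # assumption: decimal units, 1 TB = 10**9 KB

# Per-visitor capture sizes in KB, taken from the figures quoted in the column.
ENTRY_KB = {"fingerprint": 45, "photo": 10, "alpha": 12}
EXIT_KB = {"fingerprint": 45, "photo": 10, "alpha": 5}
BASELINE_KB = {"fingerprint": 45, "photo": 10, "alpha": 5}  # entry only


def raw_tb(kb_per_visitor: float, visitors: int = VISITORS_PER_YEAR) -> float:
    """Raw (unloaded) data volume in TB for one year of visitors."""
    return kb_per_visitor * visitors / KB_PER_TB


per_visitor_kb = sum(ENTRY_KB.values()) + sum(EXIT_KB.values())
raw = raw_tb(per_visitor_kb)
loaded = raw * GROWTH_FACTOR
baseline_raw = raw_tb(sum(BASELINE_KB.values()))

print(f"Baseline raw data (entry only): {baseline_raw:.1f} TB")
print(f"Revised per-visitor capture:    {per_visitor_kb} KB")
print(f"Revised raw data (entry+exit):  {raw:.1f} TB")
print(f"Loaded size at 2.8X:            {loaded:.1f} TB")
```

Under these assumptions the raw data comes to roughly 4.4 TB per year 
and the loaded database to roughly 12.4 TB, which is consistent with 
"about a 12 TB system." Changing any one input, such as the alpha data 
average or the growth factor, shifts the result proportionally.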

References

Demarest, Marc, Email message, Monday, September 16, 2002 at 09:38:31.

Power, D. J., "Is it feasible to track all visitors to the United States 
and then build a Data-driven DSS?" DSS News, Vol. 3, No. 19, September 
15, 2002.

************************************************************
 Visit DSS News Sponsors - Crystalball.com and Teradata.com
************************************************************

What's New at DSSResources.COM

09/22/2002 Added materials to Power, D. J. "A Brief History of Decision 
Support Systems", saved as version 2.1, September 2002, URL 
DSSResources.COM/history/dsshistory.html.

09/19/2002 Posted case by Teradata Staff, "Understanding customers' 
preferences at Applebee's International", Teradata, a division of NCR 
Corporation, 2002, URL DSSResources.COM/cases/.

************************************************************

Get information about Dan Power's book, Decision Support 
Systems: Concepts and Resources for Managers, at 
http://www.dssresources.com/dssbookstore/power02.html .

************************************************************

DSS News Releases - September 15 to September 27, 2002

Complete news releases can be found at DSSResources.COM.

09/27/2002 Call for Papers: ICEIS 2003 - 5th International Conference on 
Enterprise Information Systems, Angers, France 23-26 April, 2003. Paper 
deadline October 15, 2002.

09/26/2002 Microsoft unveils the Center for Information Work. 

09/26/2002 Q&A: What is the Microsoft Center for Information Work? 

09/26/2002 Stellent unveils vision for the future of content management. 

09/26/2002 Harrah's selects TIBCO for business integration platform.

09/26/2002 Application outsourcing the most efficient and cost effective 
method of software implementation, IDC system dynamic models prove.

09/25/2002 New Network Computing study finds third party remote access 
providers reduce management burdens, save money. 

09/24/2002 Schwan's selects Intermec handheld computers and mobile 
printers for nationwide route sales.

09/24/2002 The emergence of the Internet "Mainframe" -- the WebFrame -- 
will drive infrastructure growth says NetsEdge Research Group. 

09/24/2002 Ford selects SGI Reality Center technology for visualization 
and design optimization. 

09/24/2002 Sun Microsystems is honored with the Helen Keller Achievement 
Award for its leadership in accessibility advancements. 

09/23/2002 Leading analyst research finds Business Objects number one 
business intelligence tools vendor in Western Europe. 

09/23/2002 Jones & Stokes uses eRoom hosted enterprise service to manage 
environmental planning projects.

09/23/2002 Sybase introduces first comprehensive healthcare integration 
suite built on open standards. 

09/23/2002 Teradata profitability analytics bolsters the bottom line for 
mobile communications companies.

09/23/2002 Netezza unleashes tera-scale data appliance for Business 
Intelligence. 

09/23/2002 Teradata signs worldwide reseller agreement with Informatica; 
will empower customers with integrated decision-making by linking 
operational and analytic capabilities. 

09/23/2002 GeoSpatial World 2003 enhances exhibitor opportunities to 
reach GIS, IT, and mapping decision makers. 

09/23/2002 ProClarity Corp. fastest growing Business Intelligence vendor 
named to Software Magazine's 20th Annual Software 500. 

09/23/2002 China's largest coal company deploys Datastream enterprise 
asset management solution. 

09/23/2002 Nortel Networks introduces SSL-based secure extranets for 
enterprise customers; enables secure connectivity for remote users 
equipped with web browsers. 

09/23/2002 Microsoft delivers new migration and coexistence tools for 
Lotus Notes applications. 

09/23/2002 Wrigley selects SAP as global business systems platform.

09/23/2002 International Biometric Group releases Biometric Market 
Report 2003-2007. 

09/23/2002 SAS(R) solution adapters for SAP reduce time to Business 
Intelligence.

09/20/2002 Call for Participation: 2003 Information Resources Management 
Association (IRMA) International Conference. Submission Deadline: 
October 4, 2002. 

09/20/2002 Extreme Networks demos production 10 gigabit ethernet 
infrastructure for next generation network services at Sun Conference. 

09/20/2002 Optiant customer Imation wins Start Magazine's Technology and 
Business Award. 

09/19/2002 CIOs report on Information Technology's hottest jobs in 
semi-annual Robert Half survey. 

09/19/2002 Forrester Research launches ninth TechRankings category: 
Business Process Management. 

09/18/2002 Advanced weather model running on SGI systems used to predict 
dispersion of hazardous aerosols and gases.

09/17/2002 SAS Enterprise Miner to support PMML; SAS and IBM simplify 
the management and deployment of data mining. 

09/17/2002 Applix announces first Applix Integra customer win, 
additional wins for Applix iTM1, Applix iEnterprise.  

09/17/2002 Decisioneering attacks corporate market with new services 
leader, Dr. Johnathan Mun. 

09/17/2002 To create 'paperless' office, doctors started from scratch: 
opened new office with software provided via Internet. 

09/17/2002 New MindManager add-in enables teams to build visual project 
plans - then export to leading project management tools.

09/17/2002 Cognos prescribes business intelligence solution for Markham 
Stouffville Hospital. 

09/17/2002 Oracle(R) Java and web services tools leadership confirmed by 
developer community and industry press.

09/16/2002 Information Builders announces a no-fee SEVIS compliance 
analysis for higher education institutions.

09/16/2002 AMR Research reports SAS leads business 
intelligence/analytics market. 

09/16/2002 Nobilis Software announces Nobilis Ci ProcessWriter for the 
desktop.

************************************************************

DSS News is copyrighted (c) 2002 by D. J. Power. Please send your 
questions to daniel.power@dssresources.com. You have previously 
subscribed to the DSS News Mailing List.
