[Historical Document]

University of California, Berkeley NASA End to End in EOSDIS

Our New World Order...
A Prototype Implementation.


This is a ROUGH DRAFT under heavy construction. Please email feedback to the coordinator noted in each section. Please send general suggestions to rtroy@postgres.berkeley.edu.

Last update: March 24, 1995


Introduction to our Prototype

As a part of the UCB portion of the National Aeronautics & Space Administration (NASA) Earth Observing System Data and Information System (EOSDIS) "End to End" project, we have built a prototype system based on a database and a small set of application tools. (Our prototype will grow in scope and power over time; watch this page for future developments.)

We call our prototype The New World Order Project because of its ability to bring order to a world of datasets. One of the key goals of this project is to give researchers working in diverse fields the ability to "join" disparate datasets in ways that were not previously possible. We are bringing together into one database such diverse datasets as satellite imagery, wetlands data, aerial photographs, 35mm terrestrial photographs, newspaper articles, census data, climatology, river and stream hydrology, and a host of others.

Aside from academic research, such a system would have uses in society at large. One example question which cannot be easily answered without such a system might have to do with public policy: "Given our drought, just how important are water-intensive crops like cotton and rice in preserving the ecology necessary to sustain migrating waterfowl that have lost their natural wetlands habitat in this region?" Or, "What sensitive resources, such as schools or endangered species habitat, exist within the geographic domain of the Environmental Impact Reports submitted by this corporation?" Questions like these require sophisticated integration of diverse datasets, and that is part of what our project is all about.

While The New World Order Project cannot answer these questions today, our vision is that our work will result in a system that can. For more information on our whole project, please see our parent page.

Our Database Engine

We are implementing our database in a very powerful Database Management System (DBMS) which combines the "Object Oriented" and "Relational" paradigms, and permits SQL access to all data. This database engine is Illustra, a commercial version of the Postgres DBMS, which was developed here at UCB as a research project by Michael Stonebraker.

The combination of the "Object Oriented" paradigm with the Relational Database Management System (RDBMS) model (thus creating an "ORDBMS") is very important for our project for several reasons:

And, the Illustra ORDBMS offers other important features:

Our Database Schema

Our database is based on the Big Sur - Sequoia schema, which in turn is based on the Federal Geographic Data Committee (FGDC) Standards for Digital Geospatial Metadata, with a few extensions. As the Big Sur schema is well described on its own page (follow the link above), here we focus on the extensions added for our Prototype. It should be noted that our extensions may not have been necessary; we created them out of expediency. Through an ongoing evolutionary process, we are improving our implementation to avoid such extensions, and will provide feedback to the appropriate Standards committees on what we've learned.

We created a "Reference" schema to house some of our extensions to the database design. (In Illustra, you can have multiple schemas in a single database.) The Reference schema contains only those items which can be thought of as providing "reference points" from which to evaluate the location of other objects. Some examples are a world map, and State and County/Parish boundaries.

We also created a "GCM" schema for Global Circulation Modeling. The GCMTest schema houses the data that is unique to GCM work. In testing various approaches to GCM visualization, we decided to try our ideas out in an isolated schema. We haven't yet attempted to reconcile the GCMTest schema with the Big Sur schema. (...We're not through playing with it yet...)

One GCM visualization issue for us was the actual drawing of representative shapes indicating detail such as wind direction. This can be done dynamically at run-time as a derivable attribute, but there is a significant performance cost for doing so. So, we ended up creating the various shapes "statically" instead, and had to store them somewhere - a private schema seemed the right choice. Reconciling this work with the Big Sur schema is an important area for further improvement.
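The trade-off between deriving shapes at run-time and storing them ahead of time can be illustrated with a minimal Python sketch. It is not the prototype's implementation: the function names, the `(u, v)` wind components, and the `scale` factor are all assumptions made for illustration, and a plain dictionary stands in for the private schema.

```python
from typing import Dict, Iterable, Tuple

Coordinate = Tuple[float, float]          # (longitude, latitude), decimal degrees
Segment = Tuple[Coordinate, Coordinate]   # (tail, head) of a wind arrow


def wind_arrow(lon: float, lat: float, u: float, v: float,
               scale: float = 0.1) -> Segment:
    """Derive one arrow: a segment from the grid point toward where the wind blows.

    u and v are hypothetical eastward and northward wind components;
    scale converts wind speed into degrees of arrow length.
    """
    return ((lon, lat), (lon + scale * u, lat + scale * v))


def precompute_arrows(cells: Iterable[Tuple[float, float, float, float]],
                      scale: float = 0.1) -> Dict[Coordinate, Segment]:
    """Build every arrow once and keep it, instead of deriving it per query."""
    return {(lon, lat): wind_arrow(lon, lat, u, v, scale)
            for lon, lat, u, v in cells}


# Stored ("static") shapes: computed once, then simply looked up.
stored = precompute_arrows([(-122.0, 38.0, 10.0, 0.0),
                            (-121.0, 38.0, 0.0, -5.0)])
arrow = stored[(-122.0, 38.0)]
```

Looking a shape up in `stored` avoids repeating the arithmetic on every display, at the cost of keeping a second copy of information that could be derived from the wind data.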

Our Application Tools

Our application tools include a combination of SQL (Structured Query Language), TCL/TK, and an Illustra tool known as the Object Knowledge Browser (OK Browser). While SQL should be familiar to a wide range of individuals with computer exposure, the other two may not be. TCL/TK is an interpreted scripting language that has GUI (Graphical User Interface) capabilities. The OK Browser is a new type of application development tool in which the programmer writes virtually no "code." Instead, the OK Project Editor presents a palette of choices. The programmer chooses objects from the palette, places them on a "canvas", and joins the chosen objects' input and output ports to each other in meaningful ways.

The result of combining these tools is that we were able to create our initial application in about 10 days! We used the OK Browser as the initial viewing tool. When objects (items) are selected, TCL/TK may be brought up and used to view the selected object in detail. If desired, the user is given the opportunity to access SQL directly to further their inquiry into the data available. The power handed to the user is considerable, and the data is largely at their disposal.

More importantly, however, once the system is deployed, we want our users, whom we do not require to be highly computer literate, to be able to either create their own applications or modify the ones we provide to their liking. It is very important that our users be able to tailor applications to meet their specific needs, and at the same time, it is important to offer them high-productivity tools which are easy for them to learn, as they will typically not be computer programmers.

We believe our prototype system illustrates how such a system might be brought together. But we have just begun. We anticipate integration of a host of more specialized tools so that investigators may quickly move from one application environment to another, passing items of interest between the applications without difficulty.

As an example, when viewing the New World Order application, please note the "Grass rasters." These rasters (images) are generated by an application which is presently wholly separate from this one. By integrating these two applications, we intend to permit the user to select and display Grass rasters via the New World Order application, and quickly move to the other application for a more specialized interaction with the data.

Our Data

Our data sets form a diverse collection, all of which share a few key attributes. Chief among these is a "spatial" element. The spatial element is usually a "geo-location" on the Earth. While we realize the Earth is indeed three-dimensional, a more common representation is a two-dimensional approach based on spherical geometry with an agreed origin: latitude and longitude! We use the latitude/longitude system for much of our data (much of it was available to us in this form), though we are not restricted to it. Future work will include the flexibility to support coordinates in a projected form, such as Albers Equal Area, which uses meters Easting and Northing. But for now, we coerce such data into Lat/Long (decimal form, i.e., no minutes and seconds).
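As an illustration of the decimal form mentioned above, here is a minimal Python sketch that converts a coordinate given in degrees, minutes, and seconds into signed decimal degrees. The function name `dms_to_decimal` and the sign convention (south and west negative) are assumptions for the example, not a description of how the prototype loads its data.

```python
def dms_to_decimal(degrees: int, minutes: int, seconds: float,
                   hemisphere: str) -> float:
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in the range [0, 60)")
    # One minute is 1/60 of a degree; one second is 1/3600 of a degree.
    magnitude = abs(degrees) + minutes / 60 + seconds / 3600
    # Assumed convention: south latitudes and west longitudes are negative.
    return -magnitude if hemisphere in ("S", "W") else magnitude


# Example: 37 degrees 52' 30" N, 122 degrees 15' 0" W
latitude = dms_to_decimal(37, 52, 30, "N")    # about 37.875
longitude = dms_to_decimal(122, 15, 0, "W")   # -122.25
```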

Spatial data can be of three forms: point, path, and polygonal:
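To make the three forms concrete, the following is a minimal Python sketch, not the schema used in the prototype. The class names `Point`, `Path`, and `Polygon` and the `contains` helper are hypothetical, and the test treats longitude and latitude as planar coordinates, which is only a rough approximation over small areas. It illustrates the kind of containment check behind a question such as which schools fall within a report's geographic domain.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coordinate = Tuple[float, float]  # (longitude, latitude) in decimal degrees


@dataclass
class Point:
    location: Coordinate


@dataclass
class Path:
    vertices: List[Coordinate]  # ordered and open, e.g. a river or stream


@dataclass
class Polygon:
    ring: List[Coordinate]  # boundary vertices; last connects back to first


def contains(polygon: Polygon, point: Point) -> bool:
    """Ray-casting test: count boundary crossings of a ray running east from the point."""
    x, y = point.location
    inside = False
    ring = polygon.ring
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > y) != (y2 > y):
            # Longitude at which this edge crosses the point's latitude.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical example: is a school inside a report's study area?
study_area = Polygon([(-123.0, 37.0), (-121.0, 37.0),
                      (-121.0, 39.0), (-123.0, 39.0)])
school = Point((-122.0, 38.0))
is_affected = contains(study_area, school)
```

An odd number of crossings means the point lies inside the polygon. Points exactly on the boundary are not handled specially in this sketch.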

Our data comes to us from several sources and we are constantly adding more. Presently, our data sets include:

Our Prototype

Our prototype is called the New World Order Project, and is based on the items outlined above. One enters the environment by running the "OK Shell." The OK Shell runs under "X", so the user, the program running OK, and the database engine may all be on different systems! OK starts by bringing up the first "Recipe", and from there the user may run different recipes simply by clicking on them.

The following is a sample of what you would see as you enter the application. The default is to see a large swath of the Earth, including the whole eastern seaboard of the US:



Acknowledgements

P. Brown (UCB) and R. Troy (UCB)

Section Coordinator: Richard Troy, rtroy@postgres.berkeley.edu