First, let's establish a working analogy for meta-data. Meta-data
can be viewed as the information in a library's card catalog.
The catalog holds a very small but commonly used subset of the
information contained in the complete library. This subset can be
very useful both on its own and as an index to the main body of
information. Creating or changing such a catalog, however, is a
large and costly task, and that cost severely limits the usefulness
of a card catalog, or of any set of meta-data.
Now, what if you could use a "magic" card catalog? One
where all you needed to do was define the card content and start
your search based on your own definition of what should be on
the cards. This is virtual meta-data: a database view containing
summary information about the main body of the database. Like
meta-data, this view can be used on its own or as an index to the
main data store, and it is painless to build or change.
When you start looking at computerized meta-data, the analogy starts
to break down a little. In some systems the meta-data storage is
several times larger than the original data. This is where you
really need to start asking, "Is this the best way to work
with this data?" When there is more meta-data than data, millions
of subtotals have normally been generated, most of which will never
be used. They exist only to support queries that may or may not
ever be run. All possible subtotals are generated because, at the
time of creation, there is no way to know which ones will be needed.
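The explosion of pre-computed subtotals is easy to quantify: with n candidate grouping columns there are 2^n possible subtotal combinations, before even counting the rows each combination produces. A small Python sketch (the column names here are hypothetical, not drawn from any particular warehouse):

```python
from itertools import combinations

# Each subset of the dimension columns defines one family of
# subtotals (the full set is sometimes called the "data cube").
dimensions = ["region", "store", "product", "month", "customer_type"]

groupings = [
    subset
    for r in range(len(dimensions) + 1)
    for subset in combinations(dimensions, r)
]

# 2^5 = 32 distinct grouping combinations for only 5 dimensions;
# each combination can yield thousands of subtotal rows.
print(len(groupings))  # -> 32
```

With dozens of dimension columns, as in a real warehouse, the count of combinations alone runs into the millions, which is why fixed meta-data so easily outgrows the data it summarizes.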
So, why use meta-data at all? If the database is fast enough, as
in the case of MPbase, why not just work with all of the
data? Very few database applications can handle the totality of
data in a data warehouse. Even if the database can produce all
of the data in a timely manner, it would just overwhelm the application
(like filling a teacup from a fire hose). With MPbase,
meta-data is still required to reduce the data volume to the application.
MPbase can produce this needed meta-data on the fly so that
it requires no additional storage space.
This virtual meta-data is defined as a database "view".
A database view is nothing more than a way of telling the database
what you wish to see. A view can be thought of as a virtual database
containing a subset of the complete database. In addition, a view
may contain summary information from the database. In short, the
view is used to create the "magic" card catalog mentioned
above.
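A view of this kind can be sketched with any SQL database. The following minimal illustration uses Python's built-in sqlite3 module purely for demonstration; MPbase's own interface is not shown in this article:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 75.0)],
)

# The view stores no rows of its own; the subtotals are computed
# on demand each time the view is queried -- "virtual meta-data".
conn.execute("""
    CREATE VIEW sales_summary AS
    SELECT region, COUNT(*) AS orders, SUM(amount) AS total
    FROM sales
    GROUP BY region
""")

for row in conn.execute("SELECT * FROM sales_summary ORDER BY region"):
    print(row)
# -> ('east', 2, 150.0)
#    ('west', 1, 75.0)
```

Changing what appears "on the cards" is just a matter of redefining the view; no stored subtotals have to be rebuilt.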
As an example, take a company with a 1-terabyte database and 3.5 terabytes
of meta-data. MPbase could reduce the 1 terabyte to 300 gigabytes
and eliminate the need to store the other 3.5 terabytes. The total
savings in this case would be 4.2 terabytes: 0.7 from the data
reduction plus the 3.5 of meta-data no longer stored.
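The savings figure above can be checked with a quick calculation (all units in terabytes):

```python
original_data_tb = 1.0      # original database
reduced_data_tb = 0.3       # 300 GB after MPbase reduction
fixed_metadata_tb = 3.5     # fixed meta-data no longer stored

savings_tb = (original_data_tb - reduced_data_tb) + fixed_metadata_tb
print(round(savings_tb, 1))  # -> 4.2
```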
This virtual meta-data from MPbase is the perfect way to
use a massively parallel database with a "normal" application.
It allows the database to do what it does best and leaves the
application the tasks that it does best.
One key aspect of using "virtual meta-data" is the way you handle "big binary blobs" (BBBs), a traditional database's way of handling data it does not understand, such as image data. MPbase can work with the information inside a BBB when creating the virtual meta-data, which allows queries directly into the content of the BBB. In a traditional database environment, this would require a separate application to read the BBBs and produce fixed meta-data.
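As a hedged illustration of this kind of content-aware extraction, suppose the BBBs happen to be PNG images: their dimensions can be read straight out of the blob at query time instead of being stored separately as fixed meta-data. The function below is a hypothetical stand-in, not MPbase's actual mechanism:

```python
import struct

def png_dimensions(blob: bytes) -> tuple[int, int]:
    """Pull width and height out of a PNG blob's IHDR chunk.

    A stand-in for the kind of content-aware extraction described
    above (hypothetical example; a real system would understand
    many more formats than PNG).
    """
    signature = b"\x89PNG\r\n\x1a\n"
    if not blob.startswith(signature):
        raise ValueError("not a PNG blob")
    # IHDR is always the first chunk after the 8-byte signature:
    # 4-byte length, 4-byte type, then width and height as the
    # first two big-endian 32-bit fields (bytes 16..24 of the blob).
    width, height = struct.unpack(">II", blob[16:24])
    return width, height

# A minimal fabricated PNG header is enough to demonstrate the idea.
fake_blob = (
    b"\x89PNG\r\n\x1a\n"              # PNG signature
    + b"\x00\x00\x00\x0dIHDR"         # IHDR chunk length and type
    + struct.pack(">II", 640, 480)    # width, height
    + b"\x08\x02\x00\x00\x00"         # bit depth, color type, etc.
)
print(png_dimensions(fake_blob))  # -> (640, 480)
```

A query such as "all images wider than 600 pixels" can then run against the blob content itself, with no pre-built meta-data table to maintain.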
© 1998-2004 NPSI