

Content Addressable Memory
No data questions, just answers

MPbase's internal structure is based in part on the concept of content-addressable memory (CAM). In this information-handling model, each possible piece of information has one and only one possible storage location: the data is its own key. It is important to distinguish CAM from a hash key or a traditional index.

With conventional indexing schemes, the data content is passed through a hash or an index to produce the address of the data. That address has no direct relationship to the information contained in the data. With CAM, the data describes its own storage location, so there is a direct relationship between the information in the data and its location in the physical data store. This also means that similar data will always be found close together in the physical data structure.
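The contrast can be illustrated with a minimal Python sketch. This is not MPbase's implementation, which the text does not describe; the function names `hashed_slot` and `content_address` are hypothetical and exist only for illustration. The hashed address scatters similar values, while the content-derived address keeps them adjacent.

```python
import hashlib


def hashed_slot(value: str, slots: int = 1024) -> int:
    """Conventional scheme (assumed): the address comes from a hash,
    so it has no visible relationship to the content."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % slots


def content_address(value: str) -> bytes:
    """CAM-style scheme (assumed): the value's own bytes are the address,
    so every value has exactly one possible location."""
    return value.encode("utf-8")


names = ["smyth", "jones", "smithe", "smith"]

# Ordering by content address places similar values next to each other.
by_content = sorted(names, key=content_address)

# Ordering by hashed slot gives a layout unrelated to the content.
by_hash = sorted(names, key=hashed_slot)
```

In `by_content` the three "sm..." names end up side by side, which is the proximity property described above. The position of each name in `by_hash` depends only on the hash function.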

What can this mean for a database? First, any piece of the data can be used to narrow the search area. Second, all similar data will be found in close logical proximity. This means that the data logically closest to any row will be, by definition, the most similar to it. This makes analysis of the data a much simpler task. It also speeds the updating process.
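As a rough sketch of these two properties, assume the store is simply a list of rows kept in content order. This is an assumption made for illustration, not MPbase's actual structure, and the helper names `narrow` and `neighbours` are hypothetical. The sketch narrows only on the leading part of a row; how MPbase uses other parts of the data is not covered here.

```python
import bisect

# Rows kept in content order (assumed layout for this sketch).
rows = sorted([
    ("jones", "boston"),
    ("smith", "austin"),
    ("smith", "denver"),
    ("smyth", "austin"),
])


def narrow(rows, prefix):
    """Return the contiguous run of rows whose first field starts with prefix."""
    lo = bisect.bisect_left(rows, (prefix,))
    hi = lo
    while hi < len(rows) and rows[hi][0].startswith(prefix):
        hi += 1
    return rows[lo:hi]


def neighbours(rows, row):
    """Return the rows stored immediately before and after an existing row."""
    i = bisect.bisect_left(rows, row)
    return rows[max(i - 1, 0):i] + rows[i + 1:i + 2]


sm_rows = narrow(rows, "sm")                      # all rows starting with "sm"
near_denver = neighbours(rows, ("smith", "denver"))
```

A partial value is enough to jump to the right region of the store, and the rows adjacent to any given row are the ones that share the most leading content with it.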

With CAM, the final location of any row is predetermined by its content. In a conventional database, determining the physical placement of a row is a time-consuming task, which is why most databases use a separate process for loading. The load process can make bulk decisions, and so runs much faster than the normal update cycle. With a CAM-based database the data will, to a large extent, "sort itself out" as it is loaded.

With an appropriate database architecture using the CAM model, there is no need for a special load process. The update cycle is just as efficient as a batch load would be. This type of database is therefore well suited to high-volume streams of data that need to be analyzed. Not only does the update run like a batch load, but a good portion of the processing needed to analyze the data is already done, because similar rows are already stored together.
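The idea that placement is predetermined can be sketched in Python, again assuming a simple content-ordered list rather than MPbase's real storage structure. The names `batch_load` and `stream_load` are hypothetical. The sketch shows only that row-at-a-time updates arrive at the same layout as a bulk load, whatever the arrival order; it says nothing about the relative speed of the two paths.

```python
import bisect
import random


def batch_load(rows):
    """Bulk load: place all rows at once in content order."""
    return sorted(rows)


def stream_load(rows):
    """Update cycle: place each row as it arrives, using its own content."""
    store = []
    for row in rows:
        bisect.insort(store, row)
    return store


rows = [("smith", 3), ("jones", 1), ("smyth", 7), ("smith", 2)]

# Simulate the same rows arriving in a different order.
shuffled = rows[:]
random.Random(0).shuffle(shuffled)

same_layout = stream_load(shuffled) == batch_load(rows)
```

Because each row's position depends only on its content, `same_layout` is true for any arrival order.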

Why isn't CAM the standard model for database architecture? The physical storage models currently in use require the data to have certain characteristics. The tight coupling between logical and physical location in current DBMSs requires data to be uniformly spread across the physical media. To allow the data to "bunch up" is highly undesirable because it creates choke points.

In order to use the CAM model, it was necessary to use a physical storage schema that could benefit from this bunching up. MPbase does just that.


© 1998-2004 NPSI