Why an Object Database and not a Relational Database?


Many recently deployed applications built on an ODBMS (Object Database Management System) hold over a gigabyte of data and serve hundreds of users. Financial institutions are a good yardstick for judging the applicability and usability of new software methodologies, because they have the funds to invest in the learning curve that new technologies require. Look at the Help Wanted classifieds in the New York Times: New York employers are looking for Java and C++ people, and there is also some movement towards object databases in that market. Java and C++ map much more easily to object databases than to relational databases, since both are object-oriented development environments. It therefore appears that object databases are being used for larger and more demanding implementations, which suggests that they are capable and scalable enough to handle large quantities of data. Object database technology is now becoming a serious factor in development. The mid to late 1990s saw many significant applications implemented successfully using object databases.

Note that a database of over a gigabyte of data is a very large database. A relational database with this amount of data can be very complex and probably has to be distributed over multiple servers. This may be an indicator that relational database technology cannot cope efficiently with large amounts of data. Perhaps?

The internet is set to expand from around 40 million to over 225 million users over the next three years. Those figures count internet surfers only, not email-only users, and they cover the United States alone!

How much efficiency and scalability, in terms of database storage size, will currently installed relational database technology provide in the face of this exponential increase in web usage? The simple fact is that an object database is more efficient and more compact than a relational database, as long as developers do not try to impose a relational approach onto an object database. This will happen; it is all part of the learning curve. The introduction of new technology is always a painful process, particularly for the personnel undergoing the change. New technology is usually adopted only when cost pressures make it desperately necessary, and it is always financially expedient to rely on old and trusted technologies and get the most use out of them.

Oracle is supposed to be capable of this form of scalability. I have also heard Oracle DBAs state that Oracle 8i is an object database. A relational database cannot by magic become an object database. Oracle would have to be completely rewritten, a monumental task when other companies already have industrial-strength production object databases boxed, wrapped and generally available, with all the front-end tools, bells and whistles.

Oracle 8i being an object database is probably a physical and logical impossibility, since the internal structures and underlying theory of relational and object database technology are completely different. Oracle is not an object database; it is an object-relational database, that is, a relational database with some object features included. In my experience I have not come across a single Oracle installation which actually uses these newly included object features.

The intention here is not to denigrate Oracle, or any other relational database. There are probably more Oracle installations worldwide than all other relational databases put together. If so many people and companies use Oracle then Oracle must be the best and most versatile relational database product available. Many DBAs would disagree, myself once included, but from the point of view of management, Oracle personnel with appropriate skills are available; at a price of course, but available. Oracle has been tremendously successful in marketing its product and servicing its clientele.

However, are the relational database vendors perhaps ignoring object database technology, just as network and hierarchical database vendors ignored the advent and possible ascent of relational database technology in the mid-1970s and early 1980s? Are relational database vendors ignoring object database technology at their peril? Are the relational database vendors not the organisations best placed to develop new object database technology? Some vendors have done so already. Computer Associates, owner and marketer of the Ingres/OpenIngres relational database, has a production object database called Jasmine. To me Jasmine looks like quite a seriously hot piece of software. The core of Jasmine, the object database itself, is an object developer's dream come true!

The biggest problem is the change in approach from relational to object. Most developers are not aware that an object database is much more than a database repository. An object database is also, in itself, a processor, much like the stored procedures (database procedures) of a relational database. Object database methods are far more powerful than stored or database procedures, for two reasons. First, methods are not restricted to SQL commands. Second, methods are attached to objects and, because they specialise, can operate on much smaller amounts of data. In a relational stored or database procedure, SQL commands apply to whole tables: even an SQL WHERE clause must first locate its starting point within the entire table.

The Evolution of Database Management Systems

Over the last forty years many changes have occurred in DBMS technology, and ODBMSs are the latest evolution. Many older database technologies are still in use, for instance hierarchical and network DBMSs. ODBMSs will not necessarily replace existing DBMS installations; that did not happen when relational DBMSs were introduced either. Every change presents the same questions.

Relational databases perform a specific task and perform it adequately, if not well. Object database technology can improve on the performance of relational technology where complex data is involved. The same questions were asked about relational databases in the early 1980s as are being asked about object databases today. Most people prefer what is tried and tested. One knows that the older technology works properly, and one feels more comfortable with what is already understood. Only the most pioneering of spirits indulge in trying something new when an unknown number of man-hours of development and maintenance could be at stake. It is also in the nature of human psychology to resist change.

What to Consider when Choosing between a Relational and an Object Database

Object concepts run contrary to the experience of people who have worked with relational DBMSs. It has apparently been shown that correctly constructed object implementations can be up to 100 times faster than their relational counterparts. Object systems also require much less code, since abstraction and inheritance allow extensive reuse of code placed in methods. This is because the structure of an object hierarchy can itself take care of many of the additional, and generally excessive, entities and connections required in a relational structure.

The advantage of object over relational databases is that object databases are far more effective at handling complex data.

Object Model Concepts

  1. Data Abstraction

    In a real-world situation a large amount of information must be distilled down to its basic essentials. This is the process of data abstraction. Abstraction is the construction of an abstract model of a real-world business environment which meets the requirements of that business environment.

    1. Object Identification (OIDs)

      In the object model every object instance has a unique identity. In the jargon this identity is said to be persistent, meaning that it can never be altered: the object will always be accessible through that unique identifier and will never go astray unless deliberately deleted. Unique object identifiers are also independent of the data contained within an object, so object data can be altered without affecting the identifier.

    2. Relationships

      In a relational model, relationships have to be constructed. Relational database technology is based on mathematical set theory (relational algebra), in which a relation is a table of rows; relationships between entities are represented by key values stored within those entities. Object databases directly support relationships as part of the object structure. In other words, relationships between objects are supported automatically: no additional objects are needed purely to hold inter-object relationships, whereas a relational database needs additional entities for that purpose.

    3. Classes

      Object instances are grouped together into classes. A class is a grouping of objects which have similarities. Two interesting factors come to mind at this point.

      • Specialisation of a class - A new class is always defined in terms of another class. This means one could create a number of subclass specialisations of a primary class. These subclasses represent different types of the primary class. Specialisation removes the necessity for type codes in an object database. Relational databases rely heavily on type codes, which means much time-wasting cross-referencing between type codes, type code entities and the actual data of the required type, in addition to higher storage space requirements. Specialisation also allows polymorphism. Polymorphism is the name given to the process of applying differing functionality, in the form of methods, to specialised subclasses: the subclass method has the same name as the superclass method but behaves differently.

      • Abstraction of a class - The most interesting aspect of abstraction, other than data abstraction, is abstraction of functionality. Personally I prefer to distinguish between data abstraction and functional abstraction.

        • Data abstraction implies the reduction of data into its most basic constituent parts, similar to normalisation in relational databases. It is, however, much less complex in object databases, since object data abstraction stays much closer to real-world situations than relational normalisation does.

        • Abstraction of functionality - This works in the opposite direction to polymorphic method specialisation of subclasses: method functionality can be generalised from specialised subclasses up to parent classes, thereby reducing code and, in turn, potential maintenance costs.

  2. Encapsulation

    Encapsulation refers to the concept of including processing or behavior with the object instances defined by a class. Encapsulation allows code and data to be packaged together. Data is defined within the attributes of an object. Processing or functionality is included within an object by use of methods defined for its class.

  3. Inheritance

    Inheritance is the means by which one class is defined in terms of another class. A subclass is defined in terms of a parent class to produce a subclass specialisation of the parent class.

  4. Reuse

    One of the most important challenges in designing class definitions is ensuring that classes and their encapsulated methods can be reused. Reuse does not necessarily mean reuse of classes in future implementations of similar software. Reuse should generally mean the ability of an object design to abstract data (attributes) and functionality (methods) upwards from specialised classes to parent classes.
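The object model concepts above can be sketched in a few lines of ordinary Python. This is only an in-memory illustration, not the API of Jasmine or any other ODBMS: the class names (Account, SavingsAccount, BusinessAccount), the fee rules and the `_next_oid` counter are all hypothetical, and in a real object database the OID would be assigned and persisted by the database itself.

```python
import itertools

# Hypothetical stand-in for the OID generator an ODBMS would provide.
_next_oid = itertools.count(1)


class Account:
    """Parent class: attributes (data) and methods (behaviour) are packaged together."""

    def __init__(self, owner, balance):
        self.oid = next(_next_oid)  # identity is assigned once, not derived from the data
        self.owner = owner
        self.balance = balance

    def monthly_fee(self):
        # Default behaviour; subclasses may specialise it (polymorphism).
        return 5.0

    def statement(self):
        # Functionality generalised into the parent class and reused by every subclass.
        return f"{type(self).__name__} #{self.oid}: fee {self.monthly_fee():.2f}"


class SavingsAccount(Account):
    """Specialisation: the class itself says what kind of account this is (no type code)."""

    def monthly_fee(self):
        return 0.0


class BusinessAccount(Account):
    def monthly_fee(self):
        return 5.0 + self.balance / 1000


accounts = [SavingsAccount("Ann", 1000.0), BusinessAccount("Bob Ltd", 20000.0)]

# The same method name gives different behaviour depending on the subclass.
fees = [account.monthly_fee() for account in accounts]

# Changing the data leaves the identity untouched.
ann = accounts[0]
oid_before = ann.oid
ann.owner = "Ann Smith"
oid_after = ann.oid

for account in accounts:
    print(account.statement())
```

Note that `statement` is written once in the parent class and is never overridden, which is the upward abstraction of functionality described under Reuse, while `monthly_fee` is specialised per subclass.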

Complex Data

Complex data is better dealt with in an object database than in a relational database. One of the most important results of implementing object methodology is the burial of complexity within the different levels of an object hierarchy. We will now show, firstly, what complex data is and, secondly, how an object design resolves this complexity better than a relational design.

How does one identify the existence of complex data in a real-world business or commercial environment?
  1. Lack of unique identification

    National Insurance numbers were created in the United Kingdom after World War II as a way of uniquely identifying people. People who died before the end of World War II can only be found through their relationships with family members who were alive when National Insurance numbers were created. Object identifiers (OIDs) in an object database provide unique and persistent (permanently stored and not alterable) identification of this otherwise inaccessible information. In a relational database keyed on National Insurance numbers, persons without a National Insurance number cannot be accessed uniquely; one can only retrieve all the deceased family members of a specified person as a group, since there is no unique reference.

  2. Many-to-many relationships

    The relational model does not allow for the repeating groups which occur in many-to-many relationships. These relationships must be reduced (normalised), otherwise the items involved can never be accessed uniquely. Thus two entities related to each other on a many-to-many basis are normalised into three entities, not two. This is cumbersome. An object database can store many-to-many relationships directly, as part of the data abstraction process.

  3. Data access using traversals

    Access using traversals occurs where data instances are traversed across a database. An object structure can handle data traversal much more efficiently than a relational structure, since an object database is stored as a hierarchy of connected relationships inherent in the object structure itself. A relational database would have to store numerous literal values to show what is related to what. A traversal is, in general, a search through the nodes of a tree-like structure, that is, a hierarchical structure where a child node is directly accessible from a parent node and vice versa. Parent and child (inherited) classes are directly accessible when traversal is required. A relational database would have to store links between parent and child nodes explicitly and provide explicit search procedures between the entities storing those links. Cumbersome!

    [Figures: depth-first traversal; width-first (breadth-first) traversal]

  4. Frequent use of type codes

    Frequent use of type codes requires specific application code plus extra stored data in order to classify instances within a relational database. In an object database, an object instance is identified as belonging to a class, and the class itself plays the role of the type code. There is therefore no need for type codes in an object database: a class is effectively a type.
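The two traversal orders mentioned under item 3 can be sketched as follows. This is a minimal in-memory illustration in Python, assuming each parent holds direct references to its children as an object structure would; the Person class, its `add_child` helper and the sample family tree are hypothetical and do not represent any particular ODBMS.

```python
from collections import deque


class Person:
    """Hypothetical node class: children are held as direct object references."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def add_child(self, child):
        self.children.append(child)
        return child


def depth_first(root):
    """Visit a node, then the whole subtree of each child in turn."""
    order = [root.name]
    for child in root.children:
        order.extend(depth_first(child))
    return order


def width_first(root):
    """Visit every node on one level before moving down to the next level."""
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node.name)
        queue.extend(node.children)
    return order


# Sample tree: A has children B and C; B has children D and E; C has child F.
a = Person("A")
b = a.add_child(Person("B"))
c = a.add_child(Person("C"))
b.add_child(Person("D"))
b.add_child(Person("E"))
c.add_child(Person("F"))

print(depth_first(a))  # ['A', 'B', 'D', 'E', 'C', 'F']
print(width_first(a))  # ['A', 'B', 'C', 'D', 'E', 'F']
```

Both functions simply follow the stored references from parent to child; no separate link table or lookup by key value is involved, which is the point the text makes about traversal in an object structure.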

Complex data is generally best handled by an ODBMS. Thus identification of the existence of complex data is paramount.