TOPLink

A Step beyond Object to Data Mapping

Richard Deadman

Abstract

As Java moves beyond cute dancing applets to become a platform for distributed multi-platform computing, the race to supply the infrastructure services is starting. This includes not only distributed "CORBAish" management services (Naming, Event Channel, Security, Trader, Administration) but also facilities for managing the data that is central to corporate applications. Several Object to Relational Database mapping frameworks are now becoming available, but The Object People's TOPLink/J™ goes beyond just mapping the data to also managing it.

Modern OO language, meet entrenched relational database. You've still got one, haven't you? Object-oriented databases have been around for years but are not always an available solution: sometimes for legacy reasons, sometimes because of perceived immaturity, sometimes because of the skills available, sometimes for lack of a mature OODB query standard. Whatever the reason, the call has been made -- a relational database it is.

You assign your best modellers to designing the logical object model and sit down yourself to solve the physical issues -- object mapping, transactions, caching, object handle faulting, object identity. Suddenly you realize that you are going to have to build your own one-off proprietary pseudo-OO database on top of JDBC. And doing so will involve 40-60% of the non-GUI coding effort. Surely, hopefully, someone else has run into this before.

A sure indicator of the seriousness with which Java is entering the corporate IT market is the number of Java Object to JDBC table mapping tools that are emerging on the market (Sun's upcoming JavaBlend, Thought's CocoBase, Novera's EPIC dbBlend, ChiMu's FORM, O2's JRB, Objectmatter's BSF, Software Tree's JDX, 2Link Consulting's dbGen, CrossLogic's Universe). Some are new, some are available for free, but one of the most complete turns out to be a product that isn't new at all, not at its roots anyway. The Object People's TOPLink/J™ is both new and old. While not perfect, it offers a wealth of features that make the management of data in serious corporate applications much easier. Presently in Beta (as of February, 1998), TOPLink/J, if priced similarly to its Smalltalk cousin, will not be a cheap product but will pay for itself in development time, effort, stability and maintainability. The key to its advanced standing is that The Object People have done it all before.

Before Java was a coined name for Sun's Oak project, The Object People were mentoring the building of pure OO Smalltalk systems by large corporate customers. They saw a need for a tool to map objects into and out of the legacy relational databases that were firmly entrenched at many customers' sites. And so, for over five years, they have been selling a tool for the Smalltalk market called TOPLink. In this time, they extended the tool and its framework to support the management of objects as well as the mapping of them to database tables. As a result, TOPLink supports not only object mapping, but caching, object identity, faulting, transactions, units of work and some three tier support.

Object Mapping

The price of all this flexibility is, as always, a bit of extra complexity. While the direct mapping method is simple to use, understanding all six mapping methods and when to use them will take some work. Unlike its ancestor, TOPLink for Smalltalk, TOPLink/J must work within the confines of the Java language. Since TOPLink/J is simply another class library, it must rely on public instance variables or public accessors for those variables. The builder tool at present does not support the most complex transformational mapping scheme, so this must be specified through the Java API calls of the framework. As well, the builder tool has the unfortunate habit of forgetting your accessor method and defaulting to direct variable access at times.
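The public-variable-or-accessor constraint is easy to see with plain Java reflection. The following is a minimal sketch, not TOPLink code: `FieldMappingSketch`, `Widget`, `readDirect` and `readViaAccessor` are hypothetical names invented for illustration. It shows the two ways an ordinary class library can pull an attribute value out of an object before writing it to a column, and why a private variable without an accessor is out of reach.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

final class FieldMappingSketch {

    // Hypothetical domain class: one public variable, one private variable with a public accessor.
    public static final class Widget {
        public String description;
        private final int quantity;

        public Widget(String description, int quantity) {
            this.description = description;
            this.quantity = quantity;
        }

        public int getQuantity() { return quantity; }
    }

    // Direct variable access: Class.getField only finds public fields.
    static Object readDirect(Object target, String fieldName) {
        try {
            Field field = target.getClass().getField(fieldName);
            return field.get(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no public field " + fieldName, e);
        }
    }

    // Accessor access: invoke a public no-argument getter by name.
    static Object readViaAccessor(Object target, String getterName) {
        try {
            Method getter = target.getClass().getMethod(getterName);
            return getter.invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no public accessor " + getterName, e);
        }
    }

    public static void main(String[] args) {
        Widget w = new Widget("left-handed sprocket", 12);
        System.out.println(readDirect(w, "description"));      // public variable
        System.out.println(readViaAccessor(w, "getQuantity")); // private variable via accessor
    }
}
```

Asking `readDirect` for the private `quantity` variable fails, which is exactly the situation in which a mapping would have to name an accessor instead.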

Caching and Object Identity

To ensure that the objects in the cache stay in synch with the database, TOPLink allows the use of a version field. The framework can then query the database for the version number and determine whether it must update a cached value before returning it to the user. As is the pattern with TOPLink, several database consistency check options are provided. For instance, if you know that no other programs are updating the database, version consistency checking can be turned off. At the other end of the spectrum, the database can always be read, in which case the cache is only useful for preserving object identity.
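The three consistency options can be sketched in a few lines. This is an illustration under stated assumptions rather than TOPLink's implementation: `VersionedCacheSketch`, its `Policy` values and the in-memory `FakeTable` standing in for the database are all hypothetical. The point is only the decision made on each lookup: trust the cache, compare version numbers, or always re-read.

```java
import java.util.HashMap;
import java.util.Map;

final class VersionedCacheSketch {

    // Hypothetical names for the three options described in the text.
    enum Policy { TRUST_CACHE, CHECK_VERSION, ALWAYS_READ }

    // A stored row: payload plus a version number bumped on every update.
    static final class Row {
        final long version;
        final String value;
        Row(long version, String value) { this.version = version; this.value = value; }
    }

    // In-memory stand-in for a database table (a real one would sit behind JDBC).
    static final class FakeTable {
        private final Map<Integer, Row> rows = new HashMap<>();
        int fullReads = 0; // counts expensive full-row reads

        void write(int id, String value) {
            Row old = rows.get(id);
            rows.put(id, new Row(old == null ? 1 : old.version + 1, value));
        }
        long versionOf(int id) { return rows.get(id).version; } // cheap version-only query
        Row read(int id) { fullReads++; return rows.get(id); }  // full read
    }

    static final class Cache {
        private final Map<Integer, Row> cached = new HashMap<>();
        private final FakeTable table;
        private final Policy policy;

        Cache(FakeTable table, Policy policy) { this.table = table; this.policy = policy; }

        String get(int id) {
            Row hit = cached.get(id);
            boolean stale = hit == null
                    || policy == Policy.ALWAYS_READ
                    || (policy == Policy.CHECK_VERSION && hit.version != table.versionOf(id));
            if (stale) {
                hit = table.read(id);
                cached.put(id, hit);
            }
            return hit.value;
        }
    }

    public static void main(String[] args) {
        FakeTable table = new FakeTable();
        table.write(1, "red");
        Cache cache = new Cache(table, Policy.CHECK_VERSION);
        System.out.println(cache.get(1)); // first access: full read
        System.out.println(cache.get(1)); // versions match: served from the cache
        table.write(1, "blue");           // another program updates the row
        System.out.println(cache.get(1)); // version differs: the row is re-read
        System.out.println("full reads: " + table.fullReads);
    }
}
```

With `TRUST_CACHE` the third lookup would still return the old value, which is only safe when nothing else writes to the database.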

Faulting

So your object model specifies large trees (or even cyclic graphs) of object relationships. TOPLink can do the mapping to read these graphs into memory, but what if you just want to view the widget description without also reading (at considerable cost) all the other objects which are reachable from an instance of Widget? One way is to replace the object pointers in your object model with indirect references which can be used to find related objects using a directory or factory object. A more transparent and elegant way, however, is to use a pattern found in other object-oriented repositories, such as Gemstone. The trick is to use a place-holder proxy which "faults" in the rest of the tree when it is first accessed. TOPLink for Smalltalk leverages some features of that language which are not to be found in Java. Gemstone accomplishes this in Smalltalk and Java by providing its own VM, which creates a faulting proxy transparently. Since TOPLink is designed to work on all Java VMs, the designers were forced instead to use a less transparent mechanism and require the object model to use an instance of a ValueHolder instead of the referenced object within the object's state (instance variables). This ValueHolder is then hidden from the object model by resolving the ValueHolder to the real object within an accessor method.
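The holder-plus-accessor pattern looks roughly like the sketch below. It is not TOPLink's ValueHolder class: `LazyValueHolder`, `getValue` and the `Supplier`-based loader are hypothetical stand-ins chosen for illustration, and the loader here returns a fixed list where a real one would run a query.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

final class ValueHolderSketch {

    // Hypothetical place-holder: runs its loader on first access, then remembers the result.
    static final class LazyValueHolder<T> {
        private Supplier<T> loader;
        private T value;
        private boolean loaded;

        LazyValueHolder(Supplier<T> loader) { this.loader = loader; }

        synchronized T getValue() {
            if (!loaded) {
                value = loader.get(); // the "fault": fetch the related objects now
                loaded = true;
                loader = null;        // the loader is no longer needed
            }
            return value;
        }

        synchronized boolean isLoaded() { return loaded; }
    }

    // Domain class: the holder lives in the instance variable, the accessor hides it.
    static final class Widget {
        private final String description;
        private final LazyValueHolder<List<String>> parts;

        Widget(String description, Supplier<List<String>> partsLoader) {
            this.description = description;
            this.parts = new LazyValueHolder<>(partsLoader);
        }

        String getDescription() { return description; }       // never touches the parts
        List<String> getParts() { return parts.getValue(); }  // resolves the holder on demand
        boolean partsLoaded() { return parts.isLoaded(); }
    }

    public static void main(String[] args) {
        Widget w = new Widget("gear", () -> {
            System.out.println("(faulting in parts)");
            return Arrays.asList("axle", "tooth");
        });
        System.out.println(w.getDescription()); // no fault yet
        System.out.println(w.getParts());       // first access triggers the load
        System.out.println(w.getParts());       // second access reuses the loaded value
    }
}
```

Callers of `getParts()` never see the holder, but the class author does, which is the loss of transparency the text describes.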

Transactions and Units of Work

Perhaps more interesting are the facilities TOPLink provides for Units of Work. A unit of work is much like a transaction except for two points:

The changes are made to deep object copies, meaning that changes will not be seen by other parts of the system sharing the same objects until the changes have been committed.
At commit time, the changed object graphs are compared with the originals to find the deltas, and only the fields that actually changed are written to the database.
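Both points can be sketched with a single flat object. This is a simplified illustration, not TOPLink's API: `UnitOfWorkSketch`, `register`, `commit` and the `Account` class are hypothetical, the "deep copy" is a copy of one object rather than a graph, and the returned map stands in for the columns a real framework would put in its UPDATE statement.

```java
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

final class UnitOfWorkSketch {

    // Hypothetical domain object shared by the rest of the application.
    static final class Account {
        String owner;
        long balanceCents;

        Account(String owner, long balanceCents) {
            this.owner = owner;
            this.balanceCents = balanceCents;
        }

        Account copy() { return new Account(owner, balanceCents); }
    }

    static final class UnitOfWork {
        // Maps each working copy back to the shared original it was cloned from.
        private final Map<Account, Account> originals = new IdentityHashMap<>();

        // Hand out a private copy; edits to it are invisible to other holders of the original.
        Account register(Account shared) {
            Account workingCopy = shared.copy();
            originals.put(workingCopy, shared);
            return workingCopy;
        }

        // Compare copy and original, report only the changed fields, then merge them back.
        Map<String, Object> commit(Account workingCopy) {
            Account shared = originals.remove(workingCopy);
            if (shared == null) {
                throw new IllegalArgumentException("object was not registered in this unit of work");
            }
            Map<String, Object> delta = new LinkedHashMap<>();
            if (!Objects.equals(shared.owner, workingCopy.owner)) {
                delta.put("owner", workingCopy.owner);
            }
            if (shared.balanceCents != workingCopy.balanceCents) {
                delta.put("balanceCents", workingCopy.balanceCents);
            }
            // A real framework would write just these columns to the database here.
            shared.owner = workingCopy.owner;
            shared.balanceCents = workingCopy.balanceCents;
            return delta;
        }
    }

    public static void main(String[] args) {
        Account shared = new Account("ann", 100);
        UnitOfWork uow = new UnitOfWork();
        Account mine = uow.register(shared);
        mine.balanceCents = 250;
        System.out.println("before commit, shared balance: " + shared.balanceCents);
        System.out.println("delta: " + uow.commit(mine));
        System.out.println("after commit, shared balance: " + shared.balanceCents);
    }
}
```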

Three Tier Support

Finally, the Beta of TOPLink for Java contains some support for three-tier access to the database. This support allows multiple clients to connect to a database through a TOPLink server, using different sessions with different security permissions. A single read session is used on the server to allow for a unified cache. While this is useful, managing objects between a server and a client is a similar problem to managing objects between a database and a server -- in both cases caching, faulting and object identity need to be provided.

Limitations and Drawbacks

The lack of a pessimistic locking scheme means that there is no way to guarantee that a transaction, once started, will be able to commit (there never is, but with pessimistic locking, transactions only fail due to network or database failures, not data collisions). While optimistic locking allows TOPLink to avoid nasty lock management issues, it does mean that applications may be forced to make users re-enter whole datasets at times.
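The kind of collision in question can be sketched as follows, assuming a version number is what detects it. The names (`OptimisticLockSketch`, `Store`, `CollisionException`) are hypothetical and the in-memory map stands in for the database; the conditional `update` corresponds in spirit to an SQL UPDATE whose WHERE clause includes the version that was read.

```java
import java.util.HashMap;
import java.util.Map;

final class OptimisticLockSketch {

    static final class Versioned {
        final long version;
        final String value;
        Versioned(long version, String value) { this.version = version; this.value = value; }
    }

    // Hypothetical exception raised when someone else committed first.
    static final class CollisionException extends RuntimeException {
        CollisionException(String message) { super(message); }
    }

    // In-memory stand-in for a table with a version column.
    static final class Store {
        private final Map<Integer, Versioned> rows = new HashMap<>();

        synchronized void insert(int id, String value) { rows.put(id, new Versioned(1, value)); }

        synchronized Versioned read(int id) { return rows.get(id); }

        // Succeeds only if the row still carries the version the caller originally read.
        synchronized void update(int id, long expectedVersion, String newValue) {
            Versioned current = rows.get(id);
            if (current == null || current.version != expectedVersion) {
                throw new CollisionException("row " + id + " changed since it was read");
            }
            rows.put(id, new Versioned(expectedVersion + 1, newValue));
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.insert(7, "draft");
        Versioned seenByAnn = store.read(7); // no locks are taken on read
        Versioned seenByBob = store.read(7);
        store.update(7, seenByAnn.version, "ann's edit"); // first writer wins
        try {
            store.update(7, seenByBob.version, "bob's edit");
        } catch (CollisionException e) {
            System.out.println("Bob must re-read and redo his edit: " + e.getMessage());
        }
    }
}
```

Nothing stops Bob from starting his edit; he only learns at commit time that it cannot be saved, which is the re-entry cost described above.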

At present, TOPLink throws only runtime exceptions -- a legacy of its Smalltalk roots. While this makes sense for truly unpredictable errors, such as faulting-in errors, I would prefer to see transaction and database errors forced to be caught. As well, its current support for distributed computing could be enhanced to support client-side units of work, caching and faulting.

Conclusion

The purpose of this article is not to compare the features, costs and differences of all the object to JDBC mapping products presently available. Such a comparison will of necessity be biased in some way and must be made according to the needs of each project. Instead, the article is a review of one of the most complete products, suitable for industrial-strength Java solutions. While TOPLink is not perfect, it offers data management services that are more sophisticated than those of other products in its class. Depending on the persistence sophistication required by your project, TOPLink may or may not be worth the licensing and training costs. It offers a wealth of features and good defaults, allowing the normal things to be done simply and the abnormal things to be possible. For a project that requires such services, TOPLink will quickly pay for itself. It certainly sets a standard that other products are sure to want to follow.