When in Rome

A Guide to the Java Paradigm

Richard Deadman

In this article, I discuss two classes of problems faced by developers. The first is what I call "Conceptual Confusion", where a user of one language carries their assumptions to another language and then gets confused when their assumptions are invalid. The second class is the desire for new features to be added to the language. This "Creeping Featurism" generally involves adding complexity to the language for the sake of mimicking another language's feature, often no more than syntactic sugar. That is, the proposed feature may reduce the typing without adding to the power of the language.

Let me warn you of my bias: I find that too many features in a language confuse me. I find that a simple language based on a single paradigm provides for less confusion, better maintainability and quicker code development. You may understand that neat "constant reference" feature, but think of the person who may have to fix, reuse or extend your code a year from now.

Conceptual Confusion

Virtual Methods

class Foo {
    public String toString() {
        return "An instance of class Foo";
    }
}

class Bar extends Foo {
    public String toString() {
        return "An instance of class Bar";
    }
}

public class FooBar {
    public static void main(String[] args) {
        // The handle is typed as Foo, but the object it points to is a Bar.
        Foo myPersonalFoo = new Bar();

        // println calls toString(), which is bound at run time to Bar's version.
        System.out.println(myPersonalFoo);
    }
}

It should be up to the object to determine how messages sent to it will be handled. Note that this is the opposite of how static methods are bound in Java. More on this later.

Since C++ methods are non-virtual by default, some C++ programmers inadvertently assume that early binding takes place and then do not understand the behaviour of their code.

All the compiler checks is that the message is supported by the handle's object type. To send a sub-type's message to an object handle which does not support that message, you must cast the handle to the proper sub-type:

    import java.rmi.server.*;
    ...
    Object vectorContent = aVector.firstElement();
    if (vectorContent instanceof ServerRef) {
        // Cast the handle first, then send the sub-type's message to it.
        System.out.println("Remote call from " +
            ((ServerRef) vectorContent).getHostName());
    }

Pass-by-Reference

This is identical to Smalltalk's parameter passing but contrasts significantly with the C++ allowed modes:

pass-by-reference. As in Smalltalk and Java, but involves passing the pointer to an object explicitly. The receiver must then explicitly de-reference the pointer. If you don't "get" pointers, you shouldn't be playing with C and C++.
pass-by-value. A copy is made of the parameter. In Java the equivalent can be accomplished by cloning the object before sending it.
pass-by-reference but seem like a value. This is the most confusing C++ addition, the "Reference". Here you tell the method that you are passing a reference to the object but you hide the syntax to make it seem like the object was passed as a value.

(Note: With RMI, the rules change for distributed objects. All parameters which are not themselves remote handles are serialized between the virtual machines, effectively implementing a pass-by-value paradigm for non-Remote objects.)
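For the ordinary, non-RMI case, the following is a minimal sketch of what passing an object handle means in practice. The class and method names (Counter, HandleDemo, increment, replace) are hypothetical and chosen only for illustration. The receiver can change the object the caller points to, but re-pointing its own copy of the handle has no effect on the caller.

```java
// Hypothetical class used only for this illustration.
class Counter {
    int count = 0;
}

class HandleDemo {
    // The method receives the caller's handle, so it changes the
    // very object the caller is pointing to.
    static void increment(Counter c) {
        c.count++;
    }

    // Re-pointing the local handle at a new object does not affect
    // the handle held by the caller.
    static void replace(Counter c) {
        c = new Counter();
        c.count = 100;
    }

    public static void main(String[] args) {
        Counter mine = new Counter();
        increment(mine);
        System.out.println(mine.count); // prints 1
        replace(mine);
        System.out.println(mine.count); // still prints 1
    }
}
```

If the caller wants to be sure its object is left untouched, it has to hand over a copy, which is the cloning approach mentioned in the list above.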

Class/Static Methods

addInstanceToCache
    self class addInstanceToCache: self.

addInstanceToCache"

Smalltalkers moving to Java often mistake "static" methods and variables for the equivalent of "class" methods and variables in Smalltalk. This is not strictly true. While "static" methods and variables are bound to a class and not an instance, they are resolved "statically" by the compiler, against the type of the variable (or the named class). If the variable type and the instance class are different, the static method or variable is not overridden at run-time by the instance's class. Hence:

class Bar {
    static String getName() { return "Bar"; }
}

class Baz extends Bar {
    static String getName() { return "Baz"; }
}

public class Foo {
    public static void main(String[] args) {
        Bar myBar = new Baz();
        System.out.println(myBar.getName());
    }
}

Here is one of Java's great inconsistencies: instance methods involve late binding but class methods involve early binding. Think of static as meaning "bound statically to the class at compile time". Now write it out fifty times.
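To see the two binding rules side by side, here is a small sketch that reuses the Bar and Baz class names from above but adds a pair of hypothetical methods, staticName and instanceName, that differ only in the "static" keyword. BindingDemo is likewise just an illustrative name.

```java
class Bar {
    static String staticName() { return "Bar"; }
    String instanceName() { return "Bar"; }
}

class Baz extends Bar {
    // Hides Bar.staticName(); it does not override it.
    static String staticName() { return "Baz"; }

    // Overrides Bar.instanceName().
    String instanceName() { return "Baz"; }
}

class BindingDemo {
    public static void main(String[] args) {
        Bar myBar = new Baz();

        // Early binding: resolved by the compiler against the variable type, Bar.
        System.out.println(myBar.staticName());   // prints "Bar"

        // Late binding: resolved at run time against the instance's class, Baz.
        System.out.println(myBar.instanceName()); // prints "Baz"
    }
}
```

Writing the static call as Bar.staticName() rather than through the variable makes the compile-time binding visible in the source, which is one way to avoid the confusion in the first place.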

A similar problem exists in the understanding of Interfaces...

No Static Interface Methods

But people are often confused as to why interfaces can define static variables but cannot define static methods. After all, what is the purpose of a static variable if there is no static behaviour? And why can't I use an interface to declare that all classes which implement this interface will implement these static methods?

Well, the answer gets back to understanding how static variables and methods work. They are bound at compile time against the class identified either by class name or variable type. For variables this means that you can actually use interface state:

char doneCharacterForMyText = java.text.CharacterIterator.DONE;

For methods, what does it mean to define a static method in the interface? Well, since static method calls are always bound to the type of the variable, it means that any call to these static methods through a variable declared as the interface type will try to invoke the interface's static method -- and since by definition interfaces cannot have behaviour, this would be a problem.

Some argue that this is an argument for static methods and variables being dynamically bound, but that moves us into the next section...

Creeping Featurism

Here is a rundown of some of the features I have heard cries for within the last year:

First class methods

Justification:

Rebuttal:

Define a notification interface that the notifier understands and uses to notify clients. Now the object that needs to be notified can implement this interface and be passed to the notifier, or an inner class can be used to create an adapter to the object that needs to be notified. This is the basis of Java's Observer/Observable system as well. In fact, we can now provide more than one notification method simply, which would otherwise require sending multiple callback function pointers.

An added benefit of registering objects through an observable pattern is that "user data" does not have to be registered with the event source. This context information can be saved with the observer object without having to break encapsulation and expose the data to the observable, which has no intrinsic interest in the data. As well, a single instance can create and register multiple observer adapters for each context/event it is interested in.
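The approach described above might look something like the following sketch. All of the names here (TemperatureListener, Thermometer, Display, NotifyDemo) are hypothetical and are not taken from any Java library. The label passed to watch() plays the role of the "user data": it lives in the adapter and is never shown to the event source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical notification interface that the notifier understands.
interface TemperatureListener {
    void temperatureChanged(int newValue);
    void sensorFailed(String reason); // a second notification costs nothing extra
}

// Hypothetical event source (the notifier).
class Thermometer {
    private final List<TemperatureListener> listeners =
        new ArrayList<TemperatureListener>();

    void addListener(TemperatureListener listener) {
        listeners.add(listener);
    }

    void report(int value) {
        for (TemperatureListener listener : listeners) {
            listener.temperatureChanged(value);
        }
    }
}

// The object that wants to be notified. It registers one inner-class
// adapter per source; each adapter keeps its own context (the label).
class Display {
    final List<String> lines = new ArrayList<String>();

    void watch(Thermometer source, final String label) {
        source.addListener(new TemperatureListener() {
            public void temperatureChanged(int newValue) {
                lines.add(label + ": " + newValue);
            }
            public void sensorFailed(String reason) {
                lines.add(label + " failed: " + reason);
            }
        });
    }
}

class NotifyDemo {
    public static void main(String[] args) {
        Thermometer indoor = new Thermometer();
        Thermometer outdoor = new Thermometer();
        Display display = new Display();

        display.watch(indoor, "indoor");
        display.watch(outdoor, "outdoor");

        indoor.report(21);
        outdoor.report(-3);
        System.out.println(display.lines); // prints [indoor: 21, outdoor: -3]
    }
}
```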

Auto-delegation

Justification:

Rebuttal:

This is not to say that the automatic generation of code does not have its place. IDEs use code generation to create GUIs; the BeanBox uses code generation to glue components together using adapter classes. The difference is that here the code generation is part of a tool and not part of the language. An automatic interface delegate tool would be a useful addition to an IDE toolset. The generated code would be Java source code and could be intelligently managed within the scope of the application builder.
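As an illustration only, and assuming that "auto-delegation" here means forwarding an interface's methods to a contained object, this is the kind of source code such a tool might generate. Logger, MemoryLogger, Service and DelegationDemo are hypothetical names invented for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface to be delegated.
interface Logger {
    void log(String message);
    int count();
}

// Hypothetical class that does the real work.
class MemoryLogger implements Logger {
    private final List<String> messages = new ArrayList<String>();

    public void log(String message) { messages.add(message); }
    public int count() { return messages.size(); }
}

// Service supports the Logger interface by forwarding every call to a
// delegate. The two forwarding methods are the boilerplate that an IDE
// tool could write out as ordinary Java source.
class Service implements Logger {
    private final Logger delegate = new MemoryLogger();

    public void log(String message) { delegate.log(message); }
    public int count() { return delegate.count(); }

    void run() {
        log("started");
        log("finished");
    }
}

class DelegationDemo {
    public static void main(String[] args) {
        Service service = new Service();
        service.run();
        System.out.println(service.count()); // prints 2
    }
}
```

Because the forwarding methods are plain source, they can be read, debugged and selectively replaced like any other code, with no new language semantics to learn.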

Support for constant parameters

Justification:

Rebuttal:

Cloning. Here you clone the object before passing it to ensure that the receiver has a copy that is decoupled from the object you are pointing to. This is expensive but much safer than the C++ "const" feature. Note, however, that by default cloning in Java involves a shallow copy. Shallow copy means that only the object itself is copied, not any objects contained within it as instance variables. Since all variables are either handles to other instances or base types, if the referenced objects are not also cloned, the original instance and its shallow copy will both contain handles to the same entities. So the original object and the clone I passed you will share any contained objects, leaving the possibility of some shared state. Implementing your own deep-copy (which recursively clones all state to some specified level) is recommended if this is a serious problem. A simple form of deep-copy is performed by Java's Serialization facility.
Protected interface. Write an interface which does not modify your object's state and implement the interface within your object. If the object is passed as a type of this interface, the effect is equivalent to the C++ "const" feature -- that is, you have protection but it can be cast aside.
Protection Proxy (Gamma et al., p. 207). Wrap up the object within a protection proxy object which checks and disallows certain access on the object. Here the whole interface is exported, but some methods will throw a "Disallowed" exception. This also allows for capabilities-based dynamic authorization (i.e. user access can be more finely controlled). This protection strategy is probably the most secure, but may be the most work.
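The difference between a shallow and a deep copy, as described under the cloning option, can be sketched as follows. Order, shallowCopy, deepCopy and CloneDemo are hypothetical names; the deep copy shown goes only one level down, which is enough here because the contained strings are immutable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class with one contained object (the items list).
class Order implements Cloneable {
    String customer;
    List<String> items = new ArrayList<String>();

    Order(String customer) {
        this.customer = customer;
    }

    // Default cloning: Object.clone() copies the fields only, so the
    // copy holds a handle to the very same items list.
    Order shallowCopy() {
        try {
            return (Order) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: Order is Cloneable
        }
    }

    // One level of deep copy: the contained list is copied as well.
    Order deepCopy() {
        Order copy = shallowCopy();
        copy.items = new ArrayList<String>(items);
        return copy;
    }
}

class CloneDemo {
    public static void main(String[] args) {
        Order original = new Order("alice");
        original.items.add("book");

        Order shallow = original.shallowCopy();
        Order deep = original.deepCopy();

        // Change the original after the copies were handed out.
        original.items.add("pen");

        System.out.println(shallow.items.size()); // prints 2: the list is shared
        System.out.println(deep.items.size());    // prints 1: the list was copied
    }
}
```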

In/Out parameters

Justification:

Rebuttal:

As with many other issues on this list, this is an example of using OO patterns and techniques to solve problems instead of making the language more complex and incorporating the techniques into the language semantics. Kind of like RISC for programming languages.

Inlined getters and setters

Justification:

Rebuttal:

Note, however, that inlining may speed up your code at the cost of larger class files: a classic computing time-space trade-off. Large class files may mean greater download times and even, if your memory is running low, slower performance due to memory swapping. Optimizing your code is rarely as simple as it first seems and often has side effects (such as extensibility and support problems). There are many rules of optimization, but I like:

Don't do it unless absolutely necessary
Don't do it yet
If you do optimize, make sure you're optimizing the system bottlenecks (run your system through a call tracer)

Multi-valued return parameters

Justification:

Rebuttal:

Enough with excuses. Multi-value return is sometimes useful. But if you are returning a collection of related objects, maybe you should take a more OO approach and encapsulate the data into an object. If the data is really unrelated and creating an encapsulation class doesn't make sense, you can always use arrays, Vectors or some other generic collection class.
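Both suggestions can be sketched briefly. The names below (DivisionResult, MultiReturnDemo, divide, nameAndAge) are hypothetical and exist only for the example: the first method returns related values wrapped in a small class, the second returns unrelated values in a plain array.

```java
// Hypothetical result class that encapsulates two related values.
class DivisionResult {
    final int quotient;
    final int remainder;

    DivisionResult(int quotient, int remainder) {
        this.quotient = quotient;
        this.remainder = remainder;
    }
}

class MultiReturnDemo {
    // Related values: return them together as one object.
    static DivisionResult divide(int dividend, int divisor) {
        return new DivisionResult(dividend / divisor, dividend % divisor);
    }

    // Unrelated values: fall back on a generic container such as an array.
    static Object[] nameAndAge() {
        return new Object[] { "Ada", Integer.valueOf(36) };
    }

    public static void main(String[] args) {
        DivisionResult result = divide(17, 5);
        System.out.println(result.quotient + " remainder " + result.remainder);
        // prints "3 remainder 2"

        Object[] pair = nameAndAge();
        System.out.println(pair[0] + " is " + pair[1]); // prints "Ada is 36"
    }
}
```

The result class costs a few extra lines, but the caller gets named, typed fields instead of having to remember which array slot holds what.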

Dynamic "static"

Justification:

Rebuttal:

Satiric Java Feature Request List

I first posted this list to a mailing list in December of 1996 in response to a thread advocating the addition of Lisp-style multiple return values to Java methods. Additional contributions have been noted.

Golly Gosh, now that we have spent time analyzing the syntactic sugar needed to add multiple-return values to the language, I have some other suggestions. I have programmed in Assembler, Machine (PDP-11), Fortran-77, COBOL, BASIC and other powerful languages and would like to add some other features to Java that will occasionally make it easier to think in my old ways.

Here are some features that Java needs now to become a serious programming language:

Goto statement. There's the code I want to do in that other class, so why can't I just:

goto OtherClass.method()::line;

Push and Pop. Heck, in-line assembly code should be allowed. I know where those values I want are, let me get at 'em.
Turn off garbage collection and add free()/delete(). I can do a better job than any compiler/VM.
Pointers. Please, I finally understand these. Give 'em back.
Compile-time platform optimization, header files, #define, #pragma.
More basic type modifiers. Microsoft has the right idea:

unsigned long FAR PASCAL *data.

Friend classes, methods, variables, Vector components. I would like to set up a Vector and say that only some of its members are available to other classes.

i.e. Vector data = new Vector(3, 1::private, 2::friend, 3::public);

First class methods. Who needs OO after all?
More passing mechanisms. Add Pass-by-reference, pass-by-value, pass-by-reference-but-look-like-a-value (C++ reference).
Multiple inheritance. Never used it, but someday I may...
Enforced Hungarian notation, which allows us to avoid declaring variables, something I liked in Fortran. Instead of:

UserInfo user = new UserInfo();

we could write:

cUserInfo_data = new UserInfo();

Implicit typing. The first time a cUserInfo_data variable is used, the default constructor is automatically called.
"Extended Hungarian notation" that could automatically define the class within the variable name.

More fluff words so that the code is more readable, à la COBOL:

add iNative_first to iNative_second giving iNative_result;

</satire>

Conclusion

In an effort to keep the language as simple and clean as possible, the Java language designers purposely chose to provide a simple programming paradigm. Some exceptions were made, notably the inclusion of basic data types which are not objects. But overall the language is simple, elegant and clean.

Programmers used to solving problems using the syntactic features of other languages often pine for the adoption of those features in Java. What they miss is that there is a conceptual cost to adding those features, both in complexity and in paradigm. This is particularly difficult for C++ programmers since the syntax of Java was purposely modelled on the syntax of C++, and this often leads new Java programmers to bring their C++ mindset with them to Java.

To write well-architected Java programs, a designer must have a good understanding of the language paradigm, remember to think of patterns and not techniques, and remember to ask herself "How do I solve this problem?" rather than "How do I apply this technique native to my old language?"