How to Safely Subtype in Java
Learn how to safely subtype in this in-depth Java tutorial.
As you might remember, the Liskov Substitution Principle is all about promises and contracts. But exactly what promises? It is about guaranteeing safety in subtyping: a subtype has to maintain every guarantee that someone could reasonably infer from the supertype. The relation must also be transitive. In mathematics, we say that for all a, b, c ∈ X, if aRb and bRc, then aRc. In object-oriented programming, subclassing implies subtyping, but not always in the right way (in this article, "subtyping" usually refers to subtyping by subclassing). We have to make sure that we do not break the superclass promises we inherit, and that we do not depend on something that may change outside of our control. If it changes, the objects that depend on it can be affected. In fact, subclassing can even be a source of bugs in a project.
Why Safe Subtyping
More precisely, subclassing is a special kind of subtyping that lets the subtype reuse the supertype's implementation, so that a small modification does not require re-implementing everything. We can say that subclassing is subtyping, but not that subtyping is subclassing. Subclassing brings two things: subtyping (polymorphism) and code reuse. It also has the highest impact: any change to the public or protected members of the superclass affects its subclasses. Subclassing is not always an is-a relationship; only sometimes is it one. In practice, it is a technique for code reuse and a tool for dynamic polymorphism.
Subclassing is only concerned with what is implemented and how, not with what is promised. If you violate the promises of the base class, what will happen? Is there any guarantee that the subclass stays compatible? Even the compiler will not catch this mistake, and you end up with a bug in code such as:
class DoubleEndedQueue {
    void insertFront(Node node) {
        // ...
        // insert a node at the front of the queue
    }

    void insertEnd(Node node) {
        // ...
        // insert a node at the end of the queue
    }

    void deleteFront(Node node) {
        // ...
        // delete the node at the front of the queue
    }

    void deleteEnd(Node node) {
        // ...
        // delete the node at the end of the queue
    }
}

class Stack extends DoubleEndedQueue {
    // ...
}
If the class Stack uses subclassing purely for code reuse, it inherits operations on both ends of the queue, and some of them, such as insertFront, may violate the stack's last-in, first-out principle. Let's also see another code example:
import java.util.Collection;
import java.util.HashSet;

public class DataHashSet<E> extends HashSet<E> {
    private int addCount = 0;

    public DataHashSet(Collection<? extends E> collection) {
        super(collection);
    }

    public DataHashSet(int initCapacity, float loadFactor) {
        super(initCapacity, loadFactor);
    }

    @Override
    public boolean add(E element) {
        addCount++;
        return super.add(element);
    }

    @Override
    public boolean addAll(Collection<? extends E> collection) {
        addCount += collection.size();
        return super.addAll(collection);
    }

    public int getAddCount() {
        return addCount;
    }
}
I just extend HashSet with the DataHashSet class in order to keep track of inserts. DataHashSet inherits from HashSet and is a subtype of it, so wherever a HashSet is expected, we can pass a DataHashSet instead. I also override some of the methods of the base class. Is this legitimate under the Liskov Substitution Principle? As I do not change the behavior of the base class and only add tracking to the insert actions, it seems perfectly legitimate. But I will argue that this is risky subtyping and buggy code. First, let's see what exactly the add method does: it increments the counter by one and calls the parent class method. There is a yo-yo problem with this code. Look at the addAll method. First, it adds the collection size to the counter, then it calls addAll in the parent. But what exactly does the parent addAll do? It loops over the collection and calls the add method once per element. Which add will be called: the one in the current class or the one in the parent class? It is the one in the current class. So each element is counted twice: once when you call addAll, and a second time when the parent class calls the add method in the child class. That is why we call it the yo-yo problem.
Here is another example that shows subtyping is risky. Imagine this scenario:
class A {
    void foo() {
        // ...
        this.bar();
        // ...
    }

    void bar() {
        // ...
    }
}

class B extends A {
    // overrides bar
    @Override
    void bar() {
        // ...
    }
}

class C {
    void bazz() {
        B b = new B();
        // which bar would be called?
        b.foo();
    }
}
When we call the bazz method, which bar will be called? The second one, of course: the bar in class B. So what is the problem? The foo method in class A knows nothing about the override of bar in class B. Your invariants may then be violated and encapsulation is broken, because foo may rely on the behavior of the bar defined in its own class, not on an overridden version. This problem is also called the fragile base-class problem.
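The dispatch described above can be observed directly. The following is a minimal runnable sketch, not part of the original listing: the log field and the FragileBaseDemo class are hypothetical additions that only record which bar actually runs.

```java
import java.util.ArrayList;
import java.util.List;

class FragileBaseDemo {
    public static void main(String[] args) {
        A a = new B();
        a.foo();
        System.out.println(a.log); // prints [B.bar]
    }
}

class A {
    // Hypothetical helper field: records which implementation of bar ran.
    final List<String> log = new ArrayList<>();

    void foo() {
        // foo is written with A's own bar in mind, but the call is virtual.
        this.bar();
    }

    void bar() {
        log.add("A.bar");
    }
}

class B extends A {
    @Override
    void bar() {
        log.add("B.bar");
    }
}
```

Even though foo is defined only in A, the override in B is what runs, so any assumption foo makes about bar can be silently invalidated by a subclass.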
A more crucial problem with subtyping by implementation inheritance is coupling: the undesirable reliance of one part of a program on another part (tight coupling). Global variables provide the classic example of why strong coupling causes trouble. If you change the type of a global variable, for example, all functions that use the variable (that is, all functions coupled to it) might be affected, so all of that code must be examined, modified, and retested. Moreover, all functions that use the variable are coupled to each other through the variable. That is, one function might incorrectly affect another function's behavior if the variable's value is changed at an awkward time. This problem is particularly hideous in multithreaded programs.
How to Safely Subclass?
The safest way to subclass is to avoid subclassing. If your class is not designed for subclassing, prevent it by making the constructors private or by declaring the class final. If we still want reuse in our code, we can create a wrapper class as an alternative to subclassing. Either way, we need modular reasoning about code reuse: the ability to reuse code without understanding every implementation detail. There are several approaches to this, and I will describe some of them here. One way is to avoid self-use of overridable functionality by confining the overridable functionality to a few protected methods and using the language mechanism (or the specification) to prevent overriding of the other methods. For the DataHashSet class, this would mean that addAll must not call add. We can also minimize the impact of overriding on other functions by not calling overridable methods from within the class itself. Let me clarify with the previous example:
class A {
    void foo() {
        // ...
        this.insideBar();
        // ...
    }

    // private, so subclasses cannot override it
    private void insideBar() {
        // ...
    }

    void bar() {
        this.insideBar();
    }
}

class B extends A {
    // overrides bar
    @Override
    void bar() {
        // ...
    }
}

class C {
    void bazz() {
        B b = new B();
        b.foo();
    }
}
As you can see in the code above, I just added the insideBar method to prevent unwanted changes caused by overriding, and the problem is solved. Most of the time, creating a wrapper class is a good approach to reduce the risk of subclassing. In other words, prefer composition or delegation over subtyping.
There are some places where we must avoid subtyping at all costs. If there is more than one plausible way to subclass, it is better to use delegation. The same holds when the parent class has methods that are useless to the child: that is a sign the class should not be extended (Liskov Substitution Principle). Likewise, if the reusing class cannot be used everywhere the reused class is used, it should not be a subclass of it.
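To check that the insideBar idea really isolates foo from overrides, here is a small runnable sketch. The class names Base and Derived, the log field, and the choice to declare the helper private are assumptions of this sketch; declaring the helper private (or final) is one way to let the compiler enforce that it is never overridden.

```java
import java.util.ArrayList;
import java.util.List;

class SelfUseDemo {
    public static void main(String[] args) {
        Derived d = new Derived();
        d.foo(); // unaffected by the override of bar
        d.bar(); // the overridden hook itself
        System.out.println(d.log); // prints [Base.insideBar, Derived.bar]
    }
}

class Base {
    // Hypothetical helper field: records which code actually ran.
    final List<String> log = new ArrayList<>();

    void foo() {
        // Calls the private helper, never the overridable bar.
        insideBar();
    }

    // Private methods cannot be overridden, so foo's behavior is fixed.
    private void insideBar() {
        log.add("Base.insideBar");
    }

    // Overridable entry point; by default it reuses the helper.
    void bar() {
        insideBar();
    }
}

class Derived extends Base {
    @Override
    void bar() {
        log.add("Derived.bar");
    }
}
```

The subclass can still customize bar, but nothing it does changes what foo executes.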
Delegation Over Subtyping
With subclassing, a class defines the shape of its instances and acts as a template. Every instance gets the behavior and the attribute definitions of the class, but not shared values: all instances of a class (and of its subclasses) use the definitions stored in the class, so any change to a definition in the class changes all of the instances.
An instance combines its class and all of its superclasses in one object, along a single linear chain (in Java, that is; languages like C++ allow multiple inheritance). The values, however, are stored in the instance, not in the class, and they are not shared. With subclassing, instances are independent: changing the state (values) of one instance cannot affect any other instance, and each instance has its own parent object.
Delegation means that an object uses another object to carry out a method invocation. In pure delegation there is no class to share attributes or behavior, which is why such objects are often called instances without classes. One instance is enough for each reuse: imagine an area calculator class that accepts different kinds of shapes and returns the exact area. You only need to create one area calculator object and let it invoke the different shape classes. With subclassing, you would have to create a separate object for each kind of shape, each with its own parent.
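The area calculator can be sketched in a few lines. This is only an illustration under assumed names: Shape, Rectangle, Circle, AreaCalculator, and totalArea are all hypothetical and do not come from the original text.

```java
import java.util.List;

class AreaDemo {
    public static void main(String[] args) {
        AreaCalculator calculator = new AreaCalculator();
        List<Shape> shapes = List.of(new Rectangle(2, 3), new Circle(1));
        System.out.println(calculator.totalArea(shapes));
    }
}

// Hypothetical contract that every shape fulfils.
interface Shape {
    double area();
}

final class Rectangle implements Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override
    public double area() {
        return width * height;
    }
}

final class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    public double area() {
        return Math.PI * radius * radius;
    }
}

// One calculator object delegates the actual computation to each shape.
final class AreaCalculator {
    double totalArea(List<Shape> shapes) {
        double total = 0;
        for (Shape shape : shapes) {
            total += shape.area();
        }
        return total;
    }
}
```

A single AreaCalculator serves every shape type, and new shapes can be added without touching it or forming an inheritance chain with it.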
If an object delegates a method or variable to a prototype, then any change to those attributes or their values affects both the object and the prototype. In this way, objects in a delegation hierarchy may depend on one another. With delegation, you need to instantiate several objects, which can be of different, unrelated types (the opposite of subclassing). You also need to compose the instances in the right way to satisfy the needs of the class.
Also, there is no parent class, so you cannot use the attributes of the invoked objects directly. With subclassing, the subclass can use the parent's attributes or methods without implementing anything, but with delegation you must define a method to access each attribute or method.
With delegation, a reusing class can reuse several classes at once. It just holds a reference to each of them, and all of them live in the same instance. With subclassing, the reusing class is a subclass of the reused classes, in a single top-down chain. Let's fix the problem in DataHashSet with the delegation approach:
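import java.util.Collection;
import java.util.Iterator;
import java.util.Set;

public class DataHashSet<E> implements Set<E> {
    private int addCount = 0;
    private final Set<E> set;

    public DataHashSet(Set<E> set) {
        this.set = set;
    }

    @Override
    public boolean add(E element) {
        addCount++;
        return this.set.add(element);
    }

    @Override
    public boolean addAll(Collection<? extends E> collection) {
        addCount += collection.size();
        return this.set.addAll(collection);
    }

    public int getAddCount() {
        return addCount;
    }

    // The remaining Set methods simply forward to the wrapped set.
    @Override public int size() { return set.size(); }
    @Override public boolean isEmpty() { return set.isEmpty(); }
    @Override public boolean contains(Object o) { return set.contains(o); }
    @Override public Iterator<E> iterator() { return set.iterator(); }
    @Override public Object[] toArray() { return set.toArray(); }
    @Override public <T> T[] toArray(T[] a) { return set.toArray(a); }
    @Override public boolean remove(Object o) { return set.remove(o); }
    @Override public boolean containsAll(Collection<?> c) { return set.containsAll(c); }
    @Override public boolean retainAll(Collection<?> c) { return set.retainAll(c); }
    @Override public boolean removeAll(Collection<?> c) { return set.removeAll(c); }
    @Override public void clear() { set.clear(); }
}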
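To see the difference between the two designs, the sketch below runs an inheritance-based counter and a delegation-based counter side by side. The names InheritedCountingSet and CountingSetWrapper are hypothetical, and the wrapper is deliberately trimmed down: it exposes only the methods the demo needs instead of the full Set interface, which is an assumption made to keep the example short.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CountingDemo {
    public static void main(String[] args) {
        List<String> items = List.of("a", "b", "c");

        InheritedCountingSet<String> inherited = new InheritedCountingSet<>();
        inherited.addAll(items);

        CountingSetWrapper<String> wrapped = new CountingSetWrapper<>(new HashSet<>());
        wrapped.addAll(items);

        System.out.println(inherited.getAddCount()); // prints 6: every element counted twice
        System.out.println(wrapped.getAddCount());   // prints 3
    }
}

// Inheritance: the addAll inherited by HashSet calls back into the overridden add.
class InheritedCountingSet<E> extends HashSet<E> {
    private int addCount = 0;

    @Override
    public boolean add(E element) {
        addCount++;
        return super.add(element);
    }

    @Override
    public boolean addAll(Collection<? extends E> collection) {
        addCount += collection.size();
        return super.addAll(collection);
    }

    int getAddCount() {
        return addCount;
    }
}

// Delegation: the wrapped set never sees the wrapper, so it cannot call back into it.
class CountingSetWrapper<E> {
    private final Set<E> set;
    private int addCount = 0;

    CountingSetWrapper(Set<E> set) {
        this.set = set;
    }

    boolean add(E element) {
        addCount++;
        return set.add(element);
    }

    boolean addAll(Collection<? extends E> collection) {
        addCount += collection.size();
        return set.addAll(collection);
    }

    int size() {
        return set.size();
    }

    int getAddCount() {
        return addCount;
    }
}
```

Both sets end up holding three elements; only the bookkeeping differs, because the wrapped HashSet calls its own add rather than the wrapper's.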
What About Skeletal Implementation?
A skeletal implementation provides the advantages of subtyping without a loss of flexibility. For each interface, you provide an abstract class that implements the interface and leaves the primitive methods unspecified. That is, it leaves those methods abstract, to be implemented by a subclass, and it defines the non-primitive methods in terms of them. Developers who want to implement the interface then commonly extend the skeletal implementation instead of starting from scratch. This is less flexible than using a wrapper class, that is, composition or delegation. To make it more flexible, you can use a wrapper class that delegates its calls to an object of an anonymous subclass of the skeletal implementation.
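The JDK itself ships skeletal implementations, for example java.util.AbstractSet for the Set interface. The sketch below is one possible way to build the counting set on top of it; the class name SkeletalCountingSet and the backing field are hypothetical. Only the primitive methods (size, iterator, add) are written by hand, while addAll, contains, and the rest are inherited. It relies on the documented behavior of AbstractCollection.addAll, which adds each element through add, so counting in add alone is enough.

```java
import java.util.AbstractSet;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class SkeletalDemo {
    public static void main(String[] args) {
        SkeletalCountingSet<String> set = new SkeletalCountingSet<>();
        set.addAll(List.of("a", "b", "c"));
        System.out.println(set.getAddCount());               // prints 3
        System.out.println(set.containsAll(List.of("a", "c"))); // prints true
    }
}

class SkeletalCountingSet<E> extends AbstractSet<E> {
    // The real storage; the skeletal class supplies everything derived from it.
    private final Set<E> backing = new HashSet<>();
    private int addCount = 0;

    // Primitive method 1: the number of elements.
    @Override
    public int size() {
        return backing.size();
    }

    // Primitive method 2: iteration, which contains, containsAll, etc. build on.
    @Override
    public Iterator<E> iterator() {
        return backing.iterator();
    }

    // Primitive method 3: single insertion; the inherited addAll calls this per element.
    @Override
    public boolean add(E element) {
        addCount++;
        return backing.add(element);
    }

    int getAddCount() {
        return addCount;
    }
}
```

The design choice here is that self-use is part of the documented contract of the skeletal class, so the subclass can depend on it instead of guessing.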