The Secret Life of Objects: Information Hiding
This article looks at object-oriented programming and the complexity surrounding objects through specific examples and types of information hiding.
This will probably be the hardest post I have ever written. It is not easy to explain the basis of object-oriented programming. I think that the first thing we need to do is to define what an object is.
Here is a definition I once gave:
The aim of object-oriented #programming is not modeling reality using abstract representations of its component, accidentally called "objects". #OOP aims to organize behaviors and data together in structures, minimizing any dependencies among them.
Messages Are the Core
In the beginning, there was procedural programming. Representatives of this paradigm include languages like COBOL, C, Pascal, and, more recently, Go. In procedural programming, the building block is the procedure, which is a function (not mathematically speaking) that takes some input arguments and may return some output values. During its evaluation, a procedure can also have side effects.
Data can have some primitive forms, like double, or it can be structured into records. A record is a set of correlated data, like a Rectangle, which contains two primitive members, height and length, of type double. In C notation, such a rectangle is a struct with two double fields.
Apart from inputs and outputs, there is no direct link between data (records) and behaviors (procedures). So, if we want to model all the operations available for a Rectangle, we have to create many procedures that take it as an input.
Every such procedure operates on the same type of structure: the Rectangle. Each one needs, as an input, an instance of the structure on which it executes. Moreover, every piece of code that owns an instance of the Rectangle structure can access its member values without control. There is no concept of restriction or authorization.
This makes procedure definitions verbose and maintenance very tricky. Tests become very hard to design and execute because of the lack of information hiding: everything can modify everything.
The primary goal of object-oriented programming is to bind behaviors (i.e., methods) to the data on which they operate (i.e., attributes). As Alan Kay once said: "[..] it is not even about classes. I’m sorry that I long ago coined the term “objects” for this topic because it gets many people to focus on the lesser idea. The big idea is “messaging.”"
The concept of classes allows us to regain the focus on behavior rather than on method inputs. You should not even need to know the internal representation of a class. You only need its interface. In object-oriented programming, the procedural example above becomes a class definition. I chose Scala for this example because of its lack of ceremony.
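A minimal sketch of what such a class might look like in Scala is shown here. The method names scale and area come from the text; their signatures, and the choice to make scale return a new instance instead of mutating the current one, are assumptions.

```scala
// Sketch only: signatures are assumptions, not the author's original code.
// The constructor parameters are not declared as val or var,
// so they are not accessible from outside the class.
class Rectangle(height: Double, length: Double) {

  // Returns a new rectangle whose sides are multiplied by the given factor.
  def scale(factor: Double): Rectangle =
    new Rectangle(height * factor, length * factor)

  // Computes the area from the hidden height and length.
  def area: Double = height * length
}
```

A client can call area and scale, but an expression such as new Rectangle(2.0, 3.0).height does not compile, because the data is hidden behind the behavior.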
The example given is very trivial. Starting from the elements height and length and the procedures scale and area, it was very straightforward to derive an elegant object-oriented solution. However, is it possible to formalize (and maybe to automate) the process we just followed to define the class Rectangle? Next, I will try to answer this question.
Information Hiding and Class Definition
We can begin with an unstructured set of procedures, scale and area, each of which takes a height and a length among its inputs.
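As a rough illustration, the starting point could look like the following. The enclosing object name Procedures, the parameter order, and the return types are assumptions made only to keep the sketch self-contained.

```scala
// Sketch only: an unstructured set of procedures over loose parameters.
object Procedures {

  // Scales both dimensions and returns the new (height, length) pair.
  def scale(height: Double, length: Double, factor: Double): (Double, Double) =
    (height * factor, length * factor)

  // Computes the area from the two loose parameters.
  def area(height: Double, length: Double): Double =
    height * length
}
```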
First of all, we notice that the height and length parameters are present in both procedures. We might create a type for each parameter, like Height and Length. However, we immediately understand that the two parameters are always used together in our use cases. There are no procedures that use only one of the two.
So, we decide to create a structure to bind them together. We call this structure Rectangle.
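One way to sketch this intermediate step, assuming the structure is a plain tuple of two doubles (the object name Structures is hypothetical):

```scala
// Sketch only: Rectangle as a bare structure with no behavior attached.
object Structures {

  // A (height, length) pair; any client can still read both components.
  type Rectangle = (Double, Double)

  // The procedures now take the whole structure instead of loose parameters.
  def scale(rectangle: Rectangle, factor: Double): Rectangle =
    (rectangle._1 * factor, rectangle._2 * factor)

  def area(rectangle: Rectangle): Double =
    rectangle._1 * rectangle._2
}
```

The two values now travel together, but nothing prevents other code from reading or rebuilding the pair on its own.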
We also understand that a simple structure does not fit our needs. The internals of a Rectangle should not be changed by anything other than the two procedures (forget for a moment that tuples are immutable in Scala). Here, we are only interested in the two procedures, so we restrict access to the rectangle's information to those two procedures alone.
How can we do that? We should bind the information of a rectangle to the behaviors associated with it. We need a class.
Well, taking into consideration the only use cases we have, we could stop here. The solution is already optimal. We hid the information of height and length behind our class; the behavior is the only thing clients can access from the outside. However, clients that want to use a rectangle can interact only with the interface of the class Rectangle.
What if we want to support shapes like squares and circles? Through the use of interfaces, which are types with pure behavior, object-oriented programming allows our clients to abstract away from the concrete implementation of a shape. The example above can then be expressed through a common interface that every shape implements.
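A possible sketch of this step follows. The trait name Shape, the Square and Circle implementations, the totalArea client, and the enclosing object Shapes are all assumptions introduced for illustration.

```scala
// Sketch only: grouped in an object so the example is self-contained.
object Shapes {

  // Pure behavior: clients depend only on this interface.
  trait Shape {
    def scale(factor: Double): Shape
    def area: Double
  }

  class Rectangle(height: Double, length: Double) extends Shape {
    def scale(factor: Double): Shape =
      new Rectangle(height * factor, length * factor)
    def area: Double = height * length
  }

  class Square(side: Double) extends Shape {
    def scale(factor: Double): Shape = new Square(side * factor)
    def area: Double = side * side
  }

  class Circle(radius: Double) extends Shape {
    def scale(factor: Double): Shape = new Circle(radius * factor)
    def area: Double = math.Pi * radius * radius
  }

  // A client written against Shape works with any implementation.
  def totalArea(shapes: Seq[Shape]): Double =
    shapes.map(_.area).sum
}
```

The client function never mentions a concrete class, so adding a new shape does not require changing it.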
As Wikipedia reminds us, "Information hiding is the principle of segregation of the design decisions in a computer program that is most likely to change, thus protecting other parts of the program from extensive modification if the design decision is changed. The protection involves providing a stable interface which protects the remainder of the program from the implementation (the details that are most likely to change)."
Information Hiding and Dependency Degree
As anyone who has followed me for some time already knows, I am a big fan of minimizing the degree of dependency between classes. I have developed a small theoretical framework that allows for calculating the dependency degree of an architecture. This framework is based on the number of dependencies a class has with other classes and on the scope of these dependencies with respect to the class life cycle.
I have already used my framework in other circumstances, like when I spoke about the Single-Responsibility Principle. This time, I will try to use it to sketch the process we just analyzed. The goal is to aggregate information and related behaviors inside the same class, hiding the former from the clients of the class. I will try to answer the question: why are height and length collapsed inside one single type (which is incidentally called Rectangle)?
Just as a recap, in the post Dependency I defined the degree of dependency between classes A and B in terms of three quantities. φSA|B is the quantity of code (i.e. SLOC) that is shared between types A and B. φStotB is the total quantity of code (i.e. SLOC) of the B class. Finally, εA→B is a factor between 0 and 1: the wider the scope between A and B, the greater the factor.
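Read together, the three terms suggest the following form for the degree of dependency. This is a reconstruction from the definitions above, assuming the degree is the shared fraction of code weighted by the scope factor; the Dependency post remains the authoritative source. The numbers in the worked line are purely hypothetical.

```latex
% Reconstructed form (assumption): shared fraction of B's code, weighted by scope.
\[
  \delta_{A \to B} \;=\; \frac{\varphi_{S_{A|B}}}{\varphi_{S_{tot_B}}} \, \varepsilon_{A \to B}
\]

% Hypothetical values: 20 shared SLOC out of 100 in B, scope factor 0.5.
\[
  \delta_{A \to B} \;=\; \frac{20}{100} \cdot 0.5 \;=\; 0.1
\]
```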
If Height and Length had each been defined as dedicated types, then the client C that needed to use a rectangle would always have to use both types. Moreover, the Rectangle type would still have been necessary to host the methods area and scale. In this configuration, Height and Length are said to be tightly coupled, because they are always used together.
The degree of dependency of class C would be very high, using the above definition. Also, the degree of dependency of class Rectangle would be high, due to references to Height and Length in the methods area and scale.
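The configuration just described can be sketched as follows. The wrapper shapes of Height and Length, the Client object standing in for C, and the enclosing object DedicatedTypes are assumptions made for illustration.

```scala
// Sketch only: the alternative design with one dedicated type per dimension.
object DedicatedTypes {

  // Hypothetical dedicated types for the two dimensions.
  final case class Height(value: Double)
  final case class Length(value: Double)

  // Rectangle is still needed to host scale and area,
  // and both methods reference Height and Length.
  class Rectangle(height: Height, length: Length) {
    def scale(factor: Double): Rectangle =
      new Rectangle(Height(height.value * factor), Length(length.value * factor))

    def area: Double = height.value * length.value
  }

  // The client (C in the text) now depends on Height, Length and Rectangle.
  object Client {
    def areaOf(h: Double, l: Double): Double =
      new Rectangle(Height(h), Length(l)).area
  }
}
```

Compared with a single Rectangle class that hides two doubles, the client here has to know three types instead of one to do the same work.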
It is likely that many class configurations can reduce the degree of dependency of the above example. However, the value δtotC is minimized by the solution we gave in the previous section.
In some ways, we started to trace a new way of designing architectures, reducing the art of design to finding an architecture that minimizes a mathematical function of the degree of dependency. Nice!
To sum this all up: using a toy example, we tried to sketch the informal process that should be used to define types and classes. We started from the shortcomings we find in the procedural programming paradigm, and we ended up trying to create a mathematical formulation of this process. All the ideas we used are tightly related to the basic concept of information hiding, which we confirmed to be one of the most essential concepts in object-oriented programming.
This post is just my opinion. It was not my intention to belittle procedural programming but to celebrate object-oriented programming.
Published at DZone with permission of Riccardo Cardin, DZone MVB. See the original article here.