Thomas – The Ogre of Athens (1956) Image from clickatlife.gr
Introduction
An object’s state describes its current condition. If an object were a human, this could be its emotional state (e.g. happy, sad, angry). To take this a bit further: the height value in an object’s field is not a state; rather, the height’s value helps us define its height category (e.g. tall, short, average), and that category is the actual state. And it is a state based on the context we, as a society, have set when defining whether someone is considered tall or not.
An OO software system is a graph of objects. Objects exchange messages, and based on the message content an object changes its state or not. Objects do not directly read or write each other’s state. The state itself must never leak out of one object into another. In short, objects do not share any references to state; instead they report to each other only with copies of stateful data. In this fashion, each object is the sole manager of its own state.
But this is only half of the story. If we are too strict with state management, we will end up not with a graph of objects, but with a strict flow of composition (object B has no meaning or purpose in the system without A) or aggregation (B exists independently, conceptually, from A).
The state will be concentrated in a root object, itself composed of a number of stateful objects, which in turn may be composed of stateful objects. The objects of this hierarchy may pass messages to their immediate children, but not to their ancestors, siblings, or further descendants.
How fragile is the object’s state?
I am sure that all of us have worked with source code whose hierarchical pattern was loose or nonexistent. We do indeed need to be flexible in our everyday work, and the source code is part of it, but we might end up with objects leaking state.
Still, a clean hierarchy is needed when the specifications ask for it. While we organize our objects into a specific hierarchy, the state management is forced to follow the same pattern. If our business logic is trivial, we will probably have no trouble at all.
How do we handle use cases that involve multiple objects that are not directly related? One solution could be a common ancestor; other possible solutions are the Mediator pattern or the actor model. The larger the software, the more complexity there is, so we have to be really careful with the management of object state. Keeping hierarchies as shallow as possible allows us to manage state successfully [2].
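A minimal sketch of the Mediator idea in Python (all names here are hypothetical, not from a specific library): unrelated objects never hold references to one another; they only publish copies of data through the mediator.

```python
# Hypothetical Mediator sketch: objects communicate only through the
# mediator, and only copies of stateful data travel between them.

class Mediator:
    def __init__(self):
        self._handlers = {}

    def register(self, topic, handler):
        self._handlers.setdefault(topic, []).append(handler)

    def send(self, topic, payload):
        # Deliver a *copy* of the payload so no state reference leaks.
        for handler in self._handlers.get(topic, []):
            handler(dict(payload))

class StockDisplay:
    """Knows nothing about who produces prices; only the mediator."""
    def __init__(self, mediator):
        self.last_price = None
        mediator.register("price_changed", self.on_price)

    def on_price(self, payload):
        self.last_price = payload["price"]

mediator = Mediator()
display = StockDisplay(mediator)
mediator.send("price_changed", {"symbol": "ACME", "price": 42.0})
print(display.last_price)  # 42.0
```

The sender and receiver are coupled only to the mediator and the message shape, which keeps the object graph shallow, exactly the goal described above.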
Immutable
Object itself has no fields or properties that can be changed. This object will remain totally unchanged through its life cycle. For reference types the field must be read only and the type must also meet the immutable guidelines. Types that meet the immutable guidelines are inherently thread safe. They are not subject to race conditions because they cannot be changed and thus viewed differently between threads [1].
Jared Parsons
Shallow Immutable
The direct contents of this object must be immutable in the sense that they cannot be changed to point to another object. However, there is no guarantee that all of the fields are themselves immutable. All of the fields in the object must be read only. For primitive values this is enough to guarantee they meet the Immutable guidelines and hence Shallow Immutable. For reference types this will ensure they keep pointing to the same object, thus doing all that is needed to meet the Shallow Immutable guarantee. Types with this guarantee are thread safe to a degree: they are thread safe as long as you are accessing the fields which meet the Immutable guarantee. You can also access the references, which are read only, as long as you do not call any instance methods. For instance, a null check is thread safe [1].
Jared Parsons
Shallow Safe Immutable
Slightly stronger guarantee than Shallow Immutable. For all fields that are read only but not immutable, the type must be thread safe. Types which meet this guarantee are thread safe also to a degree. They are as thread safe as the fields which are guaranteed to be thread safe [1].
Jared Parsons
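The guarantees above can be sketched in Python (the article’s context is C#, so this is only an analogue; frozen dataclasses stand in for read-only fields, and all type names are hypothetical):

```python
# Sketch of Jared Parsons's immutability guarantees, approximated in Python.
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Point:            # Immutable: all fields are read-only primitives.
    x: int
    y: int

@dataclass(frozen=True)
class Route:            # Shallow Immutable: the field cannot be re-pointed,
    waypoints: list     # but the list it references can still be mutated.

@dataclass(frozen=True)
class SafeRoute:        # Closer to Shallow Safe Immutable: the referenced
    waypoints: tuple    # container is itself immutable, hence thread safe.

p = Point(1, 2)
mutated = True
try:
    p.x = 10            # rejected: the object itself never changes
except FrozenInstanceError:
    mutated = False

r = Route([Point(0, 0)])
r.waypoints.append(Point(1, 1))   # allowed: the guarantee is only shallow
```

The `Route` example is exactly the trap Shallow Immutable warns about: the field is read only, yet the object it points at is not.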
Benefits and a drawback of immutability
No matter whether we decide to use immutability extensively, or enforce it, or use it partially, the goal must always be the minimization of state usage. Immutability is not a binary decision; it depends on the requirements at hand. There is no specific threshold for deciding on strict immutability or not. The benefits of immutable objects can be summarized in the following (probably incomplete) list:
* Operations on immutable objects always return new objects rather than mutating the receiver.
* Immutable objects combine well with caching: the cached object is the same as the one in the execution flow, and it never changes.
* If an operation on an immutable object throws an exception, it never results in the object being left in an indeterminate state (failure atomicity, a term coined by Joshua Bloch).
However, if a modified version of an immutable object is needed, we have to suffer the penalty of creating a new object. This must be handled with care in terms of Garbage Collection (GC).
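A small Python illustration of that penalty (the `Account` type is hypothetical): every “modification” allocates a new object, and the old one becomes collectible once unreferenced.

```python
# A modified version of an immutable object is a brand-new allocation.
# dataclasses.replace copies the object with one field changed; the
# original is untouched and is garbage once nothing references it.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Account:
    owner: str
    balance: int

a = Account("alice", 100)
b = replace(a, balance=150)   # new object; `a` is unchanged

print(a.balance, b.balance)   # 100 150
```

In a hot loop, this allocation churn is what puts pressure on the garbage collector, hence the “handle with care” note above.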
Examples of immutability
Instead of writing numerous examples, I refer you to Jon Skeet‘s excellent presentation about immutability. He explains in detail, with source code examples, more or less everything mentioned in this article.
And a bonus video regarding the fundamentals and how data structures affect the way we develop. They affect immutability as part of the big picture.
Conclusions
Immutability is like a big ship: skills are needed to pilot it, and you have to be careful. A good, solid knowledge of OOP is necessary. Try to keep the object graphs as small as possible, and minimize the dependencies between objects. We will discuss the degree of dependency in a future article.
The next episode
In the next episode, we will return to the object’s state and its (im)mutability, and its relationship with Inversion of Control and especially with Dependency Injection.
Functional programming is how programming should be. We want behaviours (functionalities), that receive an input and produce an output. Simple as that. Of course we might need to process again and again the data in hand, but this is also part of the expected behaviour: one’s function output is the other one’s input.
The ultimate goal is to deliver a software product built with reliable code, and the best way to do that is simplicity. Therefore a programmer’s main responsibility is to reduce code complexity. The big picture is that OOP does not deliver as expected, neither in code quality nor in deadlines. It looks good in diagrams, but once the complexity starts increasing, things sooner or later get out of hand. Especially when state is mutable and shared, chaos is on the loose. Even full test coverage is worth nothing if the source code is complex and unmaintainable.
In the seventies, the idea of “real OOP” was hugely powerful, but what was implemented was far from a complete set of ideas, especially with regard to scaling, networking, etc. How dynamic objects intertwined with ontologies and inference was explored by Goldstein and Bobrow at Xerox PARC. Their four papers on PIE and their implementation were the best extensions ever done to Smalltalk, and two of the ideas transcended the Smalltalk structure and deserved to be the start of a new language, and perhaps to have a new term coined for them.
Alan Kay’s original idea about OOP
The term “Object Oriented Programming” was coined by Alan Kay around 1967. Based on his answers in 2003, in an email exchange with Stefan Ram (a German computer science lecturer in Berlin at the time), his ideas on OOP were completely different from what we have today in our hands as OOP languages.
At Utah sometime after Nov 66 when, influenced by Sketchpad, Simula, the design for the ARPAnet, the Burroughs B5000, and my background in Biology and Mathematics, I thought of an architecture for programming. It was probably in 1967 when someone asked me what I was doing, and I said: “It’s object-oriented programming”.
– I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages (so messaging came at the very beginning – it took a while to see how to do messaging in a programming language efficiently enough to be useful).
– I wanted to get rid of data. I realized that the cell/whole-computer metaphor would get rid of data, and that “<-” would be just another message token (it took me quite a while to think this out because I really thought of all these symbols as names for functions and procedures).
– My math background made me realize that each object could have several algebras associated with it, and there could be families of these, and that these would be very very useful.
Alan Kay answering Stefan Ram in 2003
So far, based on Alan Kay’s answers, he focuses on cells (objects) exchanging messages to each other. His true goal was messaging.
The term “polymorphism” was imposed much later (I think by Peter Wegner) and it isn’t quite valid, since it really comes from the nomenclature of functions, and I wanted quite a bit more than functions. I made up a term “genericity” for dealing with generic behaviors in a quasi-algebraic form. […]
OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I’m not aware of them.
Alan Kay answering Stefan Ram in 2003
Inheritance and polymorphism are not even mentioned! In the end, according to Alan Kay, the three pillars of OOP are:
Message passing
Encapsulation
Dynamic binding
Combining message passing and encapsulation we try to achieve the following:
Stop sharing mutable state among objects by encapsulating it and allowing only local state changes. State changes happen at a local, cellular level rather than being exposed to shared access.
A messaging API is the only way the objects communicate; thus the objects are decoupled. The message sender is loosely coupled, or not coupled at all, to the message receiver.
Resilience and adaptability to changes at runtime via late binding.
[…] the whole point of OOP is not to have to worry about what is inside an object. Objects made on different machines and with different languages should be able to talk to each other […]
Alan Kay – The Early History Of Smalltalk
This sentence is actually talking about distributed and concurrent systems. Objects hide their state from each other and just communicate (“talk to each other”) by exchanging messages. In simple words, objects should be able to broadcast that they did things (actually, that they changed their state), and the other objects can ignore them or respond. This concept is reminiscent of agent-based modelling, or even actors. The key point that can improve the isolation among objects is that the receiver is free to ignore any message it doesn’t understand or care about.
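A toy Python sketch of that idea (all names hypothetical): state stays local to the object, and the receiver simply ignores messages it doesn’t understand.

```python
# Kay-style messaging sketch: the receiver decides what to do with a
# message and silently ignores the ones it does not understand.

class Cell:
    def __init__(self):
        self._count = 0    # state is local; no other object reads it

    def receive(self, message, **args):
        handler = getattr(self, "_on_" + message, None)
        if handler is None:
            return None     # unknown message: ignored, no crash
        return handler(**args)

    def _on_increment(self, by=1):
        self._count += by
        return self._count

cell = Cell()
cell.receive("increment")
cell.receive("increment", by=4)
cell.receive("self_destruct")      # not understood, simply ignored
print(cell.receive("increment"))   # 6
```

This tolerance for unknown messages is what makes the objects loosely coupled: a sender never needs to know the receiver’s full interface.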
Finally, let’s remember one more of Alan Kay’s quotes
I made up the term object-oriented, and I can tell you I did not have C++ in mind.
Alan Kay
It was in the eighties that “object-oriented languages” started to appear. C++ grew out of a set of ideas Bjarne Stroustrup started around 1979. C++ was designed to provide Simula’s facilities for program organization together with C’s efficiency and flexibility for systems programming. His approach was via “abstract data types”, and this is the way “classes” in C++ are generally used. The first C++ implementation, Cfront, translated C++ into C. “Classes” were program-code structuring conventions, but didn’t show up as objects during run time.
Hence the quote, as Alan Kay states on Quora, is not so much about C++ per se, but about the term that had been used to label a particular approach to programming language and systems design.
OOP and human cognition
In 1995 a paper was published by Bill Curtis under the name “Objects of Our Desire: Empirical Research on Object-Oriented Development”. Among other things, it contains the following sentence:
In careful experiments, Gentner (1981; Gentner & France, 1988) showed that, when people are asked to repair a simple sentence with an anomalous subject-verb combination, they almost always change the verb and leave the noun as it is, independent of their relative positions. This suggests that people take the noun (i.e. the object) as the basic reference point. Models based on objects may be superior to models based on other primitives, such as behaviours.
Objects of Our Desire: Empirical Research on Object-Oriented Development, Bill Curtis
So a paper published in the nineties cites experiments run in the eighties to support the OOP concept. Well, in the nineties that might have made sense, given the state of the software industry at the time, but not today. Today’s software is moving towards serverless applications, which are functions as a service, rather than complicated objects communicating with each other.
Line-of-business software in our times is so complex that OOP + TDD, OOP + DDD and OOP + BDD are concepts programmers still struggle with. What is the right number of objects? How deep should the granularity of objects be? How mutable should the objects be? What is the right architecture to follow? Although there are tons of books and articles about these issues, software projects fail due to complexity.
Additionally, Bill Curtis’s paper includes the following:
Miller (1991) described how nouns and verbs differ in their cognitive organizational form. Nouns – and hence the concepts associated with them – tend to be organized into hierarchically structured taxonomies, with class inclusion and part-whole relations as the most common linkages. These are also, of course, the most common relations in OO representations.
In human cognition, these hierarchies tend to be fairly deep for nouns – often six to seven layers. These hierarchies support a variety of important cognitive behaviours, including the inheritance of properties from superordinate classes. In contrast, verbs tend to be organized in very flat and bushy structures. This again suggests a central place for objects, in that building inheritance hierarchies will mirror the way humans represent natural categories only if the basic building blocks are objects rather than processes or behaviours.
Objects of Our Desire: Empirical Research on Object-Oriented Development, Bill Curtis
So, through linguistic principles, the paper supports OOP: objects (nouns), methods (verbs), and the hierarchy among them. But one thing is missing here: what about those programmers whose native tongue is not English? How do their brains work? Can they adapt easily to the OOP logic or not? Even today I sometimes see variables named in languages other than English.
You will not find a single medical article claiming that the human brain thinks, organizes and structures based on objects. We carry a “todo list”, not an “item hierarchy list”. Human brains can only hold about five items at a time in working memory. It is much easier to explain a piece of code based on what it does, rather than based on which variables change around the source code. Each natural language has a set of rules to constrain you so that you speak and write correctly; it is called grammar! In OOP programming, on the other hand, you have so many options for solving the same problem that in the end you can just throw any “grammar” out of the window.
Additionally, OOP code is hard to reason about deterministically. You can verify that by installing a cyclomatic-complexity extension in your IDE and running it. Dependencies, null checking, type checking and conditional statements, all combined, produce more execution paths than expected. Let’s not forget mock objects in unit testing: you have to predefine their behaviour. So even if you have a grammar, a structure for your objects, there is no guarantee that the implementation of the functionality is going to follow that grammar.
Finally, we have dependency hell. And it’s not only about NuGet packages or Maven dependencies; it’s also the source code’s internal hierarchy: inheritance, methods, constructor parameters, the Law of Demeter, and so on. So how nouns, verbs and object hierarchies are supposed to translate into simple code without extra complexity is still a mystery.
Why Functional Programming (FP)?
Functional programming is a programming paradigm: a different way of thinking about programs than the mainstream, imperative paradigm you’re probably used to. FP is based on the lambda calculus. Functions tend to provide a level of code modularity and reusability; FP also manages nullability much better, and gives us a better way of handling errors.
FP provides the following:
Power. This simply means that you can get more done with less code. FP raises the level of abstraction, allowing you to write high-level code while freeing you from low-level technicalities that add complexity but no value.
Safety. This is especially true when dealing with concurrency. A program written in the imperative style may work well in a single-threaded implementation but cause all sorts of bugs when concurrency comes in. Functional code offers much better guarantees in concurrent scenarios because of immutability, so it’s only natural that we’re seeing a surge of interest in FP in the era of multi-core processors.
Clarity. We spend more time maintaining and consuming existing code than writing new code, so it’s important that our code be clear and intention-revealing.
So how functional a language is C#? Functions are first-class values in C#. C# has had support for functions as first-class values from the earliest version of the language, through the Delegate type, and the subsequent introduction of lambda expressions made the syntactic support even better. There are some quirks and limitations, but we will discuss them in time.
Today we have LINQ. Language-Integrated Query (LINQ) is the name for a set of technologies based on the integration of query capabilities directly into the C# language. Traditionally, queries against data are expressed as simple strings without type checking at compile time or IntelliSense support. With LINQ, a query is a first-class language construct, just like classes, methods, events. You write queries against strongly typed collections of objects by using language keywords and familiar operators. The LINQ family of technologies provides a consistent query experience for objects (LINQ to Objects), relational databases (LINQ to SQL), and XML (LINQ to XML).
Query expressions are written in a declarative query syntax. By using query syntax, you can perform filtering, ordering, and grouping operations on data sources with a minimum of code. You use the same basic query expression patterns to query and transform data in SQL databases, ADO.NET DataSets, XML documents and streams, and .NET collections.
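For readers without C# at hand, a rough Python analogue of such a declarative filter/order/group pipeline (not LINQ itself, just the same shape, with made-up sample data):

```python
# LINQ-flavoured pipeline in Python: filter, order and group an
# in-memory collection declaratively.
from itertools import groupby

people = [("Ana", 34), ("Bob", 25), ("Cleo", 34), ("Dan", 25)]

# roughly: from p in people where p.age > 24 orderby p.age select p
adults = sorted((p for p in people if p[1] > 24), key=lambda p: p[1])

# roughly: group p by p.age  (groupby needs input sorted by the key)
by_age = {age: [name for name, _ in grp]
          for age, grp in groupby(adults, key=lambda p: p[1])}

print(by_age)  # {25: ['Bob', 'Dan'], 34: ['Ana', 'Cleo']}
```

The point is the same as in LINQ: you say *what* data you want, not *how* to loop over it, and the pipeline composes out of small, pure steps.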
The disadvantage of doing FP in C# is that everything is mutable by default, and the programmer has to put in a substantial amount of effort to achieve immutability. Fields and variables must explicitly be marked read-only to prevent mutation. (Compare this to F#, where variables are immutable by default and must explicitly be marked mutable to allow mutation.) Finally, the framework’s collections are mutable, but a solid library of immutable collections is available.
To highlight the difference between OOP and FP, here is an example: you run a company, and you have just decided to give all your employees a $10,000.00 raise.
OOP (imperative way)
1. Create an Employee class which initializes with name and salary and has a change-salary instance method
2. Create instances of employees
3. Use the each method to change the salary attribute of each employee by +10,000
FP
1. Create an employees array, which is an array of arrays with name and corresponding salary
2. Create a change_salary function which returns a copy of a single employee with the salary field updated
3. Create a change_salaries function which maps through the employee array and delegates the calculation of the new salary to change_salary
The FP approach uses pure functions and adheres to immutability by using map. With OOP, we cannot easily identify whether the object has had the function called on it unless we start from the beginning and track whether this has happened, whereas in FP the result is a new object, which makes it considerably easier to know what changes have been made.
FP leans heavily on methods that do one small part of a larger job, delegating the details to other methods. This combining of small methods into a larger task is composition. In our example, change_salaries has a single job: call change_salary for each employee in the employees array and return those values as a new array. change_salary also has one job: return a copy of a single employee with the salary field updated. change_salaries delegates the calculation of the new salary to change_salary, allowing change_salaries to focus entirely on handling the set of employees and change_salary to focus on updating a single employee.
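The FP recipe above can be sketched in Python (the function names follow the description; the data shapes are illustrative):

```python
# FP version of the raise: pure functions, no employee is ever mutated.
def change_salary(employee, amount):
    """Return a copy of one employee with the salary field updated."""
    name, salary = employee
    return (name, salary + amount)          # a new tuple, not an edit

def change_salaries(employees, amount):
    """Map over the employees, delegating the maths to change_salary."""
    return [change_salary(e, amount) for e in employees]

employees = [("Alice", 50_000), ("Bob", 60_000)]
raised = change_salaries(employees, 10_000)

print(raised)      # [('Alice', 60000), ('Bob', 70000)]
print(employees)   # unchanged: [('Alice', 50000), ('Bob', 60000)]
```

Each function has exactly one job, and composing them gives the larger task; the original `employees` collection survives untouched, which is the whole argument for immutability here.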
Conclusions
I believe all of you have understood by now that the main keywords are code simplicity, state immutability and messaging. From mutable state distributed around the source code (objects), to code organized by expected behaviours (functions).
FP is a programming paradigm, just as OOP is. OOP is alive and will be for years to come. But can we still rely on it? After 10 years as a software developer, I believe not. Maybe I am wrong!
But instead of asking for a better OOP language, I am trying to move smoothly to FP. Unfortunately, I can’t completely move away from C# due to business restrictions, but I do my best to find a better alternative.
The next episode
In the next episode, the topic is “OOP today” and an analysis of object state.