Referential Transparency in Scala Pt. I - Pure functions

In this post we are going to learn one of the core concepts of Functional Programming (FP), Referential Transparency (RT), and how to carry it out in our Scala applications with the help of pure functions. We are going to see, through some examples, the different problems that could arise while writing our applications and how to handle them.

This is the first post of the series Writing Functional Applications in Scala. My idea is to keep writing posts about different topics that will help you write applications in a better, or even purely, functional style with Scala. Some basic knowledge of Scala will help you follow the examples below. However, you can still learn about FP and its concepts, as they can be applied in any programming language.

With everything said, are you ready? Then, let’s get started! 💪🏼

What is Referential Transparency?
Side-effects
Pure functions
Conclusions
References

What is Referential Transparency?

An expression is referentially transparent if we can replace it with its value without changing the application’s behaviour.

Good, we have the definition, but… it is always better to illustrate a new concept with some examples.

Imagine that we have the following function:

def divide(x: Int, y: Int): Int = x / y

divide is a simple function, right? But it will help us illustrate the previous concept.

Let’s have a look at the following application:

val a = divide(4, 2)
val b = divide(6, 3)
val c = divide(8, 4)

val result = a + b + c

println(result)

output:
6

We have a little application that uses the previous function to calculate a result. According to the definition above, if we can replace every invocation of the function with its value without changing the application’s behaviour, the function is referentially transparent.

Let’s check whether divide meets that rule:

val a = 2 // divide(4, 2)
val b = 2 // divide(6, 3)
val c = 2 // divide(8, 4)

val result = a + b + c

println(result)

output:
6

So far so good. It worked as expected! We have exactly the same behaviour, so divide looks referentially transparent, at least for these inputs.

But… are we totally sure? Let’s write one more example:

val a = divide(4, 2)
val b = divide(6, 0) // Dividing by 0
val c = divide(8, 4)

val result = a + b + c

println(result)

output:
Exception in thread "main" java.lang.ArithmeticException: / by zero

Uh-oh! We have a problem here! Our function no longer returns a value; it throws an exception! It makes sense that the function fails, because we are trying to divide by 0 and that is not supported for the Int type in Scala. However, raising an exception is not the best way to fail, as we will see later.

If we try to replace the function call with its value, we clearly see that there is no value to replace it with:

val a = 2
val b = ¯\_(ツ)_/¯
val c = 2

val result = a + b + c

println(result)

Even if we threw an ArithmeticException ourselves at that point, it would not be the same exception as the original one (it would carry a different stack trace, for instance). So, in the end, our application does not have the same behaviour.

This shows that divide is, in fact, not referentially transparent. Moreover, it throws an exception, which means that it has side-effects.
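One way to see the problem in running code is to catch the exception and check whether binding the call to a name behaves like writing the call in place. The following is only a minimal sketch: the object name and the helpers bound and inlined are made up for this illustration, and -1 is an arbitrary fallback value.

```scala
import scala.util.Try

object ExceptionBreaksSubstitution {
  def divide(x: Int, y: Int): Int = x / y

  // The call is bound to a name first, so it is evaluated
  // outside the try block and the exception escapes.
  def bound(): Int = {
    val b = divide(6, 0)
    try b + 1
    catch { case _: ArithmeticException => -1 }
  }

  // The same program with `b` replaced by the expression it names.
  // Now the call is evaluated inside the try block and the exception is caught.
  def inlined(): Int =
    try divide(6, 0) + 1
    catch { case _: ArithmeticException => -1 }

  def main(args: Array[String]): Unit = {
    println(Try(bound()).isFailure) // true: bound() throws
    println(inlined())              // -1
  }
}
```

With a referentially transparent expression, bound and inlined would be the same program. Here they are not: one throws and the other returns -1, so the meaning of divide(6, 0) depends on where it is written.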

Side-effects

We say that a function has a side-effect if, besides calculating its output, it modifies the application’s external state. In this case, the side-effect is raising an exception.

Let’s try to fix it. As we now know that we will have problems if we divide by 0, we are going to check that case:

def divide(x: Int, y: Int): Option[Int] =
  if (y != 0) Some(x / y)
  else None

Now, we have to rewrite our application:

val x = divide(4, 2)
val y = divide(6, 0)
val z = divide(8, 4)

val result = for {
  a <- x
  b <- y
  c <- z
} yield a + b + c

println(result)

output:
None

Good! Now we do not have any exception. Let’s see if we can replace the function with its value now:

val x = Some(2) // divide(4, 2)
val y = None // divide(6, 0)
val z = Some(2) // divide(8, 4)

val result = for {
  a <- x
  b <- y
  c <- z
} yield a + b + c

println(result)

output:
None

Yeah! It worked! Now divide is referentially transparent, and the really good point here is that we are working with values instead of exceptions. We will see that FP is all about values and how our applications interact with them, describing what we want to do.
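It may help to see why a single None makes the whole result None. A for-comprehension over Option is syntactic sugar for nested flatMap and map calls, and both are no-ops on None. The sketch below writes the same program both ways; the object name and the names sugared and desugared are only used for this illustration.

```scala
object ForComprehensionDesugared {
  def divide(x: Int, y: Int): Option[Int] =
    if (y != 0) Some(x / y)
    else None

  val x: Option[Int] = divide(4, 2)
  val y: Option[Int] = divide(6, 0)
  val z: Option[Int] = divide(8, 4)

  // The version used in the post.
  val sugared: Option[Int] = for {
    a <- x
    b <- y
    c <- z
  } yield a + b + c

  // Roughly what the compiler turns it into.
  // As soon as one Option is None, the remaining functions are never called.
  val desugared: Option[Int] =
    x.flatMap(a => y.flatMap(b => z.map(c => a + b + c)))

  def main(args: Array[String]): Unit = {
    println(sugared)   // None
    println(desugared) // None
  }
}
```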

In this example, we learnt how to avoid working with exceptions in our application, as they do not allow us to write referentially transparent functions. We used Option to handle the error case and returned None, as we do not have a valid value. Apart from helping us achieve what we wanted, this can also perform better, as throwing an exception on the JVM is comparatively expensive (by default, a stack trace is captured when the exception is created).

Let’s now change the divide function a little, because we want to add some debug traces:
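Because the failure is now an ordinary value, the caller decides what to do with it. Here is a small sketch of two common ways to consume the Option; the helpers describe and orZero, their messages and the fallback 0 are assumptions made for this example, not something the application above requires.

```scala
object ConsumingOption {
  def divide(x: Int, y: Int): Option[Int] =
    if (y != 0) Some(x / y)
    else None

  // Pattern matching handles both cases explicitly.
  // Option is sealed, so the compiler warns if a case is missing.
  def describe(result: Option[Int]): String = result match {
    case Some(value) => s"Result: $value"
    case None        => "Cannot divide by zero"
  }

  // getOrElse supplies a fallback value (0 is an arbitrary choice here).
  def orZero(result: Option[Int]): Int = result.getOrElse(0)

  def main(args: Array[String]): Unit = {
    println(describe(divide(6, 3))) // Result: 2
    println(describe(divide(6, 0))) // Cannot divide by zero
    println(orZero(divide(6, 0)))   // 0
  }
}
```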

def divide(x: Int, y: Int): Option[Int] =
  if (y != 0) {
    println(s"$x/$y")
    Some(x / y)
  } else {
    println("(╯°□°)╯︵ ┻━┻")
    None
  }

And now, we execute the first application again:

val x = divide(4, 2)
val y = divide(6, 3)
val z = divide(8, 4)

val result = for {
  a <- x
  b <- y
  c <- z
} yield a + b + c

result.foreach(println)

output:
4/2
6/3
8/4
6

Our output has changed. Apart from printing the aggregated result, we are now also printing debug information.

Just as before, we are going to use the values instead. What will happen if we do it?

val x = Some(2) // divide(4, 2)
val y = Some(2) // divide(6, 3)
val z = Some(2) // divide(8, 4)

val result = for {
  a <- x
  b <- y
  c <- z
} yield a + b + c

result.foreach(println)

output:
6

Even though we got the same final value for our result, our application’s behaviour is not the same, because the debug traces were not printed. Once again, our function has side-effects, as it prints to the console, which is an I/O operation.
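The difference can also be checked mechanically by capturing what each version prints and comparing the two. This is just a sketch: the captured helper and the object name are made up for the example, and it relies on scala.Console.withOut to redirect println to an in-memory buffer.

```scala
import java.io.ByteArrayOutputStream

object PrintlnBreaksSubstitution {
  def divide(x: Int, y: Int): Option[Int] =
    if (y != 0) {
      println(s"$x/$y")
      Some(x / y)
    } else {
      println("(╯°□°)╯︵ ┻━┻")
      None
    }

  // Runs a block and returns everything it printed to the console.
  def captured(block: => Unit): String = {
    val out = new ByteArrayOutputStream()
    Console.withOut(out)(block)
    out.toString
  }

  // The application calling divide.
  val withCalls: String = captured {
    val x = divide(4, 2)
    val y = divide(6, 3)
    val z = divide(8, 4)
    val result = for { a <- x; b <- y; c <- z } yield a + b + c
    result.foreach(println)
  }

  // The same application with every call replaced by its value.
  val withValues: String = captured {
    val x = Some(2)
    val y = Some(2)
    val z = Some(2)
    val result = for { a <- x; b <- y; c <- z } yield a + b + c
    result.foreach(println)
  }

  def main(args: Array[String]): Unit =
    println(withCalls == withValues) // false: the debug traces are missing
}
```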

Well, that totally makes sense, but we forgot something really important: we should avoid side-effects. In fact, we should write pure functions.

Pure functions

We say that a function is pure if it meets two conditions on every execution. First, given the same input, it always returns the same result, regardless of external factors such as external state or I/O operations. Second, it has no side-effects, that is, it does not change the application’s external state.

With this definition, we can see that the latest divide function is not pure because it performs an I/O operation, which is a side-effect. Therefore, it is not referentially transparent, and we were not able to substitute it while keeping the application’s behaviour the same. It is the same scenario that we had with the exception.

This is something really hard to achieve because, when we write applications, we want them to have side-effects: we want to modify external systems, write logs, access databases or have metrics in place.

But the real problem does not lie in having side-effects; it lies in how we convert those side-effects into values so that our functions can be pure.

If you remember, we already did that when we converted the exception into a value, None. But, in this case, we need to convert the println call, the I/O operation, into a value.
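To make the definition concrete, here is a small contrast between a pure function and an impure one. It is only an illustrative sketch: add, addToTotal and the mutable total are hypothetical names that do not appear in the application above.

```scala
object PureVsImpure {
  // Pure: the result depends only on the arguments and nothing else changes.
  def add(x: Int, y: Int): Int = x + y

  // Mutable state living outside of the function (hypothetical).
  private var total: Int = 0

  // Impure: the result depends on `total`, and every call modifies it.
  def addToTotal(x: Int): Int = {
    total += x
    total
  }

  def main(args: Array[String]): Unit = {
    println(add(1, 2) == add(1, 2))         // true, no matter how often we call it
    println(addToTotal(1) == addToTotal(1)) // false: same input, different results
  }
}
```

Replacing add(1, 2) with 3 anywhere in a program is safe. Replacing addToTotal(1) with the value it returned the first time is not, because later calls return something else and the update to total would be lost.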

So… how are we going to do that? 🤔

Conclusions

We have seen how Referential Transparency is tightly related to pure functions and we have checked some examples to clarify them.

In those examples, we faced some problems that were stopping us from having purely functional code: side-effects. We tackled the exception scenario, but one is still missing: I/O operations.

We will cover those in the next posts.

References

Published 7 Jun 2020