In Python, a toaster can change the color of your hair, and that’s a big deal
Finally, truly understand global variables and side effects
If you want to write clean code that is easy for others to use, you must have a good understanding of what side effects are and how to deal with them.
What is a function?
If you’ve been using Python, or pretty much any programming language born after the 1950s, you’ve probably already heard of functions. In fact, you’ve probably already used many of them.
The concept of a function is quite simple once you understand it. A function is just a black box: you put some data in, and it gives you some result(s) back. But that definition assumes something that is absolutely not obvious.
If functions were just black boxes, they could only have an influence on what you put inside of them.
For example, imagine you are about to eat breakfast and are craving French toast. What would you do?
Personally, I would put some bread in the toaster, and a few minutes later, it would give me my toast, ready to be eaten.
If we translate this process into Python, we get a piece of code like this one:
french_toast = toaster(bread)
In that very situation, the only thing that the toaster can change is the “state” of your bread.
I bet you wouldn’t understand if, by turning on the toaster, your hair color changed to orange (assuming your hair is not already orange).
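To make the black-box idea concrete, here is a minimal sketch of a toaster that only touches what it is given. The string values and the hair_color variable are assumptions chosen for illustration.

```python
hair_color = "brown"  # state that lives outside the function

def toaster(bread):
    # Build and return a new value; nothing outside the function is touched
    return f"toasted {bread}"

french_toast = toaster("bread")
print(french_toast)  # toasted bread
print(hair_color)    # brown
```

Calling this toaster any number of times leaves hair_color exactly as it was, which is what we intuitively expect from a black box.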
But Python is not real life
What is disturbing is that in Python (and in many other programming languages), things are not that simple.
In Python, the following “code” would be absolutely valid:
def toaster(bread):
    ...
    make_your_hair_blue()
    ...
    return toast
And so, calling the toaster function could produce unexpected results.
But how is that?
That might not be obvious, but this make_your_hair_blue function gives us a lot of interesting stuff to discuss.
Imagine if you had a “hair coloring” machine. In real life, to use it, you would have to put your hair inside of it or next to it at least.
But what we’re seeing here is that make_your_hair_blue doesn’t need that. We didn’t give the function any input, so technically the machine shouldn’t have any access to the hair color. And yet the color has changed.
In other words, our black box (the toaster) can change something outside itself.
Let’s make our code a little more concrete to understand more easily what’s going on.
hair_color = "brown"
bread = "Not toasted"

def make_your_hair_blue():
    global hair_color
    hair_color = "blue"

def toaster(bread) -> str:
    print("Toasting the bread")
    make_your_hair_blue()  # side effect: changes hair_color
    return "Toasted"

toasted_bread = toaster(bread)
print(toasted_bread) # Toasted
print(hair_color) # blue
This code is pretty basic, nothing really special. What I want to draw your attention to is the first line of the make_your_hair_blue function:
global hair_color
And that means a lot.
The global keyword (or how to make things more complicated than they already are)
What makes this line interesting is the presence of the global keyword.
Its role is quite simple. It tells Python that we will have a variable called hair_color in our function. Not only that, but it also tells Python that this variable shouldn’t be a new local one: it should refer to the variable of that name outside the function.
So when the value of hair_color is changed, it is actually the value of the outside variable that we change.
That implies something very interesting: the function changes something that is not in its own scope. As we said, using our black box actually changes something outside the box.
In computer science, we call this a “side effect”, and our function does indeed have one.
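To see what the global keyword really changes, it helps to compare the same assignment with and without it. The two function names below are hypothetical and only serve this comparison.

```python
hair_color = "brown"

def dye_without_global():
    # No global statement: this assignment creates a new local variable
    hair_color = "blue"
    return hair_color

def dye_with_global():
    global hair_color  # refer to the module-level variable instead
    hair_color = "blue"

local_value = dye_without_global()
color_after_local = hair_color   # the outside variable is untouched

dye_with_global()
color_after_global = hair_color  # the outside variable has changed

print(local_value, color_after_local, color_after_global)  # blue brown blue
```

Without global, the function has no side effect: the assignment stays inside the box. With global, the change leaks out.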
Side effects
That is a big deal. If you write a module containing a function that has side effects and someone else uses that function, they may have no idea that there is a side effect, and that can cause them a lot of trouble.
So what do we do? Should we completely avoid side effects in our code?
Well, nobody can give a strict answer.
An obvious way of getting rid of this problem would simply be to remove the global keyword completely and forbid side effects.
Whether or not a programming language allows side effects has a great influence on its paradigm.
For example, imperative programming uses side effects often, while declarative and functional programming tend to avoid them.
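The difference between the two styles can be sketched in a few lines. The function names and the list of toppings are assumptions made up for this illustration.

```python
def add_topping_in_place(toppings, topping):
    # Imperative style: mutates the caller's list (a side effect)
    toppings.append(topping)

def with_topping(toppings, topping):
    # Functional style: returns a new list and leaves the input alone
    return toppings + [topping]

mine = ["butter"]
add_topping_in_place(mine, "jam")
print(mine)   # ['butter', 'jam']

yours = ["butter"]
ours = with_topping(yours, "honey")
print(yours)  # ['butter']
print(ours)   # ['butter', 'honey']
```

Both versions are valid Python. The second one is simply easier to reason about, because the caller's data only changes when the caller reassigns it.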
But don’t worry, it is possible to use side effects in a less dangerous way. There are different ways, but the one I am going to show you requires the use of a class and dunder methods. (If you are not familiar with them, check out my article on the topic.)
Using Object-Oriented Programming to solve our side effects problem?
Object-Oriented Programming takes a different approach. The main idea is that a method can have side effects, but it should only have the ability to change the state of the current instance on which you are calling the method.
Let me elaborate on that a bit. OOP uses objects to represent what is involved in your program. You should see pretty much everything you’re working with as an object.
For example, in our toaster story, the toaster is an object, the bread and the toast are objects, and even you and your hair are objects.
Now that we have modeled our situation with objects, here is what we could get:
class Human:
    def __init__(self):
        self.hair_color = "Brown"

    def make_hair_blue(self):
        self.hair_color = "Blue"

class Bread:
    def __init__(self):
        self.toasted = False

    def toast(self):
        self.toasted = True

bread = Bread()
you = Human()
print(you.hair_color) # Brown
print(bread.toasted) # False
bread.toast()
print(you.hair_color) # Brown
print(bread.toasted) # True
In this example, you can see I chose to have two classes: the Human class (that’s you) and the Bread class (that’s your future delicious French toast). As you can see, to toast the bread, I had to use a method of the bread, the toast method.
In this way, the only thing that changes the state of the bread is the bread itself and, in the same way, the only thing that can change the color of your hair is you.
What that means is that each class owns the data it works with.
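If we also wanted the toaster itself to be an object, one possible sketch is the following. The Toaster class is hypothetical and not part of the example above. It never writes to the bread's attributes directly; it asks the bread to change its own state.

```python
class Bread:
    def __init__(self):
        self.toasted = False

    def toast(self):
        self.toasted = True

class Toaster:
    def toast(self, bread):
        # The toaster delegates: the bread stays in charge of its own data
        bread.toast()
        return bread

bread = Toaster().toast(Bread())
print(bread.toasted)  # True
```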
Why are methods better than direct assignment?
But I could have done it differently. What if I write this instead:
# ...
bread = Bread()
you = Human()
print(you.hair_color) # Brown
print(bread.toasted) # False
# replace bread.toast() with
bread.toasted = True
print(you.hair_color) # Brown
print(bread.toasted) # True
If you try to run it, that will work. This is valid code in Python, and nothing stops you from doing that.
We have achieved the same result, but without using an additional method, so why bother to write a method?
Our example was for a simple situation. Now let’s say you’re a bit lazy, and want to know as soon as your toast is ready to be eaten.
Here is what you could write:
class Bread:
    def __init__(self):
        self.toasted = False

    def toast(self):
        self.toasted = True
        print("The toast is ready!")
That way, the only thing to change is the class definition. There is nothing to change in the rest of your code:
# ...
bread = Bread()
print(bread.toasted) # False
bread.toast() # The toast is ready!
print(bread.toasted) # True
Whereas if you didn’t write a method, you would need to write this:
# ...
bread = Bread()
print(bread.toasted) # False
bread.toasted = True
print("The toast is ready!")
print(bread.toasted) # True
So the first reason why you should use a method is to avoid code repetition: without it, every place that toasts bread would need its own copy of the print call. This is simply the DRY principle (Don’t Repeat Yourself).
Encapsulation
Now, consider using a class from a module you just installed. This module is supposed to toast your bread and has many complex functions to achieve that precise goal.
If you’re lazy (we said you were), you probably don’t want to bother with the complicated details of the implementation. You just want to get your bread toasted.
In that scenario, having a toast method is very important, because you don’t want to have to mess with turn_on_toaster or make_sure_toast_is_ready functions.
This principle of hiding the complex implementation details from the end user is called encapsulation.
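Here is a minimal sketch of what such a module could look like. The bodies of turn_on_toaster and make_sure_toast_is_ready, the Toaster class, and the state dictionary are all assumptions invented for illustration. The point is only that the user calls a single toast method.

```python
def turn_on_toaster(state):
    # Implementation detail the user should not have to think about
    state["on"] = True

def make_sure_toast_is_ready(state):
    # Another implementation detail (trivially simplified here)
    return state["on"]

class Toaster:
    def __init__(self):
        self.state = {"on": False}

    def toast(self, bread):
        # Single entry point: it orchestrates the internal steps
        turn_on_toaster(self.state)
        if not make_sure_toast_is_ready(self.state):
            raise RuntimeError("the toaster did not start")
        return f"toasted {bread}"

print(Toaster().toast("bread"))  # toasted bread
```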
In many programming languages, it is up to the module’s author to decide whether the user can access the data in the program.
C++, for example, lets you choose between access specifiers such as public and private to define the “accessibility” of your class’s methods and variables.
In Python, though, there is no way to block access. To make up for this, there is a convention in the Python community: prefixing a variable name with a _ signals that you don’t want the end user to access it.
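As a sketch of that convention, the Bread class could keep its state in a _toasted attribute and expose it through a read-only property. The property is my addition here, not something the convention requires.

```python
class Bread:
    def __init__(self):
        self._toasted = False  # leading underscore: internal, please do not touch

    @property
    def toasted(self):
        # Read-only view of the internal state
        return self._toasted

    def toast(self):
        self._toasted = True

bread = Bread()
bread.toast()
print(bread.toasted)  # True

try:
    bread.toasted = False  # no setter is defined, so this raises AttributeError
except AttributeError:
    print("toasted is read-only")

bread._toasted = False  # still possible: the underscore is only a convention
print(bread.toasted)  # False
```

The last two lines show the limit of the approach: Python warns the user politely, but it does not lock the door.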
Did we solve anything?
But if you test it, you will probably realize that even though we told the user what they should not touch, there is nothing preventing someone from calling the make_hair_blue method:
class Bread:
    def __init__(self):
        self.toasted = False

    def toast(self):
        you.make_hair_blue()  # reaches out to the global Human instance
        self.toasted = True
So we didn’t actually solve anything: the toaster can still change your hair color.
Indeed, and that is an interesting lesson to learn: in Python, you cannot really prevent the users of your code from doing something. Your only option is to tell them not to do it and hope they listen.
In this case, it is easy to see that calling make_hair_blue in the toast method is really weird. It’s then up to the user not to write such nonsense.
The use of a class is often a good way to have a clear understanding of whose data you are accessing. Because, as we previously said, a class is responsible for the data it uses.
What if I have to use the global keyword?
There are times when you can hardly avoid using global variables, but that shouldn’t prevent you from writing “good” code.
Even in these situations, you should avoid the global keyword whenever you can. To be honest, you can almost always avoid using it. What you should do instead is pass the global variable as a parameter of the function.
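Passing the value in and returning the updated value could look like the sketch below. The toasts_made counter and this version of toaster are hypothetical and only illustrate the pattern.

```python
def toaster(bread, toasts_made):
    # Everything the function needs comes in through its parameters,
    # and everything it changes goes out through its return value
    return f"toasted {bread}", toasts_made + 1

toasts_made = 0
for slice_of_bread in ["white bread", "rye bread"]:
    toast, toasts_made = toaster(slice_of_bread, toasts_made)

print(toast)        # toasted rye bread
print(toasts_made)  # 2
```

The counter still lives at module level, but the function no longer reaches out to it: the caller decides explicitly when it is updated.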
But an even more OOP-friendly way of doing it would be to have a global class that can have all of your global variables as attributes.
Imagine you have a simple program that reads data from a stream and returns processed data:
bytes_processed = 0
input_stream = connectToInputStream()
output_stream = connectToOutputStream()

def read_data():
    return input_stream.read(256)

def write_data(data):
    output_stream.write(data)

def process_data(data):
    global bytes_processed
    ...
    bytes_processed += len(data)  # count the bytes in this chunk
    return new_data

while True:
    data = read_data()
    data = process_data(data)
    write_data(data)
The problem is that we use the global keyword, and that is often a sign of a poorly designed program. Instead, we could encapsulate everything in a “global” class:
class DataProcessingServer:
    def __init__(self, in_stream, out_stream):
        self.bytes_processed = 0
        self.input_stream = in_stream
        self.output_stream = out_stream

    def read_data(self):
        return self.input_stream.read(256)

    def write_data(self, data):
        self.output_stream.write(data)

    def process_data(self, data):
        ...
        self.bytes_processed += len(data)  # count the bytes in this chunk
        return new_data

    def run(self):
        while True:
            data = self.read_data()
            data = self.process_data(data)
            self.write_data(data)

server = DataProcessingServer(
    connectToInputStream(),
    connectToOutputStream()
)
server.run()
I have rewritten the program to model the situation in a more OOP-friendly way. Now, all the data used by the DataProcessingServer is owned by the instance, and anyone who accesses an attribute of the server knows that the data belongs to the server.
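If you want to try the idea without real streams, here is a condensed, runnable variant. It rests on several assumptions of mine: io.BytesIO objects stand in for the input and output streams, upper-casing stands in for the elided processing step, the counter adds the length of each chunk, and the loop stops at the end of the input instead of running forever.

```python
import io

class DataProcessingServer:
    def __init__(self, in_stream, out_stream):
        self.bytes_processed = 0
        self.input_stream = in_stream
        self.output_stream = out_stream

    def process_data(self, data):
        self.bytes_processed += len(data)
        return data.upper()  # placeholder for the real processing

    def run(self):
        while True:
            data = self.input_stream.read(256)
            if not data:  # an empty read means the stream is exhausted
                break
            self.output_stream.write(self.process_data(data))

source = io.BytesIO(b"hello toast")
sink = io.BytesIO()
server = DataProcessingServer(source, sink)
server.run()

print(server.bytes_processed)  # 11
print(sink.getvalue())         # b'HELLO TOAST'
```

Because the streams are passed to the constructor, swapping real connections for in-memory ones required no change to the class itself.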
Thanks for reading, and see you soon!