Optimisation
Is Julia Really Faster than Python and Numpy?
The speed of C with the simplicity of Python

Python, along with the numpy/pandas libraries, has essentially become the language of choice for the data science profession (…I’ll add a quick nod to R here).
However, it is well known that Python, although quick and easy to write, is slow to execute. Hence the need for excellent libraries like numpy to increase efficiency…but what if there was a better alternative?
Julia claims to be at least as easy and intuitive to use as Python, whilst being significantly faster to execute. Let’s put that claim to the test…
What is Julia?
Just in case you have no idea what Julia is, here is a quick primer.
Julia is an open source language that is dynamically typed, intuitive, and easy to use like Python, but with the speed of execution of a language like C.
It has been around approximately 10 years (born in 2012), so it is a relatively new language. However, it is at a stage of maturity where you wouldn’t call it a fad.
The original creators of the language are active in a relevant field of work:
For the work we do — scientific computing, machine learning, data mining, large-scale linear algebra, distributed and parallel computing — …
- julialang.org — Jeff Bezanson, Stefan Karpinski, Viral B. Shah, Alan Edelman
All in all, it is a modern language specifically designed to be used in the field of data science. The aims of the creators themselves tell you a great deal:
We want the speed of C with the dynamism of Ruby. We want a language that’s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy. We want it interactive and we want it compiled.
(Did we mention it should be as fast as C?)
- julialang.org — Jeff Bezanson, Stefan Karpinski, Viral B. Shah, Alan Edelman
Sounds quite exciting right?
Incidentally, if you want an idea of how Python and Julia compare side by side in terms of syntax and general usage, you may want to check out my other article, which takes an in-depth look at running a deep learning image classification problem in both Julia (Flux) and Python (TensorFlow):
The basis of the speed test
I have written an article previously looking at vectorization using the numpy library in Python:
The speed test conducted in this article is essentially an extension of, and comparison with, that article.
How will the test work?
A comparison will be made between the speed of execution of a simple mathematical statement:
Function 1 — Simple summation
#Python
def sum_nums(a, b):
    return a + b

#Julia
function sum_nums(x, y)
    x + y
end
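To give a feel for what is actually being measured, here is a rough sketch of how such a scalar function can be timed in Python. This is purely illustrative and not the notebook's actual benchmark code; the function matches the one defined above, and the per-call interpreter overhead it exposes is exactly what the vectorised approaches below try to avoid.

```python
import timeit

def sum_nums(a, b):
    return a + b

# Time one million scalar calls. Each call pays Python's
# function-call and interpreter overhead, which dominates
# the trivial addition itself.
elapsed = timeit.timeit(lambda: sum_nums(1.5, 2.5), number=1_000_000)
print(f"1M calls: {elapsed:.3f}s")
```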
and a more complicated conditional statement:
Function 2 — More complex (logic and arithmetic)
#Python
def categorise(a, b):
    if a < 0:
        return a * 2 + b
    elif b < 0:
        return a + 2 * b
    else:
        return None

#Julia
function categorise(a, b)::Float32
    if a < 0
        return a * 2 + b
    elseif b < 0
        return a + 2 * b
    else
        return 0
    end
end
When run through the following methods:
- Python: pandas.itertuples()
- Python: list comprehension
- Python: numpy.vectorize()
- Python: native pandas method
- Python: native numpy method
- Julia: native method
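For context, the five Python approaches can be sketched roughly as follows. This is a simplified illustration using the sum_nums function on a tiny DataFrame; the notebooks time these same patterns on one million rows.

```python
import numpy as np
import pandas as pd

def sum_nums(a, b):
    return a + b

df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# 1. pandas.itertuples() - a plain Python loop over rows
res_itertuples = [sum_nums(row.a, row.b) for row in df.itertuples()]

# 2. List comprehension over the two columns
res_listcomp = [sum_nums(a, b) for a, b in zip(df["a"], df["b"])]

# 3. numpy.vectorize() - a convenience wrapper (still a Python-level loop)
res_npvec = np.vectorize(sum_nums)(df["a"], df["b"])

# 4. Native pandas - operate on whole Series at once
res_pandas = df["a"] + df["b"]

# 5. Native numpy - operate on the underlying arrays
res_numpy = df["a"].to_numpy() + df["b"].to_numpy()
```

In Julia, the equivalent "native" call simply broadcasts the function over the arrays with the dot syntax, e.g. sum_nums.(a, b).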
The notebooks for this article

The previous article included a Jupyter notebook written in Python. I have taken this notebook (unchanged) from the previous article and re-run it in a Deepnote instance running Python 3.10.
The Deepnote instances for the Python runs and the Julia runs use exactly the same basic CPU instance (i.e. the same hardware). This ensures that the timed results in this article are directly comparable.
Note: I have made sure to include the CPU information in each notebook so you can see exactly what hardware was used, and that it was in fact identical.
Running the Julia notebook
It is worth noting that whether you wish to use the notebooks in Deepnote as I have, or in Colab, you will need to set up Julia in the respective environment. This is mainly because most public online instances are currently set up for Python only (at least out of the box).
Environment Setup
Deepnote
As Deepnote utilises Docker instances, you can very easily set up a "local" Dockerfile containing the install instructions for Julia. This means you don't have to pollute the Jupyter notebook with install code, as you would have to in Colab.
In the environment section select "Local ./Dockerfile". This will open the actual Dockerfile, where you should add the following:
FROM deepnote/python:3.10
RUN wget https://julialang-s3.julialang.org/bin/linux/x64/1.8/julia-1.8.2-linux-x86_64.tar.gz && \
tar -xvzf julia-1.8.2-linux-x86_64.tar.gz && \
mv julia-1.8.2 /usr/lib/ && \
ln -s /usr/lib/julia-1.8.2/bin/julia /usr/bin/julia && \
rm julia-1.8.2-linux-x86_64.tar.gz && \
julia -e "using Pkg;pkg\"add IJulia\""
ENV DEFAULT_KERNEL_NAME "julia-1.8"
You can update the above to the latest Julia version from this page; at the time of writing, 1.8.2 is the latest release.
Colab
For Colab, all the download and install code has to be included in the notebook itself, and the page must be refreshed once the install code has run.
Fortunately, Aurélien Géron (…that name will be familiar to a few here I reckon) has made available on his GitHub a starter notebook for Julia in Colab, which is probably the best way to get started.
The notebooks
The raw notebooks can be found here:
…or get kickstarted in either Deepnote or Colab.
Python Notebook:


Julia Notebook:


The Results

If you haven’t read my previous article on numpy vectorization, I would encourage you to do so (obviously!), as it will help you get an idea of how the Python methods stack up before we jump into the Julia results.
All will be summarised and compared at the end of the article, so don’t worry too much if you don’t have the time.
The input data
Define a random number generator, and two columns of one million random numbers taken from a normal distribution, just like in the numpy vectorization article:
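In Python, that setup looks something like this (a sketch assuming numpy and pandas; the seed value and column names are arbitrary choices for illustration):

```python
import numpy as np
import pandas as pd

# Reproducible random number generator (seed chosen arbitrarily)
rng = np.random.default_rng(42)

N = 1_000_000  # one million rows

# Two columns of random numbers drawn from a standard normal distribution
df = pd.DataFrame({
    "a": rng.normal(size=N),
    "b": rng.normal(size=N),
})

print(df.shape)  # (1000000, 2)
```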