How To Use Bash To Automate The Boring Stuff For Data Science
A guide for using the command line to write some reusable code for your Data Science projects
As a Data Scientist, you tend to use some commands via the terminal over and over again. These may be commands to make a new directory for a project, start a new virtual environment and activate it, install a few standard libraries, etc.
The workflow you follow when you begin work on a project probably doesn’t differ much from one project to the next.
For instance, I always create a new folder for a new project and move into it, via:
mkdir newproject
cd newproject
and then I make a new virtual environment via:
pipenv shell
# or
python -m venv newenv
and finally, I install numpy and pandas as boilerplate for the project. For visualisation purposes, I also use matplotlib and plotly frequently.
As a matter of fact, I also like to quickly spin up a web application for my machine learning apps, so I tend to install streamlit along the way too, as it’s my preferred library for that. If you don’t know much about it, here’s a quick introduction I wrote to get you started with it.
Thus, the install command we need so far is:
pip install numpy pandas matplotlib plotly streamlit
If you forget to add a library, you have to go back and install it again via the terminal. Looking at it now, wouldn’t it be a good idea to do all of this automatically, with just one command?
You can write a script once and execute it every time you start a project, so that it runs these repetitive commands for you and saves you some valuable time.
In this article, I will demonstrate a simple command line process that you can easily get used to for automating the boring stuff efficiently, one that I tend to use quite often.
Let’s get started! 👇
Checking for bash on the system
A simple way to get to know where bash is located on your system is to use:
which bash
The output will be something like:
/bin/bash
Check for the bash version:
bash --version
The output should report the installed bash version.
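Both checks can also be combined into a few lines. This is only a minimal sketch, assuming a POSIX-style shell: it uses the command -v builtin in place of which, and the variable name bash_path is purely illustrative.

```shell
# Look up bash on the PATH; command -v prints the path, or nothing if bash is missing.
bash_path="$(command -v bash || true)"

if [ -n "$bash_path" ]; then
    echo "bash found at: $bash_path"
    # Ask that bash for its own version string (first line only).
    "$bash_path" --version | head -n 1
else
    echo "bash was not found on PATH" >&2
fi
```

If nothing is found, the shebang line used later in this article would need to point at wherever bash actually lives on your system.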

Great, now that we have that little bit of information, let’s see what we can build with it.
Making a new bash script
We will be making one file which will contain all our boilerplate commands. This is our bash script, and by convention it has the extension .sh.
First, create a new file.
touch createmlapp.sh
Next, let’s add a line to the top of our file so that the system knows to run our script with the bash shell.
#!/bin/bash
Now, let’s understand what we want to do here. Our process will be as follows:
- Create a new directory for our project
- Create and activate a virtual environment
- Install whatever packages we require
- Open up VSCode inside the project directory
Let’s go through them now.
Writing our script
All our commands will be similar to what we run normally in the terminal.
There’s just one difference: to create our project, we need a name, which we’ll pass in as an argument.
APP_NAME="$1"
The first argument entered while executing the script will be our $1.
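Since everything that follows depends on $1, it may be worth stopping early when no name is given. The following is a small sketch of such a guard and is not part of the original script; the function name require_app_name and the usage message are hypothetical.

```shell
#!/bin/bash

# Hypothetical helper: print the project name, or fail with a usage hint
# when the first argument is missing or empty.
require_app_name() {
    if [ -z "${1:-}" ]; then
        echo "Usage: ./createmlapp.sh <app-name>" >&2
        return 1
    fi
    printf '%s\n' "$1"
}

# Example call with a literal name; inside the real script this would be "$1".
APP_NAME="$(require_app_name "yourmlappname")"
echo "Project name: $APP_NAME"
```

Inside the script itself, the assignment could then read APP_NAME="$(require_app_name "$1")" || exit 1, so that a missing name ends the run before any directory is created.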
Now, the rest of the code will be familiar:
cd
cd Desktop/ # use whatever directory you wish here
mkdir "$APP_NAME"
cd "$APP_NAME"
Now comes the virtual environment:
python -m venv newenv
source newenv/bin/activate
Finally, we install whatever packages we need:
pip install --upgrade pip
pip install numpy pandas matplotlib streamlit
And lastly, as a bonus, let’s open up our favourite code editor, VSCode, to begin our project.
code .
And we’re done! 😄 Your script should now look like this:
#!/bin/bash
APP_NAME="$1"
cd
cd Desktop/
mkdir "$APP_NAME"
cd "$APP_NAME"
python -m venv newenv
source newenv/bin/activate
pip install --upgrade pip
pip install numpy pandas matplotlib streamlit
code .
Running our script
First, we run chmod +x on our script to make it executable.
chmod +x createmlapp.sh
Awesome! The only thing left is to move this script into our home directory so that we can run it without cd-ing into any other folder every time.
First, find out what your home directory is: type cd into the terminal, followed by pwd to print where you are. Wherever that takes you, move the script there so that you can run it from there.
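The move itself is a single mv or cp command. As a hedged sketch, the helper below (install_script is a hypothetical name) copies a script into a target directory and marks it executable. The demonstration uses a scratch directory from mktemp as a stand-in for the home directory, so nothing in your real home folder is touched.

```shell
# Hypothetical helper: copy a script into a directory and make it executable.
install_script() {
    src="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir"
    cp "$src" "$dest_dir/"
    chmod +x "$dest_dir/$(basename "$src")"
}

# Demonstration with a stand-in script and a scratch "home" directory.
scratch="$(mktemp -d)"
printf '#!/bin/bash\necho "script is in place"\n' > "$scratch/createmlapp.sh"
install_script "$scratch/createmlapp.sh" "$scratch/home"

# List the copy; the x flags in the listing confirm it is executable.
ls -l "$scratch/home/createmlapp.sh"
```

With the real script, the equivalent call would be install_script createmlapp.sh "$HOME".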
Now, simply type:
./createmlapp.sh yourmlappname
to see the script in action!
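One caveat worth knowing: a script started as ./createmlapp.sh runs in its own child process, so its cd and source commands affect that process only. Once the script ends, your terminal is still in its original directory and no environment is active there. Running the file with source (or its shorthand, a dot) executes it in the current shell instead. The sketch below illustrates the difference with a throwaway script that only changes directory; the file and directory names are made up for the demonstration.

```shell
# Build a scratch area containing a project folder and a tiny script that enters it.
workdir="$(mktemp -d)"
mkdir "$workdir/project"
printf 'cd "%s/project"\n' "$workdir" > "$workdir/enter.sh"

cd "$workdir"

# 1) Run as a separate process: the cd happens in the child and is lost.
bash "$workdir/enter.sh"
after_run="$(basename "$PWD")"

# 2) Source it: the cd happens in the current shell and persists.
. "$workdir/enter.sh"
after_source="$(basename "$PWD")"

echo "after running:  $after_run"
echo "after sourcing: $after_source"
```

Applied to the script from this article, source ./createmlapp.sh yourmlappname should leave the terminal inside the new project folder with newenv still active, assuming your shell is bash or another shell that passes arguments to sourced files.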
Concluding…
Congrats! After this little guide, you should now be able to automate similar workflows to save yourself some time when starting out with a new project!
All I can recommend now is to explore and experiment with creating more scripts, possibly for more complicated tasks such as running a new app, automatically pushing code to GitHub, etc.
You can find the code repository here.