How to execute shell commands properly in Python

Learn different ways to run shell commands in Python

As a scripting language, Python is a good choice for automation. You can write simple Python code to chain and automate the execution of different programs, and when the jobs are finished, you can write another Python script to extract and analyze the results. Some programs, such as Scrapy, now have APIs that Python can call directly, so you can drive them with native Python code. However, many still don’t have proper Python APIs, and you still need to run them as shell commands.

I think most Python developers have used os.system() to execute shell commands at some point. Yes, the system() function of the os module is the simplest way to execute shell commands in Python. However, it has two major disadvantages.

Firstly, you can only get the exit code of the shell command and cannot get the standard output or error from os.system(). Well, this may not be an issue if you don’t care about the output or error message of the shell command and just want to run it. In this case, pay a little more attention to the second disadvantage, which is more important.

Secondly, os.system() is susceptible to shell injection, which is also called OS command injection. This is a serious security issue if your program accepts user input that will be used in the os.system() function. We will now demonstrate this problem with a simple example. Let’s say we have some script that accepts user input for a hostname or IP address which will be used as the input for the ping command to test the network connectivity. The user input can come from the front end or simply with the input function:
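A minimal sketch of such a script might look like the following. The helper names build_ping_command and ping_with_os_system are made up for illustration, and the actual call is left in a comment because it needs the ping program and network access.

```python
import os


def build_ping_command(host):
    """Paste the user-supplied host straight into a shell command string.

    Hypothetical helper for illustration: no validation or quoting is done,
    which is exactly what makes this pattern dangerous.
    """
    return f"ping -c 3 {host}"


def ping_with_os_system(host):
    """Run ping through the shell; only the exit status comes back."""
    return os.system(build_ping_command(host))


# Example usage (needs the ping program and network access):
#     host = input("Enter a hostname or IP address: ")
#     ping_with_os_system(host)
```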

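When the script runs with a normal hostname such as google.com, the output of the ping command is printed directly to the terminal.

Now suppose a malicious user enters something like google.com; cat /etc/passwd instead of a plain hostname.

The result is interesting, but also quite dangerous: the ping output is followed by the contents of /etc/passwd.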

With the os.system() function, the string passed in is handed to the shell and parsed as Linux shell commands. In particular, the semicolon ; separates commands. Therefore, in the example above, if the user enters google.com; cat /etc/passwd, two commands are executed. The first one is ping -c 3 google.com, and the second one is cat /etc/passwd.
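The same effect can be reproduced harmlessly. The sketch below swaps ping for echo (an assumption made only so that nothing touches the network or reads sensitive files) and assumes a POSIX shell.

```python
import os

# Harmless stand-in for the ping example: the "user input" smuggles in a
# second echo command after the semicolon.
user_input = "hello; echo INJECTED"

# The shell splits the string at the semicolon and runs both commands,
# so "hello" and "INJECTED" are printed on separate lines.
exit_status = os.system(f"echo {user_input}")
print("exit status:", exit_status)
```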

It is a serious security issue if the content of /etc/passwd is exposed to external users. Worse, the user can enter any Linux shell command after the semicolon. If the user enters a destructive command like rm -rf /, it can do serious damage to your system.

Therefore, as a best practice, avoid os.system() and use the subprocess module instead, which will be introduced soon. If you cannot avoid it, either don’t accept user input at all or sanitize the input before it’s passed to os.system() so that no dangerous command can be executed.
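One possible way to sanitize the input, sketched here as an assumption rather than something this post prescribes, is the standard library's shlex.quote(), which wraps a string so that a POSIX shell treats it as a single literal argument. The helper name build_safe_ping_command is hypothetical.

```python
import shlex


def build_safe_ping_command(host):
    """Quote the host so a POSIX shell sees it as one literal argument."""
    return f"ping -c 3 {shlex.quote(host)}"


# A plain hostname is left untouched.
print(build_safe_ping_command("google.com"))
# The malicious input is wrapped in single quotes, so the semicolon no
# longer separates commands.
print(build_safe_ping_command("google.com; cat /etc/passwd"))
```

Quoting like this only applies to POSIX shells, so avoiding the shell altogether remains the safer choice.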

Use subprocess.run().

I think you have realized the danger of the os.system() function by now. As an alternative, it is recommended to use the subprocess module to execute shell commands whenever possible. It can do the same work as os.system(), and much more. More importantly, it lets us avoid shell injection, which makes it safer to use.

When you read old tutorials or posts, you will see many functions from the subprocess module in use, including call(), run(), check_output(), etc. However, with modern Python versions (3.5 and later, where run() was introduced), it’s recommended to use the run() function wherever it fits. In the words of the official documentation:

"The recommended approach to invoking subprocesses is to use the run() function for all use cases it can handle. For more advanced use cases, the underlying Popen interface can be used directly."

Most of the time, we only need the run() function, which uses the Popen class under the hood. For special cases where you want to run shell commands asynchronously, you can use the Popen class directly, which, unlike run(), does not wait for the shell command to finish.

We will introduce the Popen() class later. For now, let’s see how to use the run() function to avoid shell injection.

The run() function has a shell flag. If this flag is True, run() works similarly to os.system() and accepts a string as the input. Therefore, it has the same security issue as os.system():
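A minimal sketch of the shell=True variant follows. The helper name ping_via_shell is hypothetical, and the calls are left in comments because they need the ping program and network access.

```python
import subprocess


def ping_via_shell(host):
    """Run ping through a shell (hypothetical helper, for illustration only)."""
    # With shell=True the whole string is handed to the shell, so anything
    # after a semicolon in `host` runs as a separate command.
    return subprocess.run(f"ping -c 3 {host}", shell=True)


# Example usage (needs the ping program and network access):
#     ping_via_shell("google.com")
#     ping_via_shell("google.com; cat /etc/passwd")  # also runs cat
```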

We can see it has the same result and thus the same problem as os.system() as shown above.

On the other hand, if the shell flag is False, the command should be passed in as a list of arguments rather than a single string. Let’s see if shell injection still happens in this case:
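Here is a sketch of the list form, again with hypothetical helper names (build_ping_args, ping_without_shell) and with the network-dependent calls left in comments.

```python
import subprocess


def build_ping_args(host):
    """Build the argument list; every item, including the count, is a string."""
    return ["ping", "-c", "3", host]


def ping_without_shell(host):
    """Run ping directly, without a shell in between."""
    # shell=False is the default, so the host string is never parsed by a
    # shell; it reaches ping as one single argument.
    return subprocess.run(build_ping_args(host))


# Example usage (needs the ping program and network access):
#     ping_without_shell("google.com")
#     ping_without_shell("google.com; cat /etc/passwd")  # ping fails: not a valid host
```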

Note that the number 3 should also be passed in as a string, otherwise an exception will be raised. The error message would be: TypeError: expected str, bytes or os.PathLike object, not int.

Interestingly, the command run by subprocess.run() failed this time. If you check carefully, you can see that google.com; cat /etc/passwd is passed as a whole and not split into two shell commands. In other words, the entire string google.com; cat /etc/passwd is passed as a single argument to the ping command. Since it is not a valid hostname or IP address, the ping command fails.
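To check that the string really arrives as one argument, the sketch below replaces ping with a tiny Python child process that reports what it received. The child program is an assumption made for illustration so that the example runs without network access.

```python
import subprocess
import sys

malicious = "google.com; cat /etc/passwd"

# The child prints how many arguments it received, then the first one.
child_code = "import sys; print(len(sys.argv) - 1); print(sys.argv[1])"

result = subprocess.run(
    [sys.executable, "-c", child_code, malicious],
    stdout=subprocess.PIPE,
    text=True,  # needs Python 3.7+
)

# One argument arrived, and it is the untouched input string.
received = result.stdout.splitlines()
print(received)
```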

Therefore, with the shell flag set to False, we can accept user input safely and avoid shell injection for the subprocess.run() function. As False is the default value for the shell flag, we don’t need to explicitly set it to be False. However, you should avoid setting it to be True unless you have a very good reason to do so.

Get the standard output and error.

Besides overcoming the shell injection issue, the subprocess.run() function has other useful properties. Let’s explore them a bit.

Unlike os.system(), which prints the standard output (STDOUT) and standard error (STDERR) on the screen, subprocess.run() can capture the output and error and make them available in the code:
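A small sketch of capturing both streams is shown below. As an assumption for illustration, a short Python child process stands in for the shell command so the example does not depend on ping; with ping, the argument list would be ["ping", "-c", "3", host] instead.

```python
import subprocess
import sys

# Stand-in child process that writes one line to each stream.
child_code = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,  # capture standard output
    stderr=subprocess.PIPE,  # capture standard error
    text=True,               # decode bytes to str (Python 3.7+)
)

print(type(result))
print("exit code:", result.returncode)
print("stdout:", result.stdout)
print("stderr:", result.stderr)
```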

Importantly, with the stdout=subprocess.PIPE and stderr=subprocess.PIPE options, the standard output and error are saved to the stdout and stderr attributes of the result, which is of type subprocess.CompletedProcess. Besides, text=True (available since Python 3.7) specifies that stdout and stderr should be decoded to strings rather than returned as bytes.

Use subprocess.Popen().

As we have noticed, the subprocess.run() function blocks until the shell command is completed. This may not be preferable if the shell command takes a long time to finish. You may want to run the shell command in the background and do something else in the meantime. For this purpose, you can use the subprocess.Popen() class directly, which is used by the subprocess.run() function under the hood.
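The following is a minimal sketch of that pattern. A Python child that sleeps for one second stands in for a long-running shell command (an assumption for illustration), and the polling loop marks the place where other work could be done.

```python
import subprocess
import sys
import time

# Stand-in for a slow shell command: sleep briefly, then print a line.
child_code = "import time; time.sleep(1); print('done')"

process = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

# Popen returns immediately; poll() gives None while the child is running.
first_poll = process.poll()

while process.poll() is None:
    time.sleep(0.1)  # do something useful here instead of sleeping

# The child has finished; collect its output and exit code.
stdout, stderr = process.communicate()
exit_code = process.returncode
print(exit_code, stdout)
```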

Key points for the code above:

subprocess.Popen() accepts most of the same options as subprocess.run(), such as stdout, stderr, shell, and text. A few options, such as timeout and check, belong to run() only.
subprocess.Popen() is non-blocking and returns immediately. The shell command is run in the background.
The result returned by subprocess.Popen() is of type subprocess.Popen. We can use the poll() method to check if the shell command has been completed. If the shell command has not been completed, then None is returned. Otherwise, the exit code of the shell command is returned. In particular, exit code 0 means the shell command completed successfully.
The standard output and error of the shell command are returned by the communicate() method, rather than read from the stdout and stderr attributes of a result object as with subprocess.run().

There are more settings for subprocess.Popen(). For example, you can wait for a child process running the shell command to finish with the wait() method, or you can pass a timeout to the communicate() method. It should be fairly straightforward for you to deal with these cases once you know the basics of subprocess.Popen().
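As a rough sketch of those two methods, the example below gives communicate() a short timeout, kills the child when the timeout expires, and then uses wait() to obtain the exit code. The sleeping Python child and the 0.5 second timeout are assumptions chosen for illustration.

```python
import subprocess
import sys

# Stand-in for a command that takes too long.
child_code = "import time; time.sleep(5)"
process = subprocess.Popen([sys.executable, "-c", child_code])

try:
    # Wait at most half a second for the child to finish.
    process.communicate(timeout=0.5)
    timed_out = False
except subprocess.TimeoutExpired:
    # The child is still running: stop it and clean up after it.
    process.kill()
    process.communicate()
    timed_out = True

# wait() blocks until the child has exited and returns its exit code.
exit_code = process.wait()
print("timed out:", timed_out, "exit code:", exit_code)
```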

In this post, we have introduced different ways to execute shell commands in Python. As a general rule, we should avoid os.system() and use subprocess.run() instead (with shell left at its default value of False). If you need to run shell commands asynchronously in the background, you can use subprocess.Popen(). With both subprocess.run() and subprocess.Popen(), the standard output and error can be obtained and used in your Python code directly.
